Mitch Pronschinske is the Lead Research Analyst at DZone. Researching and compiling content for DZone's research guides is his primary job. He likes to make his own ringtones, watches cartoons/anime, enjoys card and board games, and plays the accordion. Mitch is a DZone Zone Leader and has posted 2577 posts at DZone.

LinkedIn Open Sources IndexTank - A Search System Kinda Like Solr

Big news for the world of open source search.  IndexTank, a real-time search and indexing system recently acquired by LinkedIn, has just been officially open sourced under the Apache 2.0 license. 

LinkedIn promised to open source the technology when it made the acquisition. The release is made up of three components:

  • IndexEngine: a real-time fulltext search-and-indexing system designed to separate relevance signals from document text. This is because the life cycle of these signals is different from the text itself, especially in the context of user-generated social inputs (shares, likes, +1, RTs).
  • API: a RESTful interface that handles authentication, validation, and communication with the IndexEngine(s). It allows users of IndexTank to access the service from different technology platforms (Java, Python, .NET, Ruby and PHP clients are already developed) via HTTP.
  • Nebulizer: a multitenant framework to host and manage an unlimited number of indexes running over a layer of Infrastructure-as-a-Service. This component of IndexTank will instantiate new virtual instances as needed, move indexes as they need more resources, and try to be reasonably efficient about it.
    --Diego Basch, LinkedIn Director of Engineering
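
To make the IndexEngine idea concrete, here is a minimal sketch of what keeping relevance signals apart from document text can look like. It is a toy in-memory model, not IndexTank's implementation: the class `ToyIndex`, its methods, and the `likes` variable are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class ToyIndex:
    """Toy in-memory index (hypothetical; not IndexTank's actual design)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> doc ids, built from text
        self.variables = {}               # doc id -> {name: value} signals

    def add_document(self, doc_id, text, variables=None):
        # Tokenizing and indexing the text is the comparatively costly step.
        for term in text.lower().split():
            self.postings[term].add(doc_id)
        self.variables[doc_id] = dict(variables or {})

    def update_variables(self, doc_id, **changes):
        # Signals change often; updating them never touches the postings.
        self.variables[doc_id].update(changes)

    def search(self, term, score):
        # Match on text, then rank using only the signal store.
        hits = self.postings.get(term.lower(), set())
        return sorted(hits, key=lambda d: score(self.variables[d]), reverse=True)


index = ToyIndex()
index.add_document("a", "open source search", {"likes": 1})
index.add_document("b", "real-time search engine", {"likes": 5})

by_likes = lambda v: v["likes"]
before = index.search("search", by_likes)
index.update_variables("a", likes=9)  # a burst of likes, no re-indexing
after = index.search("search", by_likes)
```

Because the two stores have separate life cycles, the ranking of document `a` changes after `update_variables` even though its text was indexed only once.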

And how, you ask, is IndexTank different from Lucene and Solr?  An old FAQ describes the differences, though I'm pretty sure more recent versions of Solr have these features now too (Solr definitely has geospatial search):

There are many functional differences. For example, documents added to an IndexTank index are immediately searchable; document variables can be updated without having to re-index the whole document and they can be updated very quickly and at a very rapid pace without affecting the index performance; results can be sorted by arbitrary functions that include geolocation support.  --IndexTank FAQ
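
As a rough illustration of sorting results "by arbitrary functions that include geolocation support", the sketch below ranks a few documents by a made-up formula that discounts popularity by distance from the user. The formula, the field names (`lat`, `lon`, `likes`), and the helper `make_score` are assumptions for this example only; the FAQ does not say how IndexTank expresses its scoring functions.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def make_score(user_lat, user_lon):
    # Hypothetical formula: popularity discounted by distance from the user.
    def score(doc):
        km = haversine_km(user_lat, user_lon, doc["lat"], doc["lon"])
        return doc["likes"] / (1.0 + km)
    return score


docs = [
    {"id": "sf", "lat": 37.77, "lon": -122.42, "likes": 10},
    {"id": "nyc", "lat": 40.71, "lon": -74.01, "likes": 500},
    {"id": "oak", "lat": 37.80, "lon": -122.27, "likes": 40},
]

score = make_score(37.77, -122.42)  # a user located in San Francisco
ranked = [d["id"] for d in sorted(docs, key=score, reverse=True)]
```

With this formula a nearby document with few likes can outrank a far more popular one on the other side of the country, which is the kind of trade-off a custom scoring function lets you tune.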

So go try it out if you're interested.  IndexTank was previously provided as a paid SaaS, but those services (which were actually used by Reddit) were shut down late in October because of the impending open sourcing.


Otis Gospodnetic replied on Thu, 2011/12/22 - 9:35pm

Small note: IndexTank was not used by LinkedIn for (m)any years. LinkedIn uses internally built Zoie, Bobo, and now Sensei. Zoie uses Lucene and Sensei uses Zoie and Bobo, among others.
