Monday, January 26, 2009

Lucid Imagination and Sematext

Kudos to the Solr/Lucene gang for launching Lucid Imagination. Grant Ingersoll's announcement. People involved here and here. Some time ago Otis Gospodnetić launched Sematext. Good luck to Lucid and Sematext!

Both of these companies follow the 'support and consulting' model. This is wise: going into enterprise search directly is a tough road, and competing head-on with Endeca, Verity (Autonomy), FAST (Microsoft), GoogleBox and the other vendors would be suicidal.

Long ago (2003) I thought of hanging up a shingle for supporting HtDig (a once-popular CGI-based search engine), but wisely decided that would be a mistake, given that even then I could see that Doug Cutting's Java Lucene and Nutch were going to smoke the creaky 8+ year old C++ indexing kernel. I ended up getting RightNow Tech to sponsor conversion of the guts to CLucene, where it still runs today, indexing many, many tens of millions of documents. Then Solr was announced ... and HtDig development died and I started using Solr.

Just touched base with Geoff Hutchison the other day and we're going to release the 4.0 CLucene branch of HtDig, put up an announcement of HtDig end-of-life, and encourage people to migrate to Solr.


Anthony said...

I would think the logical replacement for ht://Dig would be Nutch. However, if you're interested in using Solr for searching, check out this. The Foo Factory guys referenced on the Nutch site posted their own tutorial, too.

Neal said...

True enough, it needs a spider. To an extent, all one really needs for HtDig migration is a script to convert the various configurations over to Nutch.

Last time I looked at Nutch (i.e. not writing my own spidering code) it seemed really configuration-heavy compared with HtDig.

Full Disclosure: Anthony here is the other part of the 'we' in 'we converted the guts of HtDig to CLucene'.