Nanobeagle

23.11.2005 21:43

I have been planning to add a search function to tablix.org for some time. The mailing list archive especially has grown so large that finding posts about a specific subject is very time consuming.

All the free search engines I found on the net resembled Google and other general internet search engines, which is not what I want. These search engines use a web crawler to index the pages of a site, which isn't very efficient if you want to index a trusted and well-known web site. Simple web crawlers are also bad because 1) they index all text on web pages (including menus, etc.), not only the content, and 2) they have no idea what exactly they are indexing. If you get a search hit, you have to check manually whether it is a news article, a mailing list post or a page in the documentation.

On the other hand, desktop search tools like Beagle are much more advanced - they use plugins to recognize and properly index each type of file on the filesystem. So I tried to make a web search engine that would be as flexible as that. It must use plugins to properly index mailing list archives, wiki pages, Nanoblogger posts, articles, Doxygen reference, etc. I used this Beagle mockup from the Beagle UI Hackfest as a guide.

The result: Nanobeagle.

It's written in Perl, uses the Swish-e indexing engine and is not yet very stable. It currently has three indexing plugins: Hypermail archives, Nanoblogger news and Nanoblogger static articles. The icons are from the gnome-icon-theme module of the GNOME CVS repository.
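
To give a rough idea of what "indexing plugins" means here, below is a minimal sketch in Python. It is not Nanobeagle's code (that is Perl on top of Swish-e); every name in it (Document, plugin, mail_plugin, news_plugin, index_file) and the "archive/" and "news/" path prefixes are made up for illustration. The point is only that each plugin recognizes one kind of content, extracts just the content, and labels it with a type, so a search hit can say what it is.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Document:
    kind: str   # e.g. "mailing list post" or "news article"
    title: str
    body: str   # content only, without menus or other page furniture


# A plugin takes a path and the raw text and returns a Document,
# or None if it does not recognize the input.
Plugin = Callable[[str, str], Optional[Document]]

PLUGINS: List[Plugin] = []


def plugin(func: Plugin) -> Plugin:
    """Register an indexing plugin (hypothetical registry)."""
    PLUGINS.append(func)
    return func


@plugin
def mail_plugin(path: str, text: str) -> Optional[Document]:
    # Assumption: archived posts live under "archive/" and begin with
    # mail-style headers followed by a blank line.
    if not path.startswith("archive/"):
        return None
    headers, _, body = text.partition("\n\n")
    subject = ""
    for line in headers.splitlines():
        if line.lower().startswith("subject:"):
            subject = line.split(":", 1)[1].strip()
    return Document("mailing list post", subject, body.strip())


@plugin
def news_plugin(path: str, text: str) -> Optional[Document]:
    # Assumption: news entries live under "news/" and the first line
    # is the title.
    if not path.startswith("news/"):
        return None
    title, _, body = text.partition("\n")
    return Document("news article", title.strip(), body.strip())


def index_file(path: str, text: str) -> Optional[Document]:
    """Return the Document from the first plugin that recognizes the file."""
    for candidate in PLUGINS:
        doc = candidate(path, text)
        if doc is not None:
            return doc
    return None
```

Calling index_file("archive/0001.txt", ...) with a raw post would return a Document whose kind is "mailing list post", while a file no plugin recognizes returns None and is simply skipped. Supporting a new kind of content then means writing one more function and registering it.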

Posted by Tomaž | Categories: Code
