What is Lucene? Apache Lucene™ is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform ones. Lucene can index plain text, integers, PDFs, Office documents, etc. How does Lucene enable faster search? Lucene creates something called an inverted index. Normally we map document -> terms in ...
Lucene Overview Part One: Creating the Index
Introduction I’ve recently been working with the open source search engine Lucene. I’m no expert, but since I have just pored through some rather sparse documentation and migrated an application from a very old version of Lucene to the latest version, 2.4, I’m pretty clear on the big picture. The documentation for Lucene leaves a bit to the imagination, so ...
“Did you mean” feature with Apache Lucene Spell-Checker
Google’s “Did you mean” feature After making an introduction to Lucene in a previous post, now it is time to take it up a notch and create a more sophisticated application. You are most surely familiar with Google’s “Did you mean” feature (other search engines support this too). Here is an example of that: Lucene SpellChecker Subproject This feature can ...
An Introduction to Apache Lucene for Full-Text Search
In this tutorial I would like to talk a bit about Apache Lucene. Lucene is an open-source project that provides Java-based indexing and search technology. Using its API, it is easy to implement full-text search. I will deal with the Lucene Java version, but bear in mind that there is also a .NET port available under the name Lucene.NET, as ...