Tag Archives: Apache Lucene

Testing Lucene’s index durability after crash or power loss

One of Lucene’s useful transactional features is index durability, which ensures that once you successfully call IndexWriter.commit, even if the OS or JVM crashes, power is lost, or you kill -KILL your JVM process, the index after rebooting will be intact (not corrupt) and will reflect the last successful commit before the crash. Of course, this only works if ...

Using Lucene’s search server to search Jira issues

You may remember my first blog post describing how the Lucene developers eat our own dog food by using a Lucene search application to find our Jira issues. That application has become a powerful showcase of a number of modern Lucene features such as drill sideways and dynamic range faceting, a new suggester based on infix matches, postings highlighter, block-join ...

Finding long tail suggestions using Lucene’s new FreeTextSuggester

Lucene’s suggest module offers a number of fun auto-suggest implementations to give a user live search suggestions as they type each character into a search box. For example, WFSTCompletionLookup compiles all suggestions and their weights into a compact Finite State Transducer, enabling fast prefix lookup for basic suggestions. AnalyzingSuggester improves on this by using an Analyzer to normalize both the ...

Three exciting Lucene features in one day

Yesterday was a productive day: suddenly, there are three exciting new features coming to Lucene. Expressions module: The first feature, committed yesterday, is the new expressions module. This allows you to define a dynamic field for sorting, using an arbitrary String expression. There is built-in support for parsing JavaScript, but the parser is pluggable if you want to create your ...

Screaming fast Lucene searches using C++ via JNI

At the end of the day, when Lucene executes a query, after the initial setup the true hot-spot is usually rather basic code that decodes sequential blocks of integer docIDs, term frequencies and positions, matches them (e.g. taking union or intersection for BooleanQuery), computes a score for each hit and finally saves the hit if it’s competitive, during collection. Even ...

Searching made easy with Apache Lucene 4.3

Lucene is a full-text search engine library written in Java that can lend powerful search capabilities to any application. At the heart of Lucene lies a file-based full-text index. Lucene provides APIs to create this index and then add and delete its contents. Further, it allows search and retrieval of information from this index using powerful search ...

Transactional Lucene

Many users don’t appreciate the transactional semantics of Lucene’s APIs and how this can be useful in search applications. For starters, Lucene implements ACID properties: Atomicity: when you make changes (adding, removing documents) in an IndexWriter session, and then commit, either all (if the commit succeeds) or none (if the commit fails) of your changes will be visible, never something ...

Lucene – Quickly add Index and Search Capability

What is Lucene? Apache Lucene™ is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. Lucene can index plain text, integers, PDFs, Office documents, etc. How does Lucene enable faster search? Lucene creates something called an inverted index. Normally we map document -> terms in ...
