Tag Archives: Apache Lucene

The structure of Apache Lucene

The inestimably noble Apache Software Foundation produces many of the blockbuster products (Ant, CouchDB, Hadoop, JMeter, Maven, OpenOffice, Subversion, etc.) that help build our digital universe. One perhaps less well-known gem is Lucene, which “… provides Java-based indexing and search technology, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities.” Despite its shying from headlines, Lucene forms a ...

Apache Lucene 5.0.0 is coming!

At long last, after a strong series of 4.x feature releases, most recently 4.10.2, we are finally working towards another major Apache Lucene release! There are no promises for the exact timing (it’s done when it’s done!), but we already have a volunteer release manager (thank you Anshum!). A major release in Lucene means all deprecated APIs (as of 4.10.x) ...

A new proximity query for Lucene, using automatons

The simplest Apache Lucene query, TermQuery, matches any document that contains the specified term, regardless of where the term occurs inside each document. Using BooleanQuery you can combine multiple TermQuerys, with full control over which terms are optional (SHOULD) and which are required (MUST) or required not to be present (MUST_NOT), but still the matching ignores the relative positions of ...

Choosing a fast unique identifier (UUID) for Lucene

Most search applications using Apache Lucene assign a unique id, or primary key, to each indexed document. While Lucene itself does not require this (it couldn’t care less!), the application usually needs it to later replace, delete or retrieve that one document by its external id. Most servers built on top of Lucene, such as Elasticsearch and Solr, require a ...

Testing Lucene’s index durability after crash or power loss

One of Lucene’s useful transactional features is index durability, which ensures that once you successfully call IndexWriter.commit, even if the OS or JVM crashes, power is lost, or you kill -KILL your JVM process, the index will be intact (not corrupt) after rebooting and will reflect the last successful commit before the crash. Of course, this only works if ...

Using Lucene’s search server to search Jira issues

You may remember my first blog post describing how the Lucene developers eat our own dog food by using a Lucene search application to find our Jira issues. That application has become a powerful showcase of a number of modern Lucene features such as drill sideways and dynamic range faceting, a new suggester based on infix matches, postings highlighter, block-join ...

Finding long tail suggestions using Lucene’s new FreeTextSuggester

Lucene’s suggest module offers a number of fun auto-suggest implementations to give a user live search suggestions as they type each character into a search box. For example, WFSTCompletionLookup compiles all suggestions and their weights into a compact Finite State Transducer, enabling fast prefix lookup for basic suggestions. AnalyzingSuggester improves on this by using an Analyzer to normalize both the ...

Three exciting Lucene features in one day

Yesterday was a productive day: suddenly, there are three exciting new features coming to Lucene. Expressions module: The first feature, committed yesterday, is the new expressions module. This allows you to define a dynamic field for sorting, using an arbitrary String expression. There is built-in support for parsing JavaScript, but the parser is pluggable if you want to create your ...
