Apache Lucene Fundamentals Tutorial

Course Overview

Apache Lucene is a free, open-source information retrieval library that provides Java-based indexing and search technology, as well as spellchecking, hit highlighting, and advanced analysis/tokenization capabilities.

Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform applications.

In this course, you will get an introduction to Lucene. You will see why a library like this is important and then learn how searching works in Lucene.

Moreover, you will learn how to integrate Lucene Search into your own applications in order to provide robust searching capabilities.

About the Author

Piyas is a Sun Microsystems Certified Enterprise Architect with more than 10 years of professional IT experience in areas such as architecture definition, enterprise applications, and client-server/e-business solutions. He has hands-on experience with a wide range of databases, from PostgreSQL, SQL Server 7.0/2000, and Oracle 8i and 10g to Sybase, MySQL, and NoSQL databases such as MongoDB.

He learns and writes about different aspects of open source technologies such as Angular.js, Node.js, MongoDB, Google DART, Apache Lucene, Text Analysis with GATE, and related Big Data technologies on his blog (www.phloxblog.in).

Lessons

Introduction to Lucene

In the first lesson, you will be introduced to the library. You will learn about full-text search and the engines that perform it. The Lucene workflow is also explained, along with its basic components for indexing and searching. Moreover, you will build a fully functional sample application from scratch: a Lucene-based application, developed with Eclipse and Maven, that indexes folders and provides search functionality for them.

Lucene Components Overview

In this lesson, you will learn about the Lucene Query (Search) Syntax. You will learn how to leverage the Query class and its subclasses (TermQuery, PhraseQuery, BooleanQuery, etc.) to build powerful queries and to convert human-written search phrases into representative structures.

Lucene Query (Search) Syntax Examples

In this lesson, you will delve into more advanced Query (Search) Syntax Examples. You will learn the specifics of the Lucene Query API, along with the various classes that comprise it. Multiple examples are presented, showcasing the use of each of the subclasses.

Advanced Lucene Query Examples

In this lesson, you will delve into more advanced Query (Search) Syntax Examples. You will learn the specifics of the Lucene Query API, along with the various classes that comprise it. Multiple examples are presented, showcasing the use of each of the subclasses.

Building a Search Index with Lucene

We are now going to build a Search Index with Lucene. The Index is the heart of any component that utilizes Lucene. Much like the index of a book, it organizes all the data so that it is quickly accessible. You will learn how the indexing operation works, how to create an index and perform basic operations on it, and how to work with Documents and fields.
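To make the book-index analogy concrete, here is a minimal plain-Java sketch of an inverted index, the general kind of data structure that a search index is built around. It does not use Lucene's API; the class `ToyInvertedIndex` and its methods are hypothetical names chosen for illustration, and a real Lucene index stores far more (fields, term positions, scoring data).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index (illustration only, not Lucene): term -> ids of documents containing it. */
public class ToyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    /** Stores the text, records each lowercased word under the new document id, and returns that id. */
    public int addDocument(String text) {
        int docId = documents.size();
        documents.add(text);
        // Split on runs of non-word characters; skip the empty token a leading separator produces.
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    /** Returns the sorted ids of documents containing the term (empty set if none). */
    public Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("Lucene is a search library");
        index.addDocument("The index is the heart of Lucene");
        System.out.println(index.search("lucene")); // [0, 1]
        System.out.println(index.search("heart"));  // [1]
    }
}
```

A lookup touches only the entry for the queried term rather than scanning every document, which is why an index makes data quickly accessible.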

Integrating Lucene Search into an Application

In this lesson, we will discuss how to integrate Lucene Search into an Application. We will see how to parse query strings, create indexes and utilize different types of queries, depending on the type of search we want to perform.

Lucene Analysis Process Guide

In this final lesson, we will discuss the analysis process. Analysis, in Lucene, is the process of converting field text into its most fundamental indexed representation: terms. In general, the tokens that analyzers produce correspond to words (we discuss this topic with reference to the English language only). However, with special analyzers a token can consist of more than one word and therefore include spaces. These terms are used to determine which documents match a query during searching. We will see how to choose the right analyzer from the several available (e.g. Whitespace Analyzer, Standard Analyzer, Snowball Analyzer, etc.) and how the analysis process actually works.
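The effect of choosing one analyzer over another can be sketched in plain Java. The example below does not use Lucene's analyzer classes; `ToyAnalysis`, `whitespaceTokens`, and `normalizedTokens` are hypothetical names, and the two methods only loosely mimic the difference between splitting on whitespace alone and a fuller normalization step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy analysis step (illustration only, not Lucene): turns field text into a list of terms. */
public class ToyAnalysis {

    /** Splits on whitespace only; case and punctuation are preserved. */
    public static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Lowercases the text and splits on any run of characters that are not letters or digits. */
    public static List<String> normalizedTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Full-Text Search, with Lucene!";
        System.out.println(whitespaceTokens(text)); // [Full-Text, Search,, with, Lucene!]
        System.out.println(normalizedTokens(text)); // [full, text, search, with, lucene]
    }
}
```

With the whitespace-only version, a query for `lucene` would not match the indexed term `Lucene!`, which illustrates why the same analysis is normally applied to both the indexed text and the query.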

Piyas De

Piyas is a Sun Microsystems Certified Enterprise Architect with 10+ years of professional IT experience in areas such as architecture definition, enterprise applications, and client-server/e-business solutions. Currently he is engaged in providing solutions for digital asset management in media companies. He is also the founder and main author of "Technical Blogs (Blog about small technical Know hows)" (http://www.phloxblog.in).