Getting started with Hibernate Search

Hibernate Search is an extension of Hibernate ORM, a popular Object-Relational Mapping (ORM) framework for Java applications. It provides full-text search capabilities by integrating with search backends such as Apache Lucene or Elasticsearch. In this tutorial, we will learn how to create a sample application using Hibernate Search.

Hibernate Search in a nutshell

Hibernate Core is the core module of Hibernate that focuses on mapping Java objects to relational databases and managing their persistence. It provides features like object-relational mapping, transaction management, and caching.

If you are new to Hibernate and JPA, we recommend checking this introductory article: HelloWorld JPA application

Hibernate Search, on the other hand, adds a layer on top of Hibernate Core to enable full-text search. It integrates with a search engine to index and search your data efficiently.

Some popular search engines include:

Apache Lucene: Lucene is a text search library that enables fast and accurate full-text indexing and searching. It operates at a lower level than Hibernate Search and provides the underlying mechanisms to create and manage indexes, tokenize text, and perform complex search queries.

Elasticsearch: Elasticsearch is a distributed search and analytics engine built on top of Lucene. It provides a RESTful API for performing full-text search and real-time data analytics. Elasticsearch offers scalability, fault tolerance, and distributed search capabilities, making it suitable for large-scale applications.

In the following sections, we will show how to create a simple Hibernate Search application which uses Lucene as the text search library.

Hibernate Search project set up

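Firstly, to build a Hibernate Search application, you need to add two libraries to your project: the hibernate-search-mapper-orm library and a backend library (here, the Lucene backend). The hibernate.search.version property must be defined in your pom.xml: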

<dependency>
	<groupId>org.hibernate.search</groupId>
	<artifactId>hibernate-search-mapper-orm</artifactId>
	<version>${hibernate.search.version}</version>
</dependency>
<dependency>
	<groupId>org.hibernate.search</groupId>
	<artifactId>hibernate-search-backend-lucene</artifactId>
	<version>${hibernate.search.version}</version>
</dependency>

Making the Entity Search-aware

The next step is to add the relevant Hibernate Search annotations to your entity class. In our case, we will use the following Book entity; its Hibernate Search annotations are explained right after the listing:

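// javax.persistence matches the EntityManager used in this article;
// newer stacks use the jakarta.persistence package instead.
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;

@Indexed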
@Entity
@Table(name = "book")
public class Book {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id")
    private Long id;

    @FullTextField
    @Column(name = "title")
    private String title;

    @FullTextField
    @Column(name = "author")
    private String author;

    @Column(name = "publication_year")
    private int publicationYear;
    
    // Getters/Setters omitted for brevity
}

@Indexed: This annotation marks the entity class for indexing and searching with Hibernate Search. Here, it tells Hibernate Search to maintain a search index for the “Book” class.

@FullTextField: This Hibernate Search annotation marks a text property that should be analyzed (split into words and normalized) and indexed for full-text search. The “title” and “author” fields carry this annotation, so their values are searchable. The “publicationYear” field has no Hibernate Search annotation and is therefore not part of the index.
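To give an intuition for what "indexed as full-text" means, here is a simplified plain-Java sketch. It is not Hibernate Search or Lucene code: it only assumes that an analyzer roughly splits text into words and lowercases them, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-in for a full-text analyzer.
// Real analyzers (such as Lucene's) handle many more cases.
class FullTextAnalysisSketch {

    // Split on anything that is not a letter or digit, drop empty tokens, lowercase the rest.
    static List<String> analyze(String text) {
        return Arrays.stream(text.split("[^\\p{L}\\p{N}]+"))
                .filter(token -> !token.isEmpty())
                .map(token -> token.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // In this sketch, a field value matches when any query token equals any indexed token.
    static boolean matches(String fieldValue, String query) {
        List<String> indexedTokens = analyze(fieldValue);
        return analyze(query).stream().anyMatch(indexedTokens::contains);
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Great Gatsby"));            // [the, great, gatsby]
        System.out.println(matches("The Great Gatsby", "gatsby"));  // true
        System.out.println(matches("The Great Gatsby", "Hobbit"));  // false
    }
}
```

Because both the indexed value and the query go through the same analysis, a search for "gatsby" finds "The Great Gatsby" regardless of case.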

Adding the Search Session

In Hibernate Search, the SearchSession is the entry point for performing search operations.

We will include the following getSearchResult method in our Repository Class to obtain a Hibernate Search session and run a query:

private SearchResult<T> getSearchResult(String text, int limit, String[] fields) {
    // Entry point to Hibernate Search for the current EntityManager
    SearchSession searchSession = Search.session(entityManager);

    // Fuzzy match of the text against the given fields, returning at most `limit` hits
    SearchResult<T> result =
            searchSession
                    .search(getDomainClass())
                    .where(f -> f.match().fields(fields).matching(text).fuzzy(2))
                    .fetch(limit);
    return result;
}

This method performs the following steps:

It obtains a SearchSession by invoking Search.session(entityManager), where entityManager is an instance of javax.persistence.EntityManager.
It starts the search operation by invoking search() on the domain class returned by getDomainClass().
It defines the search criteria with a lambda expression (f -> ...) passed to where(). Here, the match() predicate looks for the given text (matching(text)) in the given fields (fields(fields)), and fuzzy(2) tolerates terms that are up to two edits away from the search text.
It fetches at most limit results by invoking fetch(limit).
Finally, it returns the SearchResult<T> object representing the search results.
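To illustrate what an edit distance of 2 means for fuzzy matching, here is a minimal plain-Java sketch of the classic Levenshtein distance. It is only an illustration under simplifying assumptions: the class and method names are hypothetical, and Lucene's actual fuzzy matching is implemented differently and may count some edits (such as swapping two adjacent characters) in another way.

```java
// Hypothetical helper illustrating "edit distance"; not part of Hibernate Search.
class EditDistanceSketch {

    // Classic Levenshtein distance: the minimum number of single-character
    // insertions, deletions, or substitutions needed to turn a into b.
    static int editDistance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // turning "" into the first j characters of b
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // turning the first i characters of a into ""
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int substitution = previous[j - 1] + cost;
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    // True when the candidate term is within maxEdits of the search term.
    static boolean withinFuzzyDistance(String term, String candidate, int maxEdits) {
        return editDistance(term, candidate) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(editDistance("tolkein", "tolkien"));         // 2
        System.out.println(withinFuzzyDistance("hobit", "hobbit", 2));  // true
        System.out.println(withinFuzzyDistance("gatsby", "hobbit", 2)); // false
    }
}
```

With a maximum distance of 2, a misspelled query such as "hobit" can still find "The Hobbit", while unrelated terms stay out of the results.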

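We can access the Repository Class from a @Service Class, which validates the requested fields and passes the search arguments on. Note that the service calls a public searchBy method on the repository; it is not listed here, but it presumably delegates to getSearchResult and returns the matching entities as a List: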

@Service
public class BookService {

    private BookRepository bookRepository;

    private static final List<String> SEARCHABLE_FIELDS = Arrays.asList("title","author");

    public BookService(BookRepository bookRepository) {
        this.bookRepository = bookRepository;
    }

    public List<Book> searchBooks(String text, List<String> fields, int limit) {

        List<String> fieldsToSearchBy = fields.isEmpty() ? SEARCHABLE_FIELDS : fields;

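        // Reject any field that is not indexed for full-text search
        boolean containsInvalidField = fieldsToSearchBy.stream().anyMatch(f -> !SEARCHABLE_FIELDS.contains(f));

        if (containsInvalidField) {
            throw new IllegalArgumentException("Searchable fields are: " + SEARCHABLE_FIELDS);
        }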

        return bookRepository.searchBy(
                text, limit, fieldsToSearchBy.toArray(new String[0]));
    }
}

Adding a Controller to execute Searches

Finally, to allow searching from a REST client, we will add a REST Controller which will expose a “/search” Endpoint:

@RestController
@RequestMapping("/book")
public class BookController {

    private BookService bookService;

    public BookController(BookService bookService) {
        this.bookService = bookService;
    }

    @GetMapping("/search")
    public List<Book> searchBooks(SearchRequestDTO searchRequestDTO) {

        System.out.println("Request for Book search received with data : " + searchRequestDTO);
        return bookService.searchBooks(searchRequestDTO.getText(), searchRequestDTO.getFields(), searchRequestDTO.getLimit());
    }
}

As you can see from the above Controller, we are wrapping the search arguments (text, fields and limit) in a SearchRequestDTO object. For the sake of brevity, we omit the SearchRequestDTO class here; you can find it in the project source code linked at the end of this article.
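For orientation, a DTO of this kind could look like the following sketch. It is a hypothetical reconstruction based only on the getters the Controller calls (getText(), getFields(), getLimit()); the default values are assumptions, and the real class in the project source may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real SearchRequestDTO is in the project source and may differ.
class SearchRequestDTO {

    private String text;

    // Defaults to an empty list so that fields.isEmpty() in the service never sees null.
    private List<String> fields = new ArrayList<>();

    // Assumed default number of results; not taken from the original project.
    private int limit = 10;

    public String getText() {
        return text;
    }

    public void setText(String text) {
        this.text = text;
    }

    public List<String> getFields() {
        return fields;
    }

    public void setFields(List<String> fields) {
        // Keep the "never null" guarantee even if a caller passes null.
        this.fields = fields == null ? new ArrayList<>() : fields;
    }

    public int getLimit() {
        return limit;
    }

    public void setLimit(int limit) {
        this.limit = limit;
    }

    @Override
    public String toString() {
        return "SearchRequestDTO{text='" + text + "', fields=" + fields + ", limit=" + limit + "}";
    }
}
```

Since the Controller parameter carries no annotation, Spring MVC populates such an object from the query parameters (text, fields, limit) through its setters.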

Configuring the DataSource and the Search

Next, let’s add the application.yml file to our sample project:

server:
    port: 8080

spring:
    datasource:
        url: jdbc:h2:mem:mydb
        username: sa
        password: sa
    jpa:
        open-in-view: false
        properties:
            hibernate:
                search:
                    backend:
                        type: lucene
                        directory.root: ./data/index

As you can see, we are using an embedded, in-memory H2 database for persistence. For Hibernate Search, we set the following properties under spring.jpa.properties.hibernate.search:

backend.type: lucene: Selects Apache Lucene as the search backend.
backend.directory.root: ./data/index: Defines the root directory where the search indexes are stored. In this case, the indexes are stored in the ./data/index directory, relative to the application’s working directory.

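Finally, in order to have some data that we can search, we will add the following import.sql to our project. Note that rows inserted with plain SQL bypass Hibernate Search’s automatic indexing, so the index has to be built explicitly (for example, with Hibernate Search’s mass indexer at application startup) before these rows can be found: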

insert into Book (id, title, author, publication_year) values (1,'The Great Gatsby', 'Francis Scott Fitzgerald', 1922);
insert into Book (id, title, author, publication_year) values (2,'To Kill a Mockingbird', 'Harper Lee', 1960);
insert into Book (id, title, author, publication_year) values (3,'1984', 'George Orwell', 1949);
insert into Book (id, title, author, publication_year) values (4,'Pride and Prejudice', 'Jane Austen', 1813);
insert into Book (id, title, author, publication_year) values (5,'The Catcher in the Rye', 'J.D. Salinger', 1951);
insert into Book (id, title, author, publication_year) values (6,'The Lord of the Rings', 'J.R.R. Tolkien', 1954);
insert into Book (id, title, author, publication_year) values (7,'The Hobbit', 'J.R.R. Tolkien', 1937);
insert into Book (id, title, author, publication_year) values (8,'Harry Potter and the Philosopher Stone', 'J.K. Rowling', 1997);
insert into Book (id, title, author, publication_year) values (9,'To Kill a Kingdom', 'Alexandra Christo', 2018);
insert into Book (id, title, author, publication_year) values (10,'The Alchemist', 'Paulo Coelho', 1988);

Running our Searches

Everything is ready for testing! Build and start the Spring Boot application with:

mvn install spring-boot:run

Here is a sample search which looks for the text “The” in the book titles:

curl -s 'http://localhost:8080/book/search?text=The&limit=5&fields=title' | jq

[
  {
    "id": 5,
    "title": "The Catcher in the Rye",
    "author": "J.D. Salinger",
    "publicationYear": 1951
  },
  {
    "id": 6,
    "title": "The Lord of the Rings",
    "author": "J.R.R. Tolkien",
    "publicationYear": 1954
  },
  {
    "id": 7,
    "title": "The Hobbit",
    "author": "J.R.R. Tolkien",
    "publicationYear": 1937
  },
  {
    "id": 10,
    "title": "The Alchemist",
    "author": "Paulo Coelho",
    "publicationYear": 1988
  },
  {
    "id": 1,
    "title": "The Great Gatsby",
    "author": "Francis Scott Fitzgerald",
    "publicationYear": 1922
  }
]

Finally, here’s another search, this time on the author’s name:

curl -s 'http://localhost:8080/book/search?text=Tolkien&limit=5&fields=author' | jq

[
  {
    "id": 7,
    "title": "The Hobbit",
    "author": "J.R.R. Tolkien",
    "publicationYear": 1937
  },
  {
    "id": 6,
    "title": "The Lord of the Rings",
    "author": "J.R.R. Tolkien",
    "publicationYear": 1954
  }
]
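The same searches can also be issued from Java. The following is a minimal sketch using the JDK 11+ HttpClient; it assumes the application is running on http://localhost:8080 as configured above, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the /book/search endpoint shown in this article.
class BookSearchClientSketch {

    // Builds the search URL, URL-encoding the user-supplied values.
    static URI buildSearchUri(String baseUrl, String text, int limit, String field) {
        String query = "text=" + URLEncoder.encode(text, StandardCharsets.UTF_8)
                + "&limit=" + limit
                + "&fields=" + URLEncoder.encode(field, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/book/search?" + query);
    }

    // Sends the GET request and returns the raw JSON response body.
    static String search(String baseUrl, String text, int limit, String field) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(buildSearchUri(baseUrl, text, limit, field))
                .GET()
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Requires the sample application to be running locally.
        System.out.println(search("http://localhost:8080", "Tolkien", 5, "author"));
    }
}
```

Encoding the parameters matters as soon as the search text contains spaces or special characters, for example when searching for "To Kill".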

Conclusion

Hibernate Search brings full-text search capabilities to Hibernate-based applications. In this tutorial, we annotated an entity for indexing, queried it through a SearchSession with a fuzzy match predicate, and exposed the search through a REST endpoint, using Apache Lucene as the backend.

Source code: https://github.com/fmarchioni/mastertheboss/tree/master/hibernate/hibernate-search
