Hibernate fetching performance tuning

Joining tables in SQL is the foundation of a relational database, as joins allow you to actually model relationships between tables. In plain JDBC, joins between tables are composed using native SQL statements. On the other hand, when using Hibernate/JPA this is greatly simplified as you will specify the relation between tables in terms of Objects. A key aspect of it is that, when you define a relationship in an ORM tool like Hibernate or JPA, you’ll also need to define the fetch type.

TIP: The fetch type essentially determines whether or not to load all of the relationships of a particular object/table as soon as the object/table is initially fetched.

There are two strategies for fetch type:

EAGER loading of related tables means that they are fetched fully at the time their parent is fetched. So if you have a `Customer` Entity and it has `List`, all the Customer’s requests are fetched from the database at the time the Customer is fetched.
LAZY loading means that the contents of the List are fetched only when you try to access them. For example, by calling `customer.getRequests().iterator()`. Calling any access method on the List will initiate a call to the database to retrieve the elements. This is implemented by creating a Proxy around the List (or Set).

As you might understand, the recommended approach (and the default one too) is to use **Lazy initialization** which improves performance by avoiding unnecessary computation and reduce memory requirements.

On the other hand, by using the default LAZY loading strategy does not mean that you are using automatically the best strategy for optimizing performance. Let’s consider the following relation:

hibernate fetching performance tuning

As you can see, the Customer has a 1:n relationship with the Request table. So what would happen if you try to access all Requests from a Customer? this is the naive approach to it:

int total = 0;
List<Customer> list = em.createQuery("from Customer").getResultList();
for (Customer c : list) {
    List<Request> listRequest = c.getRequests();
    for (Request request : listRequest) {

        total += request.getQuantity();
    }
}

So, we retrieve all requests for each Customer and we sum their quantity. Let’s see what Hibernate will do behind the hoods:

SELECT customer0_.id           AS id1_0_,
       customer0_.address      AS address2_0_,
       customer0_.email        AS email3_0_,
       customer0_.NAME         AS name4_0_,
       customer0_.phone_number AS phone_nu5_0_
FROM   customer customer0_  

SELECT requests0_.id_customer AS id_custo3_1_1_,
       requests0_.id          AS id1_1_1_,
       requests0_.id          AS id1_1_0_,
       requests0_.id_customer AS id_custo3_1_0_,
       requests0_.quantity    AS quantity2_1_0_
FROM   request requests0_
WHERE  requests0_.id_customer IN ( ?, ?, ?, ?,
                                   ?, ?, ?, ?,
                                   ?, ? )

As you can see, Hibernate performed an additional query to initialize the relation. That doesn’t sound like a huge problem, however, what happens in a real world scenario where you might have dozens of associations and thousands of customers? That is called the `n+1 select issue`, and it is known to be the biggest performance penalty when using Hibernate or JPA. Let’s see some options which could help to solve this issue.

Option 1: Use a Fetch Join in your JPQL

The JOIN FETCH in JPQL is not a regular SQL JOIN. The purpose of this construct is to specify related objects that should be fetched from the database with the query results on the same round trip. Using this query improves the efficiency of iteration over the Customer objects because it eliminates the need for retrieving the associated Request objects separately.

Here is how you could apply a Fetch Join in our example:

List<Customer> list = em.createQuery("FROM Customer c JOIN FETCH c.requests i WHERE c.id = :id").getResultList();

Looking at the Hibernate logs, you will see a single query has been issued to retrieve the Customer’s requests:

SELECT customer0_.id           AS id1_0_0_,
       requests1_.id           AS id1_1_1_,
       customer0_.address      AS address2_0_0_,
       customer0_.email        AS email3_0_0_,
       customer0_.NAME         AS name4_0_0_,
       customer0_.phone_number AS phone_nu5_0_0_,
       requests1_.id_customer  AS id_custo3_1_1_,
       requests1_.quantity     AS quantity2_1_1_,
       requests1_.id_customer  AS id_custo3_1_0__,
       requests1_.id           AS id1_1_0__
FROM   customer customer0_
       INNER JOIN request requests1_
               ON customer0_.id = requests1_.id_customer
WHERE  customer0_.id = 1

So we have solved the performance problem. However, the cost of it is that you need to change your SQL code which executes the query. This can get quite messy if you have multiple associations in your query, therefore it’s not always the best approach to limit the number of performed queries.

Option 2: Use the Criteria API

By using the Criteria API, you will be able to use a **CriteriaBuilder** object to restrict the query results based on specific conditions. By using CriteriaQuery `where()` method, we can specify our expression that will return the actual Result. Let’s see in practice how to apply Criteria API to our example:

CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery q = cb.createQuery(Customer.class);
Root o = q.from(Customer.class);
o.fetch("requests", JoinType.INNER);
q.select(o);
q.where(cb.equal(o.get("id"), 1));

Customer customer = (Customer) this.em.createQuery(q).getSingleResult();

List<Request> listRequest = customer.getRequests();
for (Request request : listRequest) {

    total += request.getQuantity();
}

As for the Fetch Join, the output on the console shows that a single Select (using an inner join) is used to retrieve all Request for a Customer:

SELECT customer0_.id           AS id1_0_0_,
       requests1_.id           AS id1_1_1_,
       customer0_.address      AS address2_0_0_,
       customer0_.email        AS email3_0_0_,
       customer0_.NAME         AS name4_0_0_,
       customer0_.phone_number AS phone_nu5_0_0_,
       requests1_.id_customer  AS id_custo3_1_1_,
       requests1_.quantity     AS quantity2_1_1_,
       requests1_.id_customer  AS id_custo3_1_0__,
       requests1_.id           AS id1_1_0__
FROM   customer customer0_
       INNER JOIN request requests1_
               ON customer0_.id = requests1_.id_customer
WHERE  customer0_.id = 1

Also in this case, the pros and cons are the same discussed in the JPQL query with a fetch join. That is, you will be able to resolve the n+1 problem and improve the performance of your queries, however you need to adapt your code to use Criteria API to perform your searches. On the other hand, if you are already using the Criteria API in your applications, then you have nothing to worry about!

Option 3: Use Entity Graph

A new feature of JPA 2.1 is the Entity Graph You can use it to define a graph of entities that will be queried from the database. Defining an Entity Graph is done via annotations, therefore you don’t need to vary your Query. For example, here is how to define the Entity graph on your Customer Entity:

@Entity
@Table
@NamedEntityGraph(name = "graph.Customer.requests", attributeNodes = @NamedAttributeNode("requests"))

public class Customer implements Serializable {

}

An Entity Graph only requires a name and the list of attributes which needs to be fetched eagerly. In the above example, the field `requests` returns the collection of Request for one Customer:

@OneToMany(mappedBy = "customer", fetch = FetchType.LAZY)
private List<Request> requests;

Now you can use the Entity graph to control how data is fetched for a specific query. You have to instantiate an EntityGraph (based on the @NamedEntityGraph definition) and provide it as a hint to the `EntityManager.find()` method or your query. The named entity graph can then be used by the find method of the EntityManager:

EntityGraph graph = this.em.getEntityGraph("graph.Customer.requests");

Map hints = new HashMap();
hints.put("javax.persistence.fetchgraph", graph);

Customer customer = this.em.find(Customer.class, new Long(1), hints);

List<Request> listRequest = customer.getRequests();

The resulting SQL query follows:

SELECT customer0_.id          AS id1_0_0_,
       customer0_.address      AS address2_0_0_,
       customer0_.email        AS email3_0_0_,
       customer0_.NAME         AS name4_0_0_,
       customer0_.phone_number AS phone_nu5_0_0_,
       requests1_.id_customer  AS id_custo3_1_1_,
       requests1_.id           AS id1_1_1_,
       requests1_.id           AS id1_1_2_,
       requests1_.id_customer  AS id_custo3_1_2_,
       requests1_.quantity     AS quantity2_1_2_
FROM   customer customer0_
       LEFT OUTER JOIN request requests1_
                    ON customer0_.id = requests1_.id_customer
WHERE  customer0_.id = ?

As you can see from the above SQL, a left outer join is used to fetch the data. This join query can be useful where we want to see the matching records but at the same time want to see the record from the first table where matching values in the join condition may be absent.

Using Dynamic Entity Graph

Besides using annotations to define a named Entity Graph, it is also possible to use a Java API to achieve a Dynamic Entity Graph. Let’s have a look at it:

EntityGraph graph = em.createEntityGraph(Customer.class);
Subgraph requestGraph = graph.addSubgraph("requests");
    
Map hints = new HashMap();
hints.put("javax.persistence.loadgraph", graph);
  
Customer customer = em.find(Customer.class, id, hints);

Using a Dynamic Entity Graph can be useful for creating a specific use case which you don’t want to reuse in other parts of your application which might need to access your relations using a different fetch strategy.

This is an excerpt from WildFly Performance Tuning book, a comprhensive guide about improving the performance of applications running on the top of WildFly application server.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.