About Martin Mois

Martin is a Java EE enthusiast and works for an international operating company. He is interested in clean code and the software craftsmanship approach. He also strongly believes in automated testing and continuous integration.

Efficiently delete data with JPA and Hibernate

You may come to the situation where you have to perform a bulk deletion on a huge amount of datasets stored in a relational database. If you use JPA with Hibernate as underlying OR mapper, you might try to call the remove() method of the EntityManager in a way like the following:
 
 
 
 
 
 
 

public void removeById(long id) {
    RootEntity rootEntity = entityManager.getReference(RootEntity.class, id);
    entityManager.remove(rootEntity);
}

First of all, we load a reference representation of the entity we want to delete and then pass this reference to the EntityManager. Let’s assume the RootEntity from above has a child relation to a class called ChildEntity:

@OneToMany(mappedBy = "rootEntity", fetch = FetchType.EAGER, cascade = CascadeType.ALL)
private Set childEntities = new HashSet(0);

If we now turn on the property show_sql of hibernate, we will wonder what SQL statements are issued:

select
        rootentity0_.id as id5_1_,
        rootentity0_.field1 as field2_5_1_,
        rootentity0_.field2 as field3_5_1_,
        childentit1_.PARENT as PARENT5_3_,
        childentit1_.id as id3_,
        childentit1_.id as id4_0_,
        childentit1_.field1 as field2_4_0_,
        childentit1_.field2 as field3_4_0_,
        childentit1_.PARENT as PARENT4_0_
    from
        ROOT_ENTITY rootentity0_
    left outer join
        CHILD_ENTITY childentit1_
            on rootentity0_.id=childentit1_.PARENT
    where
        rootentity0_.id=?

    delete
    from
        CHILD_ENTITY
    where
        id=?

   delete
   from
       ROOT_ENTITY
   where
       id=?

Why does Hibernate first load all data into memory in order to delete this data immediately afterwards? The reason is that JPA’s lifecycle requires that the object is in “managed” state, before it can be deleted. Only in this state all lifecycle functionality like interceptors is available (see here). Therefore Hibernate issues a SELECT query before the deletion in order to transfer both RootEntity and ChildEntity to the “managed” state. But what can we do, if we just want to delete RootEntity and ChildEntity, if we know the id of RootEntity? The answer is to use a simple DELETE query like the following one. But due to the integrity constraint on the child table, we first have to delete all depending child entities. The following code demonstrates how:

List childIds = entityManager.createQuery("select c.id from ChildEntity c where c.rootEntity.id = :pid").setParameter("pid", id).getResultList();
for(Long childId : childIds) {
    entityManager.createQuery("delete from ChildEntity c where c.id = :id").setParameter("id", childId).executeUpdate();
}
entityManager.createQuery("delete from RootEntity r where r.id = :id").setParameter("id", id).executeUpdate();

The above code results in the three SQL statements we would have expected by calling remove(). Now you may argue, that this way of deletion is more complicated than just calling the EntityManager’s remove() method. It also ignores annotations like @OneToMany and @ManyToOne we have placed in the two entity classes. So why not write some code that uses the knowledge about the two entities that already exists in the two class files? First of all, we look for @OneToMany annotations using reflection in the RootEntity class, extract the type of the child entity and then look for its back relation field annotated with @ManyToOne. Having done this, we can easily write the three SQL statements in a more generic way:

public void delete(EntityManager entityManager, Class parentClass, Object parentId) {
    Field idField = getIdField(parentClass);
    if (idField != null) {
        List oneToManyFields = getOneToManyFields(parentClass);
        for (Field field : oneToManyFields) {
            Class childClass = getFirstActualTypeArgument(field);
            if (childClass != null) {
                Field manyToOneField = getManyToOneField(childClass, parentClass);
                Field childClassIdField = getIdField(childClass);
                if (manyToOneField != null && childClassIdField != null) {
                    List childIds = entityManager.createQuery(String.format("select c.%s from %s c where c.%s.%s = :pid", childClassIdField.getName(), childClass.getSimpleName(), manyToOneField.getName(), idField.getName())).setParameter("pid", parentId).getResultList();
                    for (Long childId : childIds) {
                        entityManager.createQuery(String.format("delete from %s c where c.%s = :id", childClass.getSimpleName(), childClassIdField.getName())).setParameter("id", childId).executeUpdate();
                    }
                }
            }
        }
        entityManager.createQuery(String.format("delete from %s e where e.%s = :id", parentClass.getSimpleName(), idField.getName())).setParameter("id", parentId).executeUpdate();
    }
}

The methods getFirstActualTypeArgument(), getManyToOneField(), getIdField() and getOneToManyFields() in the code above are not depicted here, but do what their name sounds like. Once implemented we can easily delete all entities beginning with the root of the tree.

  • A simple example application that can be used to examine the behavior and solution described above, can be found on github.

 

Related Whitepaper:

Functional Programming in Java: Harnessing the Power of Java 8 Lambda Expressions

Get ready to program in a whole new way!

Functional Programming in Java will help you quickly get on top of the new, essential Java 8 language features and the functional style that will change and improve your code. This short, targeted book will help you make the paradigm shift from the old imperative way to a less error-prone, more elegant, and concise coding style that’s also a breeze to parallelize. You’ll explore the syntax and semantics of lambda expressions, method and constructor references, and functional interfaces. You’ll design and write applications better using the new standards in Java 8 and the JDK.

Get it Now!  

3 Responses to "Efficiently delete data with JPA and Hibernate"

  1. Marcelo says:

    What about second level cache? Are the records removed from the cache when HQL/JPQL are used?

    • Martin says:

      When you use entityManager.remove() they will be removed from the cache. In Hibernate you can also use the dedicated method session.evict() to remove an entity explicitly.

      But in the use case above we do not want to load the entites into the first or second level cache before deleting them. The goal was to delete without loading them into the cache. This way you can delete hundreds of entites with one statement.

  2. josete says:

    Then, why do we use JPA ? Let’s ‘rollback’ to plain old JDBC

Leave a Reply


+ two = 4



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.

Sign up for our Newsletter

20,709 insiders are already enjoying weekly updates and complimentary whitepapers! Join them now to gain exclusive access to the latest news in the Java world, as well as insights about Android, Scala, Groovy and other related technologies.

As an extra bonus, by joining you will get our brand new e-books, published by Java Code Geeks and their JCG partners for your reading pleasure! Enter your info and stay on top of things,

  • Fresh trends
  • Cases and examples
  • Research and insights
  • Two complimentary e-books