About Vlad Mihalcea

Vlad Mihalcea is a software architect passionate about software integration, high scalability and concurrency challenges.

Hibernate Facts: Favoring bidirectional Set(s) vs List(s)

Hibernate is a great ORM tool, and it eases development considerably, but it has a lot of gotchas you must be aware of if you want to use it properly.

On medium to large projects it’s very common to have bidirectional parent-child associations, which allow us to navigate both ends of a given relationship.

When it comes to controlling the persist/merge part of the association, there are two options available. One would be to have the @OneToMany end in charge of synchronizing the collection changes, but this is an inefficient approach, which is very well described here.

The more common approach is to let the @ManyToOne side control the association, with the @OneToMany end using the “mappedBy” option.

I will discuss the latter approach, since it’s the most common and the most efficient one in terms of the number of executed queries.

So, for bidirectional collections we could use a java.util.List or a java.util.Set.

According to Hibernate docs, lists and bags are more efficient than sets.

But I still get anxious when I see the following code:

@Entity
public class Parent {

    ...

    // Inverse side: the foreign key is managed through Child.parent
    @OneToMany(cascade = CascadeType.ALL, mappedBy = "parent", orphanRemoval = true)
    private List<Child> children = new ArrayList<>();

    public List<Child> getChildren() {
        return children;
    }

    // Keep both ends of the association in sync
    public void addChild(Child child) {
        children.add(child);
        child.setParent(this);
    }

    public void removeChild(Child child) {
        children.remove(child);
        child.setParent(null);
    }
}

@Entity
public class Child {

    ...

    @ManyToOne
    private Parent parent;

    public Parent getParent() {
        return parent;
    }

    public void setParent(Parent parent) {
        this.parent = parent;
    }
}

Parent parent = loadParent(parentId);
Child child1 = new Child();
child1.setName("child1");
Child child2 = new Child();
child2.setName("child2");
parent.addChild(child1);
parent.addChild(child2);
entityManager.merge(parent);

This is because for the last five years I’ve been getting duplicate children inserted whenever the merge operation is called on the parent and cascaded to the association. This happens because of the following issues: HHH-3332 and HHH-5855.

I’ve been testing some Hibernate versions lately, and this still reproduces on versions 3.5.6, 3.6.10 and 4.2.6. So, after five years of seeing this on many projects, you understand why I’m skeptical about using Lists rather than Sets.

This is what I get when running a test case that reproduces the issue. For two added children, four inserts are executed instead of two:

select parent0_.id as id1_2_0_ from Parent parent0_ where parent0_.id=?
insert into Child (id, name, parent_id) values (default, ?, ?)
insert into Child (id, name, parent_id) values (default, ?, ?)
insert into Child (id, name, parent_id) values (default, ?, ?)
insert into Child (id, name, parent_id) values (default, ?, ?)

This issue only reproduces if a merge operation is cascaded from parent to children, and there are workarounds like:

  • merging the child instead of the parent
  • persisting the children prior to merging the parent
  • removing CascadeType.ALL or CascadeType.MERGE from the parent, since the issue only affects the merge operation and not the persist one.

But all of those are hacks, and are very difficult to follow on a large-scale project, with many developers working on the same code base.

So, my preferred way is to use Sets, even if they are sometimes less efficient than Lists. Since I always favor correctness over performance optimization, I am better off using Sets.
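As a rough analogy only (this is plain Java, not a reproduction of what Hibernate does internally during a cascaded merge), the sketch below shows why a Set is more forgiving than a List when the same child ends up being added twice. DemoChild and ListVersusSetDemo are hypothetical names, and the JPA annotations are left out so the code runs on a plain JDK.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;

// Hypothetical stand-in for an entity; no JPA annotations, so it runs on a plain JDK.
class DemoChild {
    final String name;

    DemoChild(String name) {
        this.name = name;
    }
}

class ListVersusSetDemo {

    // Adds the same child instance twice and reports how many elements the collection holds.
    static int sizeAfterAddingTwice(Collection<DemoChild> children) {
        DemoChild child = new DemoChild("child1");
        children.add(child);
        children.add(child); // second add of the very same instance
        return children.size();
    }

    public static void main(String[] args) {
        // A List keeps both entries, a Set keeps only one.
        System.out.println("List: " + sizeAfterAddingTwice(new ArrayList<DemoChild>()));
        System.out.println("Set:  " + sizeAfterAddingTwice(new LinkedHashSet<DemoChild>()));
    }
}
```

With a List the duplicate entry survives and could be flushed as an extra insert, while the Set silently drops it.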

When it comes to these types of problems, it’s good to have code conventions, as they are easy to add to a project’s development guidelines and are also easier to remember and adopt.

One advantage of using Sets is that it forces you to define a proper equals/hashCode strategy, which should always include the entity’s business key. A business key is a combination of fields that is unique, or unique among a parent’s children, and that stays the same both before and after the entity is persisted into the database.
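A minimal sketch of such a strategy follows, assuming the child’s name is the business key (the article does not say which fields make up the key, so this is an illustrative choice). BusinessKeyChild and BusinessKeyDemo are hypothetical names, and the JPA annotations are omitted so the code compiles without a persistence provider.

```java
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Plain-Java sketch of an entity whose equality relies on a business key, not on the generated id.
class BusinessKeyChild {

    private Long id;           // database-generated, null until the row is inserted
    private final String name; // assumed business key, unique among a parent's children

    BusinessKeyChild(String name) {
        this.name = name;
    }

    void setId(Long id) {
        this.id = id;
    }

    Long getId() {
        return id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof BusinessKeyChild)) return false;
        return Objects.equals(name, ((BusinessKeyChild) o).name);
    }

    @Override
    public int hashCode() {
        // The id is deliberately ignored, so the hash does not change once the id is assigned.
        return Objects.hashCode(name);
    }
}

class BusinessKeyDemo {

    // Two children with the same business key collapse into one Set element.
    static int distinctChildren() {
        Set<BusinessKeyChild> children = new LinkedHashSet<>();
        children.add(new BusinessKeyChild("child1"));
        children.add(new BusinessKeyChild("child1"));
        return children.size();
    }

    // The element is still found after the id is assigned, because the id is not part of the hash.
    static boolean foundAfterIdAssigned() {
        Set<BusinessKeyChild> children = new LinkedHashSet<>();
        BusinessKeyChild child = new BusinessKeyChild("child1");
        children.add(child);
        child.setId(1L); // simulates the id being generated on insert
        return children.contains(child);
    }
}
```

Had hashCode been based on the generated id, the element would land in a different hash bucket once the id is assigned, and the Set could no longer find it.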

If you are worried you are going to lose the List ability of saving the children in the same order you’ve added them, then you can still emulate this for Sets too.

By default Sets are unordered and unsorted, but even if you can’t order them you may still sort them by a given column, by using the @OrderBy JPA annotation like this:

@Entity
public class LinkedParent {

    ...

    // @OrderBy sorts the children by id when they are loaded;
    // the LinkedHashSet preserves that order during iteration
    @OneToMany(cascade = CascadeType.ALL, mappedBy = "parent", orphanRemoval = true)
    @OrderBy("id")
    private Set<LinkedChild> children = new LinkedHashSet<>();

    ...

    public Set<LinkedChild> getChildren() {
        return children;
    }

    public void addChild(LinkedChild child) {
        children.add(child);
        child.setParent(this);
    }

    public void removeChild(LinkedChild child) {
        children.remove(child);
        child.setParent(null);
    }
}

When the parent’s children are loaded, the generated SQL is like:

select children0_.parent_id as parent_i3_3_1_, children0_.id as id1_2_1_, children0_.id as id1_2_0_, children0_.name as name2_2_0_, children0_.parent_id as parent_i3_2_0_ from LinkedChild children0_ where children0_.parent_id=? order by children0_.id
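One thing to keep in mind is that the “order by” clause sorts the rows only when the collection is loaded. The plain-Java sketch below assumes the loaded collection behaves like a LinkedHashSet (OrderedSetDemo is a hypothetical name, and Long ids stand in for the child entities): elements keep the order in which the rows arrived, and a child added later in memory goes to the end rather than into its sorted position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OrderedSetDemo {

    // Simulates ids arriving in "order by id" order, followed by an in-memory add.
    static List<Long> iterationOrder() {
        Set<Long> childIds = new LinkedHashSet<>(Arrays.asList(1L, 2L, 3L));
        childIds.add(0L); // appended last, not re-sorted
        return new ArrayList<>(childIds);
    }

    public static void main(String[] args) {
        System.out.println(iterationOrder()); // prints [1, 2, 3, 0]
    }
}
```

The sorted order is restored the next time the collection is fetched from the database.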

Conclusion:

If your domain model requires a List, then a Set will break your constraint, since it disallows duplicates. But if you need duplicates you can still use an indexed List. A Bag is said to be unsorted and “unordered” (even if it retrieves the children in the order they were added to the database table). So an indexed List would also be a good candidate, right?

I also wanted to draw attention to a five-year-old bug that affects multiple Hibernate versions and that I have reproduced on multiple projects. There are workarounds of course, like removing CascadeType.MERGE or merging the children instead of the parent, but many developers are unaware of this issue and its workarounds.

According to Hibernate docs: Sets are “the recommended way to represent many-valued associations“, and I’ve seen many cases where Bags were employed as the default bi-directional collection, even if a set would have been a better choice anyway.

So, I’m still cautious about Bags, and if my domain model requires a List I’d always pick the indexed one.

 




Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
