Careful With Native SQL in Hibernate

I really like Hibernate, but I also don’t know a tool that would be nearly as powerful and deceptive at the same time. I could write a book on surprises in production and cargo cult programming related to Hibernate alone. It’s more of an issue with the users than with the tool, but let’s not get too ranty.

So, here’s a recent example.

Problem

We need a background job that lists all files in a directory and inserts an entry for each of them to a table.

Naive Solution

The job used to be written in Bash and there is some direct SQL reading from the table. So, blinders on and let’s write some direct SQL!

for (String fileName : folder.list()) {
    SQLQuery sql = session.getDelegate().createSQLQuery(
        "insert into dir_contents values (?)");
    sql.setString(0, fileName);
    sql.executeUpdate();
}

Does it work? Sure it does.

Now, what happens if there are 10,000 files in the folder? What if you also have a not so elegant domain model, with way too many entity classes, thousands of instances and two levels of cache all in one context?

All of a sudden this trivial job takes 10 minutes to execute, all that time keeping 2 or 3 CPUs busy at 100%.

What, for just a bunch of inserts?

Easy Fix

The problem is that it’s Hibernate. It’s not just a dumb JDBC wrapper, but it has a lot more going on. It’s trying to keep caches and session state up to date. If you run a bare SQL update, it has no idea what table(s) you are updating, what it depends on and how it affects everything, so just in case it pretty much flushes everything.

If you do this 10,000 times in such a crowded environment, it adds up.

Here’s one way to fix it – rather than running 10,000 updates with flushes, execute everything in one block and flush once.

session.doWork(new Work() {
    public void execute(Connection connection) throws SQLException {
        PreparedStatement ps = connection
                .prepareStatement("insert into dir_contents values (?)");
        for (String fileName : folder.list()) {
            ps.setString(1, fileName);
            ps.executeUpdate();
        }
    }
});

Other Solutions

Surprise, surprise:

  • Do use Hibernate. Create a real entity to represent DirContents and just use it like everything else. Then Hibernate knows what caches to flush when, how to batch updates and so on.
  • Don’t use Hibernate. Use plain old JDBC, MyBatis, or whatever else suits your stack or is there already.

Takeaway

Native SQL has its place, even if this example is not the best use case. Anyway, the point is: If you are using native SQL with Hibernate, mind the session state and caches.

Reference: Careful With Native SQL in Hibernate from our JCG partner Konrad Garus at the Squirrel’s blog.
Related Whitepaper:

Functional Programming in Java: Harnessing the Power of Java 8 Lambda Expressions

Get ready to program in a whole new way!

Functional Programming in Java will help you quickly get on top of the new, essential Java 8 language features and the functional style that will change and improve your code. This short, targeted book will help you make the paradigm shift from the old imperative way to a less error-prone, more elegant, and concise coding style that’s also a breeze to parallelize. You’ll explore the syntax and semantics of lambda expressions, method and constructor references, and functional interfaces. You’ll design and write applications better using the new standards in Java 8 and the JDK.

Get it Now!  

5 Responses to "Careful With Native SQL in Hibernate"

  1. Angel says:

    Or you can just use Hibernate scrollable results (org.hibernate.ScrollableResults)

    http://www.numerati.com/2012/06/26/reading-large-result-sets-with-hibernate-and-mysql/

    The biggest problem of Hibernate is the people that do not know how to use it and that don’t take time to learn how to use it properly.

    How could you possibly think that a library like Hibernate, developed by very smart people and tested and used by thousands of developers, would not cover that use case that you present here.

    No offense :-)

  2. Angel says:

    In the case of an insert. You just need to use StatelessSession.

  3. Angel – I agree wholeheartedly that the main issue with Hibernate is incompetent and careless users.

    Anyway, how does “reading large sets” relate to *inserting* large sets that I am writing about here?

    • Angel says:

      Hi Konrad,

      I read your article and I did not reply straight away. In the mean time and until I replied I probably confused your post with a similar one that was about read operations on large data sets. Later on I realized my mistake and that is when I said that for batch write operation the best strategy is to use StalessSession and to flush it every hundred or so update operations :-)
      Sorry about that.

  4. StatelessSession is another good option, thank you.

Leave a Reply


9 + = eleven



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.

Sign up for our Newsletter

20,709 insiders are already enjoying weekly updates and complimentary whitepapers! Join them now to gain exclusive access to the latest news in the Java world, as well as insights about Android, Scala, Groovy and other related technologies.

As an extra bonus, by joining you will get our brand new e-books, published by Java Code Geeks and their JCG partners for your reading pleasure! Enter your info and stay on top of things,

  • Fresh trends
  • Cases and examples
  • Research and insights
  • Two complimentary e-books