How to find which statement failed in a JDBC Batch Update
Introduction
Yesterday, my Danish friend, Flemming Harms, asked me a very interesting question about what happens when a JDBC batch update fails.
Basically, since we are grouping several DML statements into a single batch, we need a way to tell which statement caused the failure. This post answers that question in detail.
Simulating a failure
Considering we have a Post entity, whose identifiers are manually assigned:
@Entity(name = "Post") @Table(name = "post") public class Post { @Id private Long id; private String title; //getters and setters omitted for brevity }
Now, we are going to use JDBC batching to group several INSERT statements. For this purpose, we will use a PreparedStatement, since it responds better to JDBC batching than a plain java.sql.Statement.
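For contrast, this is roughly what the same batch would look like with a plain java.sql.Statement, where every INSERT carries its values inlined in the SQL text instead of using bind parameters. This is an illustrative sketch only, assuming the same JDBC connection as in the example below; the inlined values are made up:

// Illustrative only: batching with a plain Statement sends complete SQL
// strings, so the database cannot reuse one parsed statement for the batch.
try (Statement st = connection.createStatement()) {
    for (long i = 0; i < 5; i++) {
        st.addBatch(String.format(
            "INSERT INTO post (id, title) " +
            "VALUES (%d, 'High-Performance Java Persistence, Part %d')",
            i, i
        ));
    }
    st.executeBatch();
}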
In the following example, to simulate a failure, we are going to assign the same Primary Key to multiple records so that the database rejects them with a constraint violation:
Session session = entityManager.unwrap(Session.class);

session.doWork(connection -> {
    try (PreparedStatement st = connection.prepareStatement(
            "INSERT INTO post (id, title) " +
            "VALUES (?, ?)")) {
        for (long i = 0; i < 5; i++) {
            st.setLong(1, i % 2);
            st.setString(2, String.format(
                "High-Performance Java Persistence, Part %d", i
            ));
            st.addBatch();
        }
        st.executeBatch();
    }
    catch (BatchUpdateException e) {
        LOGGER.info(
            "Batch has managed to process {} entries",
            e.getUpdateCounts().length
        );
    }
});
When running the test case above, Hibernate generates the following output:
c.v.b.h.h.b.BatchExceptionTest - testInsertPosts

n.t.d.l.SLF4JQueryLoggingListener -
Name:DATA_SOURCE_PROXY, Time:0, Success:False,
Type:Prepared, Batch:True, QuerySize:1, BatchSize:5,
Query:["INSERT INTO post (id, title) VALUES (?, ?)"],
Params:[
    (0, High-Performance Java Persistence, Part 0),
    (1, High-Performance Java Persistence, Part 1),
    (0, High-Performance Java Persistence, Part 2),
    (1, High-Performance Java Persistence, Part 3),
    (0, High-Performance Java Persistence, Part 4)
]

c.v.b.h.h.b.BatchExceptionTest - Batch has managed to process 2 entries
So, from the logged JDBC output, we can see that the third statement conflicts with the first one. However, even without the log, we can pinpoint the failing statement by introspecting the result of the getUpdateCounts method on the java.sql.BatchUpdateException thrown by the JDBC Driver.
Because the getUpdateCounts method returns an int[] array with two entries, we know that only two statements were successfully processed. So, the third statement was the one causing the failure.
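To turn that observation into code, a small helper can derive the zero-based index of the failing statement from the exception. The failedStatementIndex helper below is hypothetical, not part of the original example; note that, per the BatchUpdateException contract, some drivers keep processing after a failure and mark the failed entries with Statement.EXECUTE_FAILED instead of truncating the array:

import java.sql.BatchUpdateException;
import java.sql.Statement;

public class BatchFailureUtil {

    /**
     * Derives the zero-based batch index of the first failing statement.
     * With drivers that stop at the first failure, getUpdateCounts() returns
     * one entry per successfully executed statement, so the array length is
     * also the index of the statement that failed.
     */
    public static int failedStatementIndex(BatchUpdateException e) {
        int[] updateCounts = e.getUpdateCounts();
        // Drivers that keep processing return a full-size array and flag the
        // failed entries with Statement.EXECUTE_FAILED.
        for (int i = 0; i < updateCounts.length; i++) {
            if (updateCounts[i] == Statement.EXECUTE_FAILED) {
                return i;
            }
        }
        return updateCounts.length;
    }
}

In the test case above, this helper would return 2, pointing to the third statement in the batch.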
Conclusion
To wrap it up, we need two things:
- A good way to log JDBC statements, and datasource-proxy is one of the best ways of logging JDBC PreparedStatement(s) (see the setup sketch after this list)
- A way to catch the BatchUpdateException and find the number of statements that were successfully executed by calling the getUpdateCounts method
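As a rough illustration of the first point, the logging proxy used in this article can be configured along the lines below. This is a minimal sketch based on datasource-proxy's ProxyDataSourceBuilder; the originalDataSource variable and the DataSourceProxyConfig class name are assumptions, not taken from the article:

import javax.sql.DataSource;

import net.ttddyy.dsproxy.listener.logging.SLF4JLogLevel;
import net.ttddyy.dsproxy.support.ProxyDataSourceBuilder;

public class DataSourceProxyConfig {

    // Wraps the actual DataSource so that every statement, including batched
    // PreparedStatement(s), goes through the SLF4J query logging listener.
    public static DataSource loggingDataSource(DataSource originalDataSource) {
        return ProxyDataSourceBuilder
            .create(originalDataSource)
            .name("DATA_SOURCE_PROXY")
            .logQueryBySlf4j(SLF4JLogLevel.INFO)
            .build();
    }
}

Hibernate is then pointed at the proxied DataSource, which is what produces the SLF4JQueryLoggingListener output shown earlier.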
This way, you will be able to locate which batched statement is causing an issue, and this technique is applicable to any production system. You just have to make sure you collect all logs properly (e.g. Logstash) so that you can query them when a problem occurs.
Code available on GitHub.
