How to find which statement failed in a JDBC Batch Update

Are you struggling with performance issues in your Spring, Jakarta EE, or Java EE application?

What if there were a tool that could automatically detect what caused performance issues in your JPA and Hibernate data access layer?

Wouldn’t it be awesome to have such a tool to watch your application and prevent performance issues during development, long before they affect production systems?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, Micronaut, or Play Framework.

So, rather than fixing performance issues in your production system on a Saturday night, you are better off using Hypersistence Optimizer to help you prevent those issues so that you can spend your time on the things that you love!

Introduction

Yesterday, my Danish friend, Flemming Harms, asked my a very interesting question related to when a JDBC batch update fails.

Basically, considering we are going to group several DML statements in a batch, we need a way to tell which statement is the cause of the failure. This post is going to answer this question in more detail.

Simulating a failure

Considering we have a Post entity, whose identifiers are manually assigned:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    private Long id;

    private String title;

    //getters and setters omitted for brevity
}

Now, we are going to use JDBC batching for grouping several INSERT statements. For this purpose, we will use a PreparedStatement since it responds better to JDBC batching than a simple java.sql.Statement.

In the following example, to simulate the failure, we are going to assign the same Primary Key to multiple records so that the database can raise a ConstraintViolationException:

Session session = entityManager.unwrap(Session.class);
session.doWork(connection -> {

    try (PreparedStatement st = connection.prepareStatement(
            "INSERT INTO post (id, title) " +
            "VALUES (?, ?)")) {
        for (long i = 0; i < 5; i++) {
            st.setLong(
                1, 
                i % 2
            );
            st.setString(
                2, 
                String.format(
                    "High-Performance Java Persistence, Part %d", 
                    i
                )
            );
            st.addBatch();
        }
        st.executeBatch();
    } catch (BatchUpdateException e) {
        LOGGER.info(
            "Batch has managed to process {} entries", 
            e.getUpdateCounts().length
        );
    }
});

When running the test case above, Hibernate generates the following output:

c.v.b.h.h.b.BatchExceptionTest - testInsertPosts
n.t.d.l.SLF4JQueryLoggingListener - Name:DATA_SOURCE_PROXY, Time:0, 
Success:False, 
Type:Prepared, 
Batch:True, 
QuerySize:1, 
BatchSize:5, 
Query:[
    "INSERT INTO post (id, title) VALUES (?, ?)"], 
    Params:[
        (0, High-Performance Java Persistence, Part 0), 
        (1, High-Performance Java Persistence, Part 1), 
        (0, High-Performance Java Persistence, Part 2), 
        (1, High-Performance Java Persistence, Part 3), 
        (0, High-Performance Java Persistence, Part 4)
    ]
c.v.b.h.h.b.BatchExceptionTest - Batch has managed to process 2 entries

So, from the logged JDBC output, we can see that the third statement is going to conflict with the first one. However, we can pinpoint the failing statements by introspecting the result of the getUpdateCounts method on the java.sql.BatchUpdateException that is thrown by the JDBC Driver.

Because the getUpdateCounts method returns an int[] array with two entries, we know that only two statements were successfully processed. So, the third statement was the one causing the failure.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

To wrap it up, we need to things:

  1. A good way to log JDBC statements, and datasource-proxy is one of the best ways of logging JDBC PreparedStatement(s)
  2. We also need to catch the BatchUpdateException and find the number of statements that were sucessfully executed by calling the getUpdateCounts method

This way, you will be able to locate which batched statement is causing an issue, and this technique is applicable to any production system. You just have to make sure you collect all logs properly (e.g. Logstash) so that you can query them when a problem occurs.

Code available on GitHub.

Transactions and Concurrency Control eBook

2 Comments on “How to find which statement failed in a JDBC Batch Update

  1. Hey Vlad, how can we tell which rows were processed successfully? Do we know if the two from the articles are at the beginning of the batch?

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.