SQL Server useBulkCopyForBatchInsert configuration property
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In this article, we are going to see how the SQL Server useBulkCopyForBatchInsert configuration property works when using JDBC, JPA, or Hibernate.
SQL Server PreparedStatement batching
When using JPA and Hibernate, the generated SQL statements are going to be executed using the JDBC PreparedStatement
because prepared statements increase the likelihood of statement caching, and you to avoid SQL injection attacks.
By default, when persisting several Post
entities:
for (long i = 1; i <= 10; i++) { postRepository.persist( new Post() .setId(i) .setTitle(String.format("Post no. %d", i)) ); }
Hibernate generates the following SQL INSERT statements:
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 1, 1)] Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 2, 2)] Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 3, 3)] Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 4, 4)] Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 5, 5)] Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 6, 6)] Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 7, 7)] Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 8, 8)] Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 9, 9)] Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 10, 10)]
By default, Hibernate uses the executeUpdate
method of the JDBC PreparedStatement
, so there’s no batching involved.
To enable batching, we need to provide the following Hibernate configuration properties:
spring.jpa.properties.hibernate.jdbc.batch_size=10 spring.jpa.properties.hibernate.order_inserts=true spring.jpa.properties.hibernate.order_updates=true
When rerunning the previous example that was persisting 10 Post
entities, Hibernate is going to execute a single JDBC INSERT:
Type:Prepared, Batch:True, QuerySize:1, BatchSize:10, Query:[" insert into post (title, id) values (?, ?) "], Params:[ (Post no. 1, 1), (Post no. 2, 2), (Post no. 3, 3), (Post no. 4, 4), (Post no. 5, 5), (Post no. 6, 6), (Post no. 7, 7), (Post no. 8, 8), (Post no. 9, 9), (Post no. 10, 10) ]
If you are using the
IDENTITY
entity identifier strategy, Hibernate won’t be able to batch the insert statement automatically.Check out this article.
So, using the default SQL Server JDBC Driver settings, a single statement was sent to the database server.
However, when we inspect the SQL Server query log:
SELECT deqs.execution_count [Execution Time], dest.TEXT AS [Statement] FROM sys.dm_exec_query_stats AS deqs CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest ORDER BY deqs.last_execution_time
We can see that our INSERT is executed 10 times as if it were run in a for-loop:
| Execution Count | Statement | |-----------------|------------------------------------------------------------------------------| | 2 | select next value for post_SEQ | | 10 | (@P0 varchar(8000),@P1 bigint)insert into post (title, id) values (@P0, @P1) |
Even if JDBC batching helped us reduce 9 network roundtrips as we called the database server only once and not 10 times, the statement is still executed 10 times on the database side.
SQL Server useBulkCopyForBatchInsert JDBC Driver setting
According to the SQL Server JDBC Driver documentation, we can use the useBulkCopyForBatchInsert
to transform a batch of INSERT statements into a single multi-value INSERT.
We can enable this setting on the SQLServerDataSource
, like this:
SQLServerDataSource dataSource = getSQLServerDataSource(); dataSource.setRewriteBatchedStatements(true);
When we rerun the previous test case that inserts 10 Post
entities, we can see that now the SQL Server JDBC Driver uses a SQLServerBulkCopy
operation instead of executing a PreparedStatement
:
The SQLServerBulkCopy
is not logged as prepared statements in the dm_exec_query_stats
Dynamic Management View, so you will no longer see the operation logged there.
I'm running an online workshop on the 11th of October about High-Performance SQL.If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
Depending on the INSERT statements being executed, you may see a performance improvement when enabling the useBulkCopyForBatchInsert
setting.
However, prior to enabling it, take a look at the associated GitHub issues to make sure that there is no open issue that might affect you.
Since this feature is rather new, it’s important to have a solid suite of integration tests that can validate both the effectiveness and the efficiency of this setting. Therefore, the test suite should not fail after enabling this setting, and the performance metrics should confirm the expected performance improvement.
