Seize the deal!
Caching Best Practices
Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!
In-memory databases such as H2, HSQLDB, and Derby are great to speed up integration tests. Although most database queries can be run against these in-memory databases, many enterprise systems make use of complex native queries which can only be tested against an actual production-like relational database.
In this post, I’m going to show you how you can run PostgreSQL and MySQL integration tests almost as fast as any in-memory database.
Hibernate uses H2 by default, and, running all tests for the documentation module (317 tests) takes around 46 seconds:
> gradle clean test :documentation:processTestResources :documentation:testClasses :documentation:test BUILD SUCCESSFUL Total time: 46.148 secs
Now let’s see how much time it takes to run all these tests on my local MySQL 5.7 database engine:
> gradle clean test -Pdb=mysql :documentation:processTestResources :documentation:testClasses :documentation:test BUILD SUCCESSFUL Total time: 30 mins 26.568 secs
MySQL DDL statements are very expensive, and each unit tests creates and destroys a
SessionFactory, which, in turn, creates and destroys a database schema. This allows each test to start with a clean state, therefore providing test isolation.
However, by default, all transactions are ACID, and, to ensure Durability, all changes need to be flushed to disk whenever a transaction is completed. Creating and dropping a database schema after each test requires many I/O intensive operations that take a toll on the overall test execution time.
Luckily, for integration tests, we don’t need any Durability guarantee. We only need speed!
That being said, we can move the database data directory onto a RAM disk. On Linux, you can use tempfs, but, because I have a Windows machine, I’m going to use the ImDisk Virtual Disk Driver utility for this purpose.
If you’re interested in speeding up database integration tests with Docker and tmpfs, check out this article. It works on any Operation System (Linux, OSX, Windows), and, even for Windows, it’s much easier to work than with an ImDisk Virtual Disk Driver.
The ImDisk Virtual Disk Driver allows you to map a fragment of the total RAM memory just like a hard disk drive.
The script that does all the work looks like this:
sc stop %MYSQL_SERVICE% imdisk -D -m R: imdisk -a -s 2G -m R: -P -p "/FS:NTFS /C /Y" mkdir R:\data xcopy "%MySQL_DATA%\data" "R:\data" /S /E "%MySQL_HOME%\bin\mysqld" --defaults-file="%MySQL_DATA%\my_ram.ini"
# Path to the database root datadir=R:/data
When I’m done testing, to stop the daemon and start the previous MySQL service, I can run the following batch script:
"%MySQL_HOME%\bin\mysqladmin" -u mysql -p shutdown imdisk -D -m R: sc start %MYSQL_SERVICE%
Now, running the tests in the Hibernate documentation module takes less than 2 minutes:
> gradle clean test -Pdb=mysql :documentation:processTestResources :documentation:testClasses :documentation:test BUILD SUCCESSFUL Total time: 1 mins 41.022 secs
We can do better than this. As previously explained, we don’t need Durability at all, so I’m going to change some MySQL configurations which are explained in this very well-written Percona article:
log-output=NONE slow-query-log=0 innodb_flush_log_at_trx_commit=2 innodb_log_buffer_size=3M innodb_buffer_pool_size=180M
Rerunning our tests and we get:
Total time: 1 mins 30.628 secs
This is a 20 time improvement over the default MySQL database engine configuration.
Of course, this is not limited to MySQL. In fact, we can apply the same logic to any relational database that supports custom data directory configuration.
Running the documentation tests on PostgreSQL, takes over 3 minutes using the default settings:
> gradle clean test -Pdb=pgsql :documentation:processTestResources :documentation:testClasses :documentation:test BUILD SUCCESSFUL Total time: 3 mins 23.471 secs
To start a new PostgreSQL daemon running on an in-memory drive, we need to use the following batch script:
sc stop %PGSQL_SERVICE% imdisk -D -m R: imdisk -a -s 2G -m R: -P -p "/FS:NTFS /C /Y" mkdir R:\data xcopy "%PGSQL_DATA%" "R:\data" /S /E "%PGSQL_HOME%\bin\pg_ctl" start -D R:\data
When we are done testing, we can stop the PostgreSQL daemon and start the default service as follows:
"%PGSQL_HOME%\bin\pg_ctl" stop -D R:\data imdisk -D -m R: sc start %PGSQL_SERVICE%
Rerunning the documentation tests, we get the following results:
> gradle clean test -Pdb=pgsql :documentation:processTestResources :documentation:testClasses :documentation:test BUILD SUCCESSFUL Total time: 1 mins 45.431 secs
Just like with MySQL, we can improve the PostgreSQL settings as explained in this post. For this, we need to change the postgresql.conf file as follows:
fsync = off synchronous_commit = off full_page_writes = off
We also need to change the startup script so that we also copy the new postgresql.conf in the in-memory data folder:
sc stop %PGSQL_SERVICE% imdisk -D -m R: imdisk -a -s 2G -m R: -P -p "/FS:NTFS /C /Y" mkdir R:\data xcopy "%PGSQL_DATA%" "R:\data" /S /E xcopy postgresql.conf "R:\data" /Y "%PGSQL_HOME%\bin\pg_ctl" start -D R:\data
This time, we get the following results:
Total time: 1 mins 37.935 secs
This solution is not limited to Windows OS only. You can achieve the same goal using Docker and
tmpfson any operating system. For more details, check out this article.
The documentation module is tiny compared to hibernate-core which currently has 4352 unit tests. With these optimizations in place, running the hibernate-core tests takes:
|5 mins 34.711 secs||7 mins 55.082 secs||8 mins 34.275 secs|
If you enjoyed this article, I bet you are going to love my upcoming Online Workshops!
- Caching Best Practices with JPA and Hibernate (2.5 hours) on the 30th of September
- High-Performance SQL (4 hours) on the 6th of October in collaboration with Voxxed Days Ticino
- High-Performance SQL (12 hours) starting on the 28th of October in collaboration with Bouvet
Although not as fast as H2, by using a RAM drive, both MySQL and PostgreSQL integration tests run reasonably fast. Happy testing!
Code available on GitHub.