How to run database integration tests 20 times faster

Introduction

In-memory databases such as H2, HSQLDB, and Derby are great to speed up integration tests. Although most database queries can be run against these in-memory databases, many enterprise systems make use of complex native queries which can only be tested against an actual production-like relational database.

In this post, I’m going to show you how you can run PostgreSQL and MySQL integration tests almost as fast as any in-memory database.

Hibernate tests

Hibernate uses H2 by default, and running all tests for the documentation module (317 tests) takes around 46 seconds:

> gradle clean test

:documentation:processTestResources
:documentation:testClasses
:documentation:test

BUILD SUCCESSFUL

Total time: 46.148 secs

MySQL

Now let’s see how much time it takes to run all these tests on my local MySQL 5.7 database engine:

> gradle clean test -Pdb=mysql

:documentation:processTestResources
:documentation:testClasses
:documentation:test

BUILD SUCCESSFUL

Total time: 30 mins 26.568 secs

MySQL DDL statements are very expensive, and each unit test creates and destroys a SessionFactory, which, in turn, creates and destroys a database schema. This allows each test to start with a clean state, therefore providing test isolation.

However, by default, all transactions are ACID, and, to ensure Durability, all changes need to be flushed to disk whenever a transaction is completed. Creating and dropping a database schema after each test requires many I/O intensive operations that take a toll on the overall test execution time.

Luckily, for integration tests, we don’t need any Durability guarantee. We only need speed!

That being said, we can move the database data directory onto a RAM disk. On Linux, you can use tmpfs, but, because I have a Windows machine, I’m going to use the ImDisk Virtual Disk Driver utility for this purpose.

If you’re interested in speeding up database integration tests with Docker and tmpfs, check out this article. It works on any Operating System (Linux, OS X, Windows), and, even on Windows, it’s much easier to work with than the ImDisk Virtual Disk Driver.
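For reference, here is a sketch of how the same trick looks on Linux with tmpfs. The service name, paths, and mysqld invocation below are assumptions, not taken from this article, and the privileged steps are shown only as comments:

```shell
# Linux sketch of the RAM-drive trick, assuming a systemd-managed MySQL.
# /dev/shm is a tmpfs mount available on most distributions, so no extra
# RAM drive needs to be created. The privileged steps would be:
#
#   sudo systemctl stop mysql
#   sudo cp -a /var/lib/mysql/. /dev/shm/mysql-data
#   sudo mysqld --datadir=/dev/shm/mysql-data &
#
# The runnable part below only demonstrates that /dev/shm behaves like a
# regular writable directory backed by RAM:
DATADIR=/dev/shm/mysql-data
mkdir -p "$DATADIR"
echo "datadir=$DATADIR" > "$DATADIR/my_ram.cnf"
cat "$DATADIR/my_ram.cnf"
```

Because tmpfs is already RAM-backed, there is no need for a separate mount/format step as on Windows.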

The ImDisk Virtual Disk Driver allows you to map a fragment of the total RAM memory just like a hard disk drive.

The script that does all the work looks like this:

sc stop %MYSQL_SERVICE%

imdisk -D -m R:
imdisk -a -s 2G -m R: -P  -p "/FS:NTFS /C /Y"

mkdir R:\data
xcopy "%MySQL_DATA%\data" "R:\data" /S /E

"%MySQL_HOME%\bin\mysqld" --defaults-file="%MySQL_DATA%\my_ram.ini"

  1. First, I stop the default MySQL service.
  2. Then, I map a 2 GB RAM drive (e.g. R:\) and format it as NTFS.
  3. Afterward, I copy the MySQL data folder onto the new in-memory drive.
  4. Last, I start a new MySQL daemon using a configuration file where the data directory is configured as follows:

# Path to the database root
datadir=R:/data

When I’m done testing, to stop the daemon and start the previous MySQL service, I can run the following batch script:

"%MySQL_HOME%\bin\mysqladmin" -u mysql -p shutdown

imdisk -D -m R:

sc start %MYSQL_SERVICE%

Now, running the tests in the Hibernate documentation module takes less than 2 minutes:

> gradle clean test -Pdb=mysql

:documentation:processTestResources
:documentation:testClasses
:documentation:test

BUILD SUCCESSFUL

Total time: 1 mins 41.022 secs

We can do better than this. As previously explained, we don’t need Durability at all, so I’m going to change some MySQL settings, which are explained in this very well-written Percona article:

log-output=NONE
slow-query-log=0
innodb_flush_log_at_trx_commit=2
innodb_log_buffer_size=3M
innodb_buffer_pool_size=180M

Rerunning our tests, we get:

Total time: 1 mins 30.628 secs

This is a 20-fold improvement over the default MySQL database engine configuration.
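As a quick sanity check of that figure, we can divide the two timings reported above (no new numbers are assumed here):

```shell
# Compare the default MySQL run (30 mins 26.568 secs) with the tuned
# RAM-drive run (1 min 30.628 secs), both converted to seconds.
awk 'BEGIN {
    default_run = 30 * 60 + 26.568
    ram_run     =  1 * 60 + 30.628
    printf "%.1fx faster\n", default_run / ram_run
}'
# prints: 20.2x faster
```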

PostgreSQL

Of course, this is not limited to MySQL. In fact, we can apply the same logic to any relational database that supports a custom data directory configuration.

Running the documentation tests on PostgreSQL takes over 3 minutes using the default settings:

> gradle clean test -Pdb=pgsql

:documentation:processTestResources
:documentation:testClasses
:documentation:test

BUILD SUCCESSFUL

Total time: 3 mins 23.471 secs

To start a new PostgreSQL daemon running on an in-memory drive, we need to use the following batch script:

sc stop %PGSQL_SERVICE%

imdisk -D -m R:
imdisk -a -s 2G -m R: -P  -p "/FS:NTFS /C /Y"

mkdir R:\data
xcopy "%PGSQL_DATA%" "R:\data" /S /E

"%PGSQL_HOME%\bin\pg_ctl" start -D R:\data

When we are done testing, we can stop the PostgreSQL daemon and start the default service as follows:

"%PGSQL_HOME%\bin\pg_ctl" stop -D R:\data

imdisk -D -m R:

sc start %PGSQL_SERVICE%

Rerunning the documentation tests, we get the following results:

> gradle clean test -Pdb=pgsql


:documentation:processTestResources
:documentation:testClasses
:documentation:test

BUILD SUCCESSFUL

Total time: 1 mins 45.431 secs

Just like with MySQL, we can improve the PostgreSQL settings as explained in this post. For this, we need to change the postgresql.conf file as follows:

fsync = off
synchronous_commit = off
full_page_writes = off

We also need to change the startup script so that it also copies the new postgresql.conf into the in-memory data folder:

sc stop %PGSQL_SERVICE%

imdisk -D -m R:
imdisk -a -s 2G -m R: -P  -p "/FS:NTFS /C /Y"

mkdir R:\data
xcopy "%PGSQL_DATA%" "R:\data" /S /E
xcopy postgresql.conf "R:\data" /Y 

"%PGSQL_HOME%\bin\pg_ctl" start -D R:\data

This time, we get the following results:

Total time: 1 mins 37.935 secs

The documentation module is tiny compared to hibernate-core, which currently has 4352 unit tests. With these optimizations in place, running the hibernate-core tests takes:

H2:         5 mins 34.711 secs
MySQL:      7 mins 55.082 secs
PostgreSQL: 8 mins 34.275 secs
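To put those numbers in perspective, here is the slowdown of each RAM-drive database relative to in-memory H2, computed directly from the timings above:

```shell
# Relative slowdown of each RAM-drive database vs. in-memory H2,
# using the hibernate-core timings converted to seconds.
awk 'BEGIN {
    h2    = 5 * 60 + 34.711
    mysql = 7 * 60 + 55.082
    pgsql = 8 * 60 + 34.275
    printf "MySQL:      %.2fx slower than H2\n", mysql / h2
    printf "PostgreSQL: %.2fx slower than H2\n", pgsql / h2
}'
```

So, with the RAM drive and the relaxed durability settings, the real databases stay within roughly 1.5x of H2.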

If you enjoyed this article, I bet you are going to love my book as well.

Conclusion

Although not as fast as with H2, both MySQL and PostgreSQL integration tests run reasonably fast when using a RAM drive. Happy testing!

Code available on GitHub.


10 thoughts on “How to run database integration tests 20 times faster”

  1. The caveat is, of course, a “works on my machine” situation in terms of performance. Production performance issues are less easy to predict if a data subset is deployed for testing on a test database.

    1. This is about integration tests, not performance tests or smoke tests. The integration tests guarantee that all functionalities are still working with every change we try to make to our product. In this case, speed matters, because otherwise developers will just skip running tests due to the “it’s not like my change is going to break something” syndrome.

  2. Before I got an SSD I used to do similar things all the time, like put my Chrome cache on a ramdisk:

    @echo off
    REM format z: /y
    imdisk -a -s 200M -m z: -p "/fs:ntfs /q /y"
    mkdir z:\cache
    rmdir /s /q "C:\Users\User\AppData\Local\Google\Chrome\User Data\Default\Cache"
    junction "C:\Users\User\AppData\Local\Google\Chrome\User Data\Default\Cache" z:\Cache
    
  3. Hi Vlad, nice article!

    About “creating database schema on each unit test”, why don’t you create it only for the first time and use a tool like DBUnit to manage database state among tests? I ask because it’s not likely the database will change during tests.

    1. Thanks. Well, for Hibernate tests, we need a clean state for each test.

      For an enterprise application, I’d create the DB using FlywayDB from the migration scripts and probably use DBUnit as well.

    1. I’ve never tried it with Oracle because I was happy with its speed for the Hibernate tests. The major problem was MySQL and MariaDB, and that was the main reason for applying this trick. If I test this in-memory mapping with Oracle in the future, I’ll let you know.

  4. Thanks for your wonderful article. One of our enterprise applications uses an Oracle database. We have around 300 tables with a bunch of other DB objects (views, sequences, functions, stored procedures). We are still on Hibernate 3.3 (with JPA 1.0). Is it possible to set up a clean database and do integration tests? Our goal is to test the data access layer (DAO). Any suggestion from you is highly appreciated.

    1. Of course you can set up a test database, and that’s what you should have in your Continuous Integration pipeline. Hibernate 3.3 is very old, so you might want to migrate to 5.2 which offers many performance improvements, as well as new features. Related to managing the DB schema, use FlywayDB, not HBM2DDL.
