Integration testing done right with Embedded MongoDB

Introduction

Unit testing requires isolating individual components from their dependencies. Dependencies are replaced with mocks, which simulate certain use cases. This way, we can validate the behavior of the component under test across various external context scenarios.

Web components can be unit tested using mock business logic services. Services can be tested against mock data access repositories. But the data access layer is not a good candidate for unit testing because database statements need to be validated against an actual running database system.
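To make the mocking idea concrete, here is a minimal sketch of a service tested against a hand-written mock repository. The names UserRepository, GreetingService and InMemoryUserRepository are hypothetical and only serve the illustration; a mocking library would work just as well.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical data access contract; the real one would talk to a database.
interface UserRepository {
    Optional<String> findNameById(long id);
}

// Hypothetical business service that depends only on the repository contract.
final class GreetingService {
    private final UserRepository repository;

    GreetingService(UserRepository repository) {
        this.repository = repository;
    }

    String greet(long id) {
        return repository.findNameById(id)
                .map(name -> "Hello, " + name + "!")
                .orElse("Hello, stranger!");
    }
}

// Hand-written mock: a map stands in for the database.
final class InMemoryUserRepository implements UserRepository {
    private final Map<Long, String> names = new HashMap<>();

    // Registers a canned result and returns this mock for chaining.
    InMemoryUserRepository with(long id, String name) {
        names.put(id, name);
        return this;
    }

    @Override
    public Optional<String> findNameById(long id) {
        return Optional.ofNullable(names.get(id));
    }
}
```

The service logic is validated without any database, which is exactly why the same trick says nothing about whether the real repository's statements are valid.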

Integration testing database options

Ideally, our tests should run against a production-like database. But sharing a single dedicated database server is not feasible, as we most likely have more than one developer running such integration test suites. To isolate concurrent test runs, each developer would require a dedicated database catalog. Adding a continuous integration tool makes matters worse, since even more tests would have to run in parallel.

Lesson 1: We need a forked test-suite bound database

When a test suite runs, a database must be started and made available only to that particular test-suite instance. We have the following options:

  • An in-memory embedded database
  • A temporary spawned database process
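
Whichever option we choose, concurrent test runs on the same machine must not compete for the same TCP port. One common way to get there is to let the operating system pick an unused port. The sketch below is an assumption about how that could be done in plain Java; FreePort is a hypothetical helper, and how the value reaches the build is left open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of any plugin discussed in this article.
final class FreePort {

    // Binding to port 0 asks the OS for any unused port on the loopback interface.
    static int find() {
        try (ServerSocket socket = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException("No free port available", e);
        }
    }
}
```

The port is released as soon as the method returns, so another process could grab it before the database starts. For test suites this small race is usually acceptable.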

The fallacy of in-memory database testing

Java offers multiple in-memory relational database options to choose from.

Embedding an in-memory database is fast and each JVM can run its own isolated database. But we no longer test against the actual production-like database engine because our integration tests will validate the application behavior for a non-production database system.

Using an ORM tool may give the false impression that all databases are equal, especially when all generated SQL code is SQL-92 compliant.

What’s good for the ORM tool's database support may deprive you of database-specific querying features (window functions, common table expressions, PIVOT).

So the integration testing in-memory database might not support such advanced queries. This can lead to reduced code coverage or to pushing developers to only use the common-yet-limited SQL querying features.

Even if your production database engine provides an in-memory variant, there may still be operational differences between the actual and the lightweight database versions.

Lesson 2: In-memory databases may give you the false impression that your code will also run on a production database

Spawning a production-like temporary database

Testing against the actual production database is much more valuable and that’s why I grew to appreciate this alternative.

When using MongoDB we can choose the embedded mongo plugin. This open-source project creates an external database process that can be bound to the current test-suite life-cycle.

If you’re using Maven, you can take advantage of the embedmongo-maven-plugin:

<plugin>
	<groupId>com.github.joelittlejohn.embedmongo</groupId>
	<artifactId>embedmongo-maven-plugin</artifactId>
	<version>${embedmongo.plugin.version}</version>
	<executions>
		<execution>
			<id>start</id>
			<goals>
				<goal>start</goal>
			</goals>
			<configuration>
				<port>${embedmongo.port}</port>
				<version>${mongo.test.version}</version>
				<databaseDirectory>${project.build.directory}/mongotest</databaseDirectory>
				<bindIp>127.0.0.1</bindIp>
			</configuration>
		</execution>
		<execution>
			<id>stop</id>
			<goals>
				<goal>stop</goal>
			</goals>
		</execution>
	</executions>
</plugin>
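
On the test side, the code needs to know which port the spawned process listens on. The following sketch assumes the build forwards the embedmongo.port value to the test JVM as a system property of the same name (for instance through the test runner's system property configuration). MongoTestSettings is a hypothetical helper and the fallback to 27017, MongoDB's default port, is my own choice.

```java
// Hypothetical helper; assumes the build passes embedmongo.port to the test JVM.
final class MongoTestSettings {

    // Assumed property name, mirroring ${embedmongo.port} from the POM.
    static final String PORT_PROPERTY = "embedmongo.port";

    // MongoDB's default port, used when the property is missing or not numeric.
    static final int DEFAULT_PORT = 27017;

    static int port() {
        return Integer.getInteger(PORT_PROPERTY, DEFAULT_PORT);
    }

    // Builds a standard MongoDB connection string for the loopback address,
    // matching the bindIp configured for the plugin.
    static String connectionUri(String database) {
        return "mongodb://127.0.0.1:" + port() + "/" + database;
    }
}
```

The resulting string can be handed to whichever MongoDB client the tests use.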

When running the plugin, the following actions are taken:

  1. A MongoDB pack is downloaded.

    [INFO] --- embedmongo-maven-plugin:0.1.12:start (start) @ mongodb-facts ---
    Download Version{2.6.1}:Windows:B64 START
    Download Version{2.6.1}:Windows:B64 DownloadSize: 135999092
    Download Version{2.6.1}:Windows:B64 0% 1% 2% 3% 4% 5% 6% 7% 8% 9% 10% 11% 12% 13% 14% 15% 16% 17% 18% 19% 20% 21% 22% 23% 24% 25% 26% 27% 28% 29% 30% 31% 32% 33% 34% 35% 36% 37% 38% 39% 40% 41% 42% 43% 44% 45% 46% 47% 48% 49% 50% 51% 52% 53% 54% 55% 56% 57% 58% 59% 60% 61% 62% 63% 64% 65% 66% 67% 68% 69% 70% 71% 72% 73% 74% 75% 76% 77% 78% 79% 80% 81% 82% 83% 84% 85% 86% 87% 88% 89% 90% 91% 92% 93% 94% 95% 96% 97% 98% 99% 100% Download Version{2.6.1}:Windows:B64 downloaded with 3320kb/s
    Download Version{2.6.1}:Windows:B64 DONE
    
  2. Upon starting a new test suite, the MongoDB pack is unzipped under a unique location in the OS temp folder.

    Extract C:\Users\vlad\.embedmongo\win32\mongodb-win32-x86_64-2008plus-2.6.1.zip START
    Extract C:\Users\vlad\.embedmongo\win32\mongodb-win32-x86_64-2008plus-2.6.1.zip DONE
    
  3. The embedded MongoDB instance is started.

    [mongod output]note: noprealloc may hurt performance in many applications
    [mongod output] 2014-10-09T23:25:16.889+0300 [DataFileSync] warning: --syncdelay 0 is not recommended and can have strange performance
    [mongod output] 2014-10-09T23:25:16.891+0300 [initandlisten] MongoDB starting : pid=2384 port=51567 dbpath=D:\wrk\vladmihalcea\vladmihalcea.wordpress.com\mongodb-facts\target\mongotest 64-bit host=VLAD
    [mongod output] 2014-10-09T23:25:16.891+0300 [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
    [mongod output] 2014-10-09T23:25:16.891+0300 [initandlisten] db version v2.6.1
    [mongod output] 2014-10-09T23:25:16.891+0300 [initandlisten] git version: 4b95b086d2374bdcfcdf2249272fb552c9c726e8
    [mongod output] 2014-10-09T23:25:16.891+0300 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
    [mongod output] 2014-10-09T23:25:16.891+0300 [initandlisten] allocator: system
    [mongod output] 2014-10-09T23:25:16.891+0300 [initandlisten] options: { net: { bindIp: "127.0.0.1", http: { enabled: false }, port: 51567 }, security: { authorization: "disabled" }, storage: { dbPath: "D:\wrk\vladmihalcea\vladmihalcea.wordpress.com\mongodb-facts\target\mongotest", journal: { enabled: false }, preallocDataFiles: false, smallFiles: true, syncPeriodSecs: 0.0 } }
    [mongod output] 2014-10-09T23:25:17.179+0300 [FileAllocator] allocating new datafile D:\wrk\vladmihalcea\vladmihalcea.wordpress.com\mongodb-facts\target\mongotest\local.ns, filling with zeroes...
    [mongod output] 2014-10-09T23:25:17.179+0300 [FileAllocator] creating directory D:\wrk\vladmihalcea\vladmihalcea.wordpress.com\mongodb-facts\target\mongotest\_tmp
    [mongod output] 2014-10-09T23:25:17.240+0300 [FileAllocator] done allocating datafile D:\wrk\vladmihalcea\vladmihalcea.wordpress.com\mongodb-facts\target\mongotest\local.ns, size: 16MB,  took 0.059 secs
    [mongod output] 2014-10-09T23:25:17.240+0300 [FileAllocator] allocating new datafile D:\wrk\vladmihalcea\vladmihalcea.wordpress.com\mongodb-facts\target\mongotest\local.0, filling with zeroes...
    [mongod output] 2014-10-09T23:25:17.262+0300 [FileAllocator] done allocating datafile D:\wrk\vladmihalcea\vladmihalcea.wordpress.com\mongodb-facts\target\mongotest\local.0, size: 16MB,  took 0.021 secs
    [mongod output] 2014-10-09T23:25:17.262+0300 [initandlisten] build index on: local.startup_log properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "local.startup_log" }
    [mongod output] 2014-10-09T23:25:17.262+0300 [initandlisten]     added index to empty collection
    [mongod output] 2014-10-09T23:25:17.263+0300 [initandlisten] waiting for connections on port 51567
    [mongod output] Oct 09, 2014 11:25:17 PM MongodExecutable start
    INFO: de.flapdoodle.embed.mongo.config.MongodConfigBuilder$ImmutableMongodConfig@26b3719c
    
  4. For the lifetime of the current test suite, you can see the embedded-mongo process:

    C:\Users\vlad>netstat -ano | findstr 51567
      TCP    127.0.0.1:51567        0.0.0.0:0              LISTENING       8500
      
    C:\Users\vlad>TASKLIST /FI "PID eq 8500"
    
    Image Name                     PID Session Name        Session#    Mem Usage
    ========================= ======== ================ =========== ============
    extract-0eecee01-117b-4d2     8500 RDP-Tcp#0                  1     44,532 K  
    


  5. When the test suite is finished, the embedded-mongo process is stopped.

    [INFO] --- embedmongo-maven-plugin:0.1.12:stop (stop) @ mongodb-facts ---
    2014-10-09T23:25:21.187+0300 [initandlisten] connection accepted from 127.0.0.1:64117 #11 (1 connection now open)
    [mongod output] 2014-10-09T23:25:21.189+0300 [conn11] terminating, shutdown command received
    [mongod output] 2014-10-09T23:25:21.189+0300 [conn11] dbexit: shutdown called
    [mongod output] 2014-10-09T23:25:21.189+0300 [conn11] shutdown: going to close listening sockets...
    [mongod output] 2014-10-09T23:25:21.189+0300 [conn11] closing listening socket: 520
    [mongod output] 2014-10-09T23:25:21.189+0300 [conn11] shutdown: going to flush diaglog...
    [mongod output] 2014-10-09T23:25:21.189+0300 [conn11] shutdown: going to close sockets...
    [mongod output] 2014-10-09T23:25:21.190+0300 [conn11] shutdown: waiting for fs preallocator...
    [mongod output] 2014-10-09T23:25:21.190+0300 [conn11] shutdown: closing all files...
    [mongod output] 2014-10-09T23:25:21.191+0300 [conn11] closeAllFiles() finished
    [mongod output] 2014-10-09T23:25:21.191+0300 [conn11] shutdown: removing fs lock...
    [mongod output] 2014-10-09T23:25:21.191+0300 [conn11] dbexit: really exiting now
    [mongod output] Oct 09, 2014 11:25:21 PM de.flapdoodle.embed.process.runtime.ProcessControl stopOrDestroyProcess
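
The start and stop steps above can also be checked from test code without parsing the mongod log: the process is up once the port accepts connections, and gone once it refuses them. This is a minimal sketch under that assumption; PortProbe is a hypothetical helper and not part of the plugin.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of the embedmongo plugin.
final class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing is accepting connections.
            return false;
        }
    }

    // Polls until the port accepts connections or the attempts run out.
    static boolean awaitListening(String host, int port, int attempts, long pauseMillis) {
        for (int i = 0; i < attempts; i++) {
            if (isListening(host, port, 500)) {
                return true;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up early.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

A successful probe only proves that something is listening on the port, not that it is the expected MongoDB version.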
    

Conclusion

The embed-mongo plugin is in no way slower than an in-memory relational database system. It makes me wonder why there isn’t such an option for open-source RDBMSs (e.g. PostgreSQL). This is a great open-source project idea, and maybe Flapdoodle OSS will offer support for relational databases too.

Code available on GitHub.


6 thoughts on “Integration testing done right with Embedded MongoDB”

  1. I’m personally not 100% sure if it is a good idea to have complete in-memory-ness as an option when running the whole server. I personally find this more useful as a per-storage-unit option – e.g. when saying that a certain table is now “in-memory” (or even a certain row). In the end, persistence behaves quite differently when you have a spinning disk with lots of I/O wait times, compared to memory with almost no wait times.

    But at least, in your example, the two storage modes are offered by the same vendor who is very likely to provide certain guarantees as far as interoperability is concerned. So, turning on the in-memory flag on MongoDB for development is much better than replacing Oracle with H2, for instance.

    1. Actually MongoDB didn’t write this tool. This is a GitHub project that simply starts an external mongo process and connects to it through classic networking. It’s not an in-memory database, it works like any other Mongo database. It even uses a disk based data folder, which usually resides in the target directory.

      I’d like to see such a solution for open-source RDBMS too.

  2. Hi!
    I used this approach until mongo 2.6 was released.
    Strangely enough, with mongo 2.6 it became drastically slower (especially batch writes), making it impossible to integration-test with embedded mongo.
    I tried different combinations with different mongo java drivers, but to no avail, and was forced to set up a development ‘test server’ for that purpose.
    The reason is still unknown, just the facts: it works like a charm with 2.4, and performance is awful with 2.6.
    If anyone who faced the same problem has found a solution, please share.
