The best way to detect database connection leaks

(Last Updated On: January 29, 2018)

Introduction

Database connections are not free, and that’s the reason for using a connection pooling solution in the first place. However, the connection pool alone does not solve every issue associated to managing database connections. The application developer must make sure that every Connection is closed when no longer needed. Behind the scenes, the connection pool gives a logical transaction which, when being closed, it returns back to the pool so that it can be further reused by other concurrent transactions.

A connection leak happens when a connection is acquired without ever being closed.

When should connection leaks be detected?

Every relational database offers a way to inspect the underlying connection status, so one can easily open a new SQL terminal and check if there are any connections dangling. However, this minimalistic approach is wrong because it implies that we deployed a broken version of our application into production.

Connection leaks should be detected during testing, therefore preventing connection leaks from occurring in a production environment.

This post is going to demonstrate how you can automate the connection pool detection using mere unit testing. This approach allows us to detect connection leaks both in our actual code base as well as in our test routines. If unit tests are leaking connections, then the Continous Integration process will get broken when the maximum database connections threshold is being reached.

Connection leak detector

To inspect if a given test class is leaking connections, we are going to check the number of dangling connections before and after a given class is being used by the JUnit Test Runner:

@BeforeClass
public static void initConnectionLeakUtility() {
    if ( enableConnectionLeakDetection ) {
        connectionLeakUtil = new ConnectionLeakUtil();
    }
}

@AfterClass
public static void assertNoLeaks() {
    if ( enableConnectionLeakDetection ) {
        connectionLeakUtil.assertNoLeaks();
    }
}

The ConnectionLeakUtil utility looks like this:

public class ConnectionLeakUtil {

    private JdbcProperties jdbcProperties = JdbcProperties.INSTANCE;

    private List idleConnectionCounters = 
        Arrays.asList(
            H2IdleConnectionCounter.INSTANCE,
            OracleIdleConnectionCounter.INSTANCE,
            PostgreSQLIdleConnectionCounter.INSTANCE,
            MySQLIdleConnectionCounter.INSTANCE
    );

    private IdleConnectionCounter connectionCounter;

    private int connectionLeakCount;

    public ConnectionLeakUtil() {
        for ( IdleConnectionCounter connectionCounter : 
            idleConnectionCounters ) {
            if ( connectionCounter.appliesTo( 
                Dialect.getDialect().getClass() ) ) {
                this.connectionCounter = connectionCounter;
                break;
            }
        }
        if ( connectionCounter != null ) {
            connectionLeakCount = countConnectionLeaks();
        }
    }

    public void assertNoLeaks() {
        if ( connectionCounter != null ) {
            int currentConnectionLeakCount = countConnectionLeaks();
            int diff = currentConnectionLeakCount - connectionLeakCount;
            if ( diff > 0 ) {
                throw new ConnectionLeakException( 
                    String.format(
                        "%d connection(s) have been leaked! Previous leak count: %d, Current leak count: %d",
                        diff,
                        connectionLeakCount,
                        currentConnectionLeakCount
                    ) 
                );
            }
        }
    }

    private int countConnectionLeaks() {
        try ( Connection connection = newConnection() ) {
            return connectionCounter.count( connection );
        }
        catch ( SQLException e ) {
            throw new IllegalStateException( e );
        }
    }

    private Connection newConnection() {
        try {
            return DriverManager.getConnection(
                jdbcProperties.getUrl(),
                jdbcProperties.getUser(),
                jdbcProperties.getPassword()
            );
        }
        catch ( SQLException e ) {
            throw new IllegalStateException( e );
        }
    }
}

The IdleConnectionCounter interface defines the contract for counting the number of inactive connections using a database-specific implementation.

public interface IdleConnectionCounter {

    /**
     * Specifies which Dialect the counter applies to.
     *
     * @param dialect dialect
     *
     * @return applicability.
     */
    boolean appliesTo(Class<? extends Dialect> dialect);

    /**
     * Count the number of idle connections.
     *
     * @param connection current JDBC connection to be used for querying the number of idle connections.
     *
     * @return idle connection count.
     */
    int count(Connection connection);
}

For each supported Hibernate Dialect that we use during testing,
there needs to be an IdleConnectionCounter implementation so that we can inspect the number of leaked connections.

H2

public class H2IdleConnectionCounter implements IdleConnectionCounter {

    public static final IdleConnectionCounter INSTANCE = 
        new H2IdleConnectionCounter();

    @Override
    public boolean appliesTo(Class<? extends Dialect> dialect) {
        return H2Dialect.class.isAssignableFrom( dialect );
    }

    @Override
    public int count(Connection connection) {
        try ( Statement statement = connection.createStatement() ) {
            try ( ResultSet resultSet = statement.executeQuery(
                    "SELECT COUNT(*) " +
                    "FROM information_schema.sessions " +
                    "WHERE statement IS NULL" ) ) {
                while ( resultSet.next() ) {
                    return resultSet.getInt( 1 );
                }
                return 0;
            }
        }
        catch ( SQLException e ) {
            throw new IllegalStateException( e );
        }
    }
}

Oracle

public class OracleIdleConnectionCounter implements IdleConnectionCounter {

    public static final IdleConnectionCounter INSTANCE = 
        new OracleIdleConnectionCounter();

    @Override
    public boolean appliesTo(Class<? extends Dialect> dialect) {
        return Oracle10gDialect.class.isAssignableFrom( dialect );
    }

    @Override
    public int count(Connection connection) {
        try ( Statement statement = connection.createStatement() ) {
            try ( ResultSet resultSet = statement.executeQuery(
                    "SELECT COUNT(*) " +
                    "FROM v$session " +
                    "WHERE status = 'INACTIVE'" ) ) {
                while ( resultSet.next() ) {
                    return resultSet.getInt( 1 );
                }
                return 0;
            }
        }
        catch ( SQLException e ) {
            throw new IllegalStateException( e );
        }
    }
}

PostgreSQL

public class PostgreSQLIdleConnectionCounter implements IdleConnectionCounter {

    public static final IdleConnectionCounter INSTANCE = 
        new PostgreSQLIdleConnectionCounter();

    @Override
    public boolean appliesTo(Class<? extends Dialect> dialect) {
        return PostgreSQL91Dialect.class.isAssignableFrom( dialect );
    }

    @Override
    public int count(Connection connection) {
        try ( Statement statement = connection.createStatement() ) {
            try ( ResultSet resultSet = statement.executeQuery(
                    "SELECT COUNT(*) " +
                    "FROM pg_stat_activity " +
                    "WHERE state ILIKE '%idle%'" ) ) {
                while ( resultSet.next() ) {
                    return resultSet.getInt( 1 );
                }
                return 0;
            }
        }
        catch ( SQLException e ) {
            throw new IllegalStateException( e );
        }
    }
}

MySQL

public class MySQLIdleConnectionCounter implements IdleConnectionCounter {

    public static final IdleConnectionCounter INSTANCE = 
        new MySQLIdleConnectionCounter();

    @Override
    public boolean appliesTo(Class<? extends Dialect> dialect) {
        return MySQL5Dialect.class.isAssignableFrom( dialect );
    }

    @Override
    public int count(Connection connection) {
        try ( Statement statement = connection.createStatement() ) {
            try ( ResultSet resultSet = statement.executeQuery(
                    "SHOW PROCESSLIST" ) ) {
                int count = 0;
                while ( resultSet.next() ) {
                    String state = resultSet.getString( "command" );
                    if ( "sleep".equalsIgnoreCase( state ) ) {
                        count++;
                    }
                }
                return count;
            }
        }
        catch ( SQLException e ) {
            throw new IllegalStateException( e );
        }
    }
}

Testing time

I built this utility so that we can track all the unit tests that are leaking connections in Hibernate ORM project. When running it against hibernate-core, I can easily spot the culprit tests:

:hibernate-core:test

org.hibernate.jpa.test.EntityManagerFactoryClosedTest > classMethod FAILED
    org.hibernate.testing.jdbc.leak.ConnectionLeakException

org.hibernate.jpa.test.EntityManagerFactoryUnwrapTest > classMethod FAILED
    org.hibernate.testing.jdbc.leak.ConnectionLeakException

org.hibernate.jpa.test.cdi.NoCdiAvailableTest > classMethod FAILED
    org.hibernate.testing.jdbc.leak.ConnectionLeakException

org.hibernate.jpa.test.factory.SynchronizationTypeTest > classMethod FAILED
    org.hibernate.testing.jdbc.leak.ConnectionLeakException

When I open the report for the EntityManagerFactoryClosedTest, I can even see how many connections are being leaked:

org.hibernate.testing.jdbc.leak.ConnectionLeakException: 1 connection(s) have been leaked! Previous leak count: 0, Current leak count: 1

SynchronizationTypeTest even indicates that there are previous connections leaks as well:

org.hibernate.testing.jdbc.leak.ConnectionLeakException: 1 connection(s) have been leaked! Previous leak count: 2, Current leak count: 3

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

Detecting connection leaks is a mandatory requirement for every enterprise application. While you can find scripts which run periodically and kill all idle database connections, this is just a band aid approach.

The best way to deal with connection leaks is to fix the underlying code base so that connections are always closed properly. To make sure that the production environment is connection leak free, every integration test must validate that no connection is being leaked by the actual tested source code or by the testing logic itself.

Subscribe to our Newsletter

* indicates required
10 000 readers have found this blog worth following!

If you subscribe to my newsletter, you'll get:
  • A free sample of my Video Course about running Integration tests at warp-speed using Docker and tmpfs
  • 3 chapters from my book, High-Performance Java Persistence, 
  • a 10% discount coupon for my book. 
Get the most out of your persistence layer!

Advertisements

9 thoughts on “The best way to detect database connection leaks

  1. I had once a connection leak in production. It was leaking slowly, once a day.
    I found it by subclassing a class from the Apache pool manager. Every time someone obtained a connection, I created an exception to capture the call stack, then I stored it in the context of the connection. Unlike the underlying connection manager, I was tracking in a lost the connections which I handed out.

    Any time a request came for a connections, I was checking the age of active connections in my list. If a connection was handed out more than an hour ago, I would show the stack trace in the application log.

    I promptly found the source line which asked for the connection without closing it. There was surely some overhead on my tracking, but I only had to run it for a day.

    1. There many other approaches. You can use FlexyPool as well and monitor connections.
      Also, closing connections after 5 or 10 minutes is good practice because you can clear the associated meta data as well. It’s important to avoid creating connections multiple times per second, not to try to keep a connections for hours or days.

  2. nice approach test could be extended to test under a pooling api as mostly is the case in this case the waiting count coming from v$session shall have a default waiting count between min pool and max pool size and if it extends beyond that we guess there is a leak lingering somewhere . I guess a healthcheck can be build in such vain to detect in live instances as sometimes there are other factors network and beyond which cause pooling issues in the real time .

  3. Pretty good post, Vlad!

    If I understood correctly, if there is at least a single idle connection open an exception will be thrown, right?

    Couldn’t this bring false-positive when running all tests? I ask this because when running my tests with Spring Testing I usually use the SingleConnectionDataSource which keep 1 single connection open during all tests.

    What do you think?

    1. We also have tests that require multiple connections because there are multiple concurrent threads that need to be coordinated. Other tests open multiple SessionFactory instances and check all sorts of things. Usually, we have like 5 connections per pool, but if connections are leaking, we can bump into issues where we can no longer run tests against a database which limits the maximum connection count to 10, for instance.

      1. I got it.

        Normally I use the SingleConnectionDataSource because it increases performance of the tests. The issue is we can’t detect connection leaks since we’re working always with the same connection.

        So, your suggestion is to configure the pool with 5 connections?

        When I suspect any kind of connection leak in the application, I normally turn on the c3p0.unreturnedConnectionTimeout and c3p0.debugUnreturnedConnectionStackTraces on C3P0 to detect where is the problem. What do you think?

      2. I think that investigating where the leaks are occurring is the only viable solution to this issue. Sure, those c3p0 properties can help, but that assumes that you can still have a connection leak until the timeout threshold is reached.

Leave a Reply

Your email address will not be published. Required fields are marked *