The best way to determine the optimal connection pool size

Are you struggling with performance issues in your Spring, Jakarta EE, or Java EE application?

What if there were a tool that could automatically detect what caused performance issues in your JPA and Hibernate data access layer?

Wouldn’t it be awesome to have such a tool to watch your application and prevent performance issues during development, long before they affect production systems?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, Micronaut, or Play Framework.

So, rather than fixing performance issues in your production system on a Saturday night, you are better off using Hypersistence Optimizer to help you prevent those issues so that you can spend your time on the things that you love!

Introduction

In this article, we are going to see the best way to determine the optimal connection pool size using the FlexyPool auto-incrementing pool strategy.

If you are unfamiliar with the reason why database applications need a connection pool, then check out this article first.

Now, according to the Universal Scalability Law, the maximum throughput of a database system is achieved at a certain number of database connections. If we increase the number of connections beyond that point, throughput starts to degrade.
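
To get an intuition for why throughput peaks, here is a minimal sketch of the Universal Scalability Law formula, X(N) = lambda * N / (1 + sigma * (N - 1) + kappa * N * (N - 1)), using hypothetical lambda (single-connection throughput), sigma (contention), and kappa (coherency) coefficients that, in practice, you would fit from your own benchmark measurements:

public class UniversalScalabilityLaw {

    // Hypothetical coefficients; in practice, fit them from load test measurements
    private static final double LAMBDA = 1000;   // throughput with a single connection (tx/s)
    private static final double SIGMA = 0.03;    // contention coefficient
    private static final double KAPPA = 0.0005;  // coherency (crosstalk) coefficient

    // X(N) = lambda * N / (1 + sigma * (N - 1) + kappa * N * (N - 1))
    static double throughput(int connections) {
        return (LAMBDA * connections) /
            (1 + SIGMA * (connections - 1) + KAPPA * connections * (connections - 1));
    }

    public static void main(String[] args) {
        int bestConnections = 1;
        double bestThroughput = 0;

        // Scan a range of connection counts and find the throughput peak
        for (int n = 1; n <= 100; n++) {
            double x = throughput(n);
            if (x > bestThroughput) {
                bestThroughput = x;
                bestConnections = n;
            }
        }

        System.out.printf(
            "Throughput peaks at %d connections (%.0f tx/s)%n",
            bestConnections,
            bestThroughput
        );
    }
}

The coefficients above are made up for illustration. The takeaway is that throughput climbs, peaks at some connection count, and then declines.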

So, with that in mind, our goal is to make sure that each application node configures its connection pool to use the right number of database connections so that, when we sum them all up, we don’t exceed the aforementioned optimal number of database connections. For example, if the optimal total is 40 connections and we run four application nodes, then each node’s pool should be capped at 10 connections.

Testing application

To demonstrate how FlexyPool can help us determine the optimal connection pool size, let’s consider that we have a transfer service that can move money from one account to another.

@Service
public class TransferService {

    private final AccountRepository accountRepository;

    public TransferService(
            @Autowired AccountRepository accountRepository) {
        this.accountRepository = accountRepository;
    }

    @Transactional(isolation = Isolation.REPEATABLE_READ)
    public void transfer(
            String sourceAccount,
            String destinationAccount,
            long amount) {
        if(accountRepository.getBalance(sourceAccount) >= amount) {
            // Withdraw the amount from the source account
            accountRepository.addToBalance(sourceAccount, (-1) * amount);

            // Deposit the amount into the destination account
            accountRepository.addToBalance(destinationAccount, amount);
        }
    }
}
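
The AccountRepository used by TransferService is not shown in the listing above. A minimal sketch, assuming a Spring Data JPA repository over a hypothetical Account entity with an id and a balance column, might look like this:

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;

// Hypothetical repository matching the calls made by TransferService;
// the actual Account entity mapping is not part of the original article.
public interface AccountRepository extends JpaRepository<Account, String> {

    @Query("select a.balance from Account a where a.id = :id")
    long getBalance(@Param("id") String id);

    @Modifying
    @Query("update Account a set a.balance = a.balance + :amount where a.id = :id")
    int addToBalance(@Param("id") String id, @Param("amount") long amount);
}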

To determine how many database connections will be used to execute 64 concurrent transfers, we can use the following test case:

long startNanos = System.nanoTime();
int threadCount = 64;

CountDownLatch startLatch = new CountDownLatch(1);
CountDownLatch endLatch = new CountDownLatch(threadCount);

for (int i = 0; i < threadCount; i++) {
    new Thread(() -> {
        try {
            startLatch.await();

            transferService.transfer("Alice-123", "Bob-456", 5L);
        } catch (Exception e) {
            LOGGER.error("Transfer failed", e);
        } finally {
            endLatch.countDown();
        }
    }).start();
}

LOGGER.info("Starting threads");
startLatch.countDown();
endLatch.await();

LOGGER.info(
    "The {} transfers were executed on {} database connections in {} ms",
    threadCount,
    hikariDataSource.getMaximumPoolSize(),
    TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos)
);

If we use the default settings, then HikariCP will use at most 10 connections:

HikariConfig hikariConfig = new HikariConfig();
hikariConfig.setDataSource(dataSourceProvider().dataSource());
hikariConfig.setAutoCommit(false);
HikariDataSource poolingDataSource = new HikariDataSource(hikariConfig);

When running the integration test case, we will get the following message printed into the log:

The 64 transfers were executed on 10 database connections in 149 ms

Because the number of threads executing the transfers is greater than the maximum pool size, HikariCP will quickly grow the pool to its maximum size of 10 database connections.
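
If you want to observe how the pool grows while the test is running, HikariCP exposes its runtime statistics via the HikariPoolMXBean interface. A minimal sketch, using the poolingDataSource defined above:

// Inspect the pool statistics exposed by HikariCP at runtime
HikariPoolMXBean poolMXBean = poolingDataSource.getHikariPoolMXBean();

LOGGER.info(
    "Active connections: {}, idle connections: {}, total connections: {}",
    poolMXBean.getActiveConnections(),
    poolMXBean.getIdleConnections(),
    poolMXBean.getTotalConnections()
);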

If we set the maximum pool size to match the number of transfer threads and rerun our test case:

hikariConfig.setMaximumPoolSize(64);

The total transfer time will increase since the contention now moves from the connection pool to the database, which has to serialize all the concurrent transactions according to the REPEATABLE_READ isolation level:

The 64 transfers were executed on 64 database connections in 272 ms

However, in our case, we can’t really afford to give this service 10 database connections since we may be running multiple other services in our system, and the database Primary node only allows a maximum of 48 connections.
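
As a side note, the server-side connection limit can be checked directly from Java. A minimal sketch, assuming the Primary node runs PostgreSQL and using placeholder connection details:

// Query the configured connection limit of the Primary node (placeholder credentials)
try (Connection connection = DriverManager.getConnection(
        "jdbc:postgresql://primary-node:5432/app", "app_user", "app_password");
     Statement statement = connection.createStatement();
     ResultSet resultSet = statement.executeQuery(
         "select current_setting('max_connections')")) {
    if (resultSet.next()) {
        LOGGER.info("Primary node max_connections: {}", resultSet.getString(1));
    }
}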

How to determine the optimal connection pool size

In order to determine the optimal connection pool size, we can use the FlexyPoolDataSource proxy to wrap the HikariDataSource and configure the IncrementPoolOnTimeoutConnectionAcquisitionStrategy to increase the size of the pool on demand:

HikariConfig hikariConfig = new HikariConfig();
hikariConfig.setDataSource(dataSourceProvider().dataSource());
hikariConfig.setAutoCommit(false);
hikariConfig.setMaximumPoolSize(1);
hikariConfig.setConnectionTimeout(100);
HikariDataSource poolingDataSource = new HikariDataSource(hikariConfig);

int maxOverflowPoolSize = 10;
int connectionAcquisitionThresholdMillis = 25;

FlexyPoolDataSource<HikariDataSource> dataSource = new FlexyPoolDataSource<>(
    new FlexyPoolConfiguration.Builder<>(
        getClass().getSimpleName(),
        poolingDataSource,
        HikariCPPoolAdapter.FACTORY)
    .build(),
    new IncrementPoolOnTimeoutConnectionAcquisitionStrategy.Factory<>(
        maxOverflowPoolSize,
        connectionAcquisitionThresholdMillis
    )
);

We first set the maximum pool size of HikariCP to a single connection, and if the connection acquisition takes longer than 25 milliseconds (i.e., connectionAcquisitionThresholdMillis), FlexyPool will increase the HikariCP pool size on demand by borrowing from an overflow buffer of at most 10 extra connections (i.e., maxOverflowPoolSize).

When running our test case with this FlexyPool configuration, we can see that only 4 database connections are needed to execute the 64 transfers:

The 64 transfers were executed on 4 database connections in 289 ms

So, let’s then replace the FlexyPool configuration with a simple HikariCP one that uses a maximum pool size of 4:

HikariConfig hikariConfig = new HikariConfig();
hikariConfig.setDataSource(dataSourceProvider().dataSource());
hikariConfig.setAutoCommit(false);
hikariConfig.setMaximumPoolSize(4);
HikariDataSource poolingDataSource = new HikariDataSource(hikariConfig);

And when rerunning our test case, we get the following results:

The 64 transfers were executed on 4 database connections in 128 ms

Awesome, right?

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

The FlexyPool IncrementPoolOnTimeoutConnectionAcquisitionStrategy allows us to discover the optimal pool size for a given use case while still accommodating sporadic connection acquisition delays (25 milliseconds in our case).

By using integration tests or functional tests driven by JMeter, we can determine the optimal number of connections our pool needs in order to process a given load in a timely manner.


Comments

  1. Very nice article, Vlad. Thanks for sharing it! 👏🏻👏🏻

    Just out of curiosity: Did you use any retrying strategy on the service method?

    Since your service method’s transaction is configured with a REPEATABLE_READ isolation level, we can assume that the bigger the connection pool, the greater the contention and the greater the odds of conflicts among those transactions.

    I mean, those conflicts (and retry attempts) might also negatively impact the execution time. Thus, the optimal limit of connections could be more related to the feature’s implementation than the database itself.

    • You’re welcome. The IncrementPoolOnTimeoutConnectionAcquisitionStrategy can increase the pool size when the acquisition takes longer than the specified threshold or when the connection acquisition times out. So, on timeout, there would be a retry.

      The bigger the pool, the more contention is on the transactions since the database needs to ensure that they serialize. However, there is no retry on failure, just the failure being logged.

      In the end, the optimal number of connections is driven by the average transaction response time. The lower the transaction time, the fewer connections are needed. The longer the response time, the more connections will be needed. So, the number of connections depends on both the hardware capabilities of the DB, which influence the response time, and the statements that are executed by the transaction.
