YugabyteDB Connection Pooling
Introduction
In this article, we are going to see the overhead of acquiring a new connection when using YugabyteDB and why connection pooling is mandatory for performance.
Acquiring a database connection using JDBC
To interact with a database system, first, we need to acquire a database connection. And, when using Java, we need to obtain a Connection object from the Driver prior to executing any statement.
Now, if you inspect the JDBC specification, you will see that there are two ways to obtain a database connection. You can either use the DataSource or the Driver:
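For instance, a minimal sketch of the two approaches could look like this (the JDBC URL and credentials are placeholders, and the dataSource is assumed to have been configured elsewhere):

// Physical connection, obtained via the DriverManager,
// which delegates to the registered PostgreSQL Driver
Connection driverConnection = DriverManager.getConnection(
    "jdbc:postgresql://localhost:5433/yugabyte",
    "yugabyte",
    "yugabyte"
);

// Logical connection, obtained via a DataSource
Connection dataSourceConnection = dataSource.getConnection();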
The difference between using the DataSource and the Driver is that the DataSource allows you to get a logical connection while the Driver always establishes a physical connection.
Unlike a physical connection, the logical connection that you acquire via the DataSource could have been created previously and pooled. That's the reason why most Java applications acquire database connections via the JDBC DataSource, as illustrated by the following example:
try (Connection connection = dataSource.getConnection()) { ... }
Because relational database systems use specific communication protocols, there are many database-specific JDBC Drivers, many of them being available on Maven Central.
However, as I explained in this article, YugabyteDB uses the PostgreSQL wire protocol, so we can use the standard PostgreSQL JDBC Driver to connect to a YugabyteDB database server.
And when using the PostgreSQL JDBC Driver, the connection acquisition process looks as follows:
The PGSimpleDataSource, which is the default DataSource implementation provided by the PostgreSQL JDBC Driver, uses the underlying Driver to acquire a physical connection, which establishes a TCP connection to the database server.
And when the close method is called on the JDBC Connection object, the underlying Socket and TCP connection are terminated.
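For example, a minimal sketch of configuring the PGSimpleDataSource against a local YugabyteDB node could look like this (the host, port, database, and credentials are placeholders):

PGSimpleDataSource dataSource = new PGSimpleDataSource();
// YugabyteDB speaks the PostgreSQL wire protocol,
// so the standard PostgreSQL JDBC URL format applies
dataSource.setURL("jdbc:postgresql://localhost:5433/yugabyte");
dataSource.setUser("yugabyte");
dataSource.setPassword("yugabyte");

// Every getConnection() call opens a brand-new physical (TCP) connection,
// and close() tears it down again
try (Connection connection = dataSource.getConnection()) {
    ...
}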
Connection acquisition overhead
Even without measuring it, we can intuit that establishing a physical connection to the database is not going to be a very lightweight process.
However, it’s very easy to test how long it takes to acquire and release a database connection when using a database system, as illustrated by the following test case:
MetricRegistry metricRegistry = new MetricRegistry();
Timer timer = metricRegistry.timer("connectionTimer");

Slf4jReporter logReporter = Slf4jReporter
    .forRegistry(metricRegistry)
    .outputTo(LOGGER)
    .build();

for (int i = 0; i < connectionAcquisitionCount; i++) {
    long startNanos = System.nanoTime();
    try (Connection connection = dataSource.getConnection()) {}
    timer.update(
        System.nanoTime() - startNanos,
        TimeUnit.NANOSECONDS
    );
}

logReporter.report();
The MetricRegistry, Timer, and Slf4jReporter are from the Dropwizard Metrics library, which makes it very easy to collect metrics data.
In our case, we measure how long it takes to acquire and release a JDBC Connection since the try-with-resources block is going to automatically call the close method on the Connection object in the finally block that's implicitly added by the Java compiler.
When running this test case on a local YugabyteDB server running on Docker, we get the following metrics printed into the log:
type=TIMER, name=connectionTimer, count=1000, min=27.587, max=166.9311, mean=45.017248196282914, stddev=14.85679839150412, median=41.11, p75=47.1509, p95=71.5313, p98=97.5146, p99=117.7868, p999=144.9534, mean_rate=18.406102362122116, m1=10.930380581381012, m5=2.809568612688502, m15=0.9789277818057495, rate_unit=events/second, duration_unit=milliseconds
The metrics show us that the connection acquisition and release overhead takes between 27 and 166 milliseconds, which is a lot.
These values have nothing to do with YugabyteDB since similar results are measured when running the test case on PostgreSQL:
type=TIMER, name=connectionTimer, count=1000, min=69.5914, max=233.05349999999999, mean=90.92054062728442, stddev=16.03686125566243, median=87.3543, p75=96.8926, p95=118.5624, p98=134.3056, p99=145.7051, p999=233.05349999999999, mean_rate=9.916890350062024, m1=8.498321932375394, m5=2.85664417766656, m15=1.0496401169547016, rate_unit=events/second, duration_unit=milliseconds
In this case, the connection acquisition and release overhead on my local PostgreSQL instance ranges from 69 to 233 milliseconds.
If the database server is located on a different machine, the connection acquisition overhead is significantly larger since the catalog metadata needs to be fetched from the YugabyteDB yb-master server and cached on a per-connection basis.
And that’s one reason why we are using database connection pooling.
Connection pooling
Now, if we change our test to use the HikariDataSource from HikariCP:
HikariConfig config = new HikariConfig();
config.setJdbcUrl(dataSourceProvider().url());
config.setUsername(dataSourceProvider().username());
config.setPassword(dataSourceProvider().password());

HikariDataSource datasource = new HikariDataSource(config);
When rerunning our test case, we get the following results, no matter what database we are using:
type=TIMER, name=connectionTimer, count=1000, min=0.0019, max=0.0475, mean=0.004818100000000085, stddev=0.0037918099095287277, median=0.0042, p75=0.0049, p95=0.0104, p98=0.020399999999999998, p99=0.0219, p999=0.027999999999999997, mean_rate=339.58217740975067, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second, duration_unit=milliseconds
Because the physical connections were already acquired during the test warm-up phase and there was always a pooled connection available, the connection acquisition and release overhead took between 1.9 and 47.5 microseconds.
Amazing, right?
To understand why it is so much faster to acquire and release a logical connection from a pool than to acquire and release a physical connection from the Driver, let's analyze the connection acquisition request flow when using a connection pool:
If no connection is available, but the pool hasn’t reached its maximum size, a new physical connection is acquired and wrapped by the pool in a logical connection that’s stored in the pool.
If connections are available, then the acquisition consists of marking the currently running Thread as the borrower of an existing logical connection.
When the close method is called on the logical connection, the underlying physical connection is not closed. Instead, the currently running Thread is simply disassociated from the logical connection.
If no connection is available and the pool has reached its maximum size, the connection request can wait for a limited amount of time to see if a connection is going to be released. Adding a timeout is very important as it allows you to reject incoming traffic when your system is running out of resources.
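To make this flow more concrete, here's a minimal, simplified sketch of how a pool could hand out connections (the ConnectionPool class below is purely illustrative; real pools like HikariCP wrap the physical connection in a logical proxy whose close method returns it to the pool, and are far more sophisticated):

import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import javax.sql.DataSource;

public class ConnectionPool {

    private final DataSource physicalDataSource;
    private final int maxSize;
    private final long timeoutMillis;

    // Idle physical connections waiting to be borrowed
    private final BlockingQueue<Connection> idleConnections;

    // Total number of physical connections created so far
    private final AtomicInteger createdConnections = new AtomicInteger();

    public ConnectionPool(DataSource physicalDataSource, int maxSize, long timeoutMillis) {
        this.physicalDataSource = physicalDataSource;
        this.maxSize = maxSize;
        this.timeoutMillis = timeoutMillis;
        this.idleConnections = new ArrayBlockingQueue<>(maxSize);
    }

    public Connection acquire() throws SQLException, InterruptedException {
        // 1. If an idle connection is available, the borrower gets it right away
        Connection connection = idleConnections.poll();
        if (connection != null) {
            return connection;
        }
        // 2. Otherwise, if the pool hasn't reached its maximum size,
        //    a new physical connection is established and handed out
        if (createdConnections.incrementAndGet() <= maxSize) {
            return physicalDataSource.getConnection();
        }
        createdConnections.decrementAndGet();
        // 3. Otherwise, wait a limited amount of time for a connection to be released
        connection = idleConnections.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (connection == null) {
            throw new SQLException("Connection acquisition request timed out");
        }
        return connection;
    }

    public void release(Connection connection) {
        // Instead of closing the physical connection, return it to the pool
        idleConnections.offer(connection);
    }
}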
YugabyteDB maximum connection number
To see how many physical connections can be acquired on a given YugabyteDB server, you can run the following command:
SHOW max_connections;
When running this command on my local YugabyteDB server, I get the following result:
| max_connections |
|-----------------|
| 300             |
While YugabyteDB limits the maximum number of connections to 300, the maximum throughput of a system may be reached with a connection count that's lower than that limit.
As I explained in this article, the relationship between throughput and the number of connections is given by the Universal Scalability Law.
Once you have determined the maximum number of connections that yields the best throughput, you need to use connection pooling to make sure you limit the number of connections a given application can acquire.
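For instance, with HikariCP, you could cap the pool size and the connection acquisition timeout like this (the values below are placeholders; the right pool size depends on your own throughput measurements):

HikariConfig config = new HikariConfig();
config.setJdbcUrl(dataSourceProvider().url());
config.setUsername(dataSourceProvider().username());
config.setPassword(dataSourceProvider().password());
// Cap the number of physical connections this application can acquire
config.setMaximumPoolSize(16);
// Fail fast instead of waiting indefinitely when the pool is exhausted
config.setConnectionTimeout(TimeUnit.SECONDS.toMillis(5));

HikariDataSource dataSource = new HikariDataSource(config);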
The max_connections setting is per yb-tserver, so connections can scale by adding more tablet servers. When using the YugabyteDB-specific YBClusterAwareDataSource JDBC Driver, the connection requests can be balanced across all available yb-tserver nodes that were discovered when establishing the very first connection, as explained in the YugabyteDB documentation.
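A minimal sketch of using the cluster-aware DataSource could look like this (the URL format, the load-balance property, and the credentials are my assumptions based on the YugabyteDB documentation, so double-check them against the driver version you are using):

YBClusterAwareDataSource dataSource = new YBClusterAwareDataSource();
// After the very first connection, the smart driver discovers the other
// yb-tserver nodes and balances new connection requests across them
dataSource.setUrl("jdbc:yugabytedb://127.0.0.1:5433/yugabyte?load-balance=true");
dataSource.setUser("yugabyte");
dataSource.setPassword("yugabyte");

try (Connection connection = dataSource.getConnection()) {
    ...
}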
Conclusion
The overhead of acquiring and releasing physical connections is significant, no matter what database system you are using.
When using YugabyteDB, it’s a good idea to use a connection pooling solution that allows you to reuse logical connections for longer than just a single transaction.
While using a client-side pooling solution, like HikariCP, is a very popular option, there are other alternatives as well.
A tool like PgBouncer can be used as a proxy between the applications and the database server. The advantage of PgBouncer is that it can provide connections dynamically to the applications that need them the most, freeing you from deciding what is the best maximum pool size to configure on a per-application basis.
Using proxy servers like PgBouncer is especially useful for microservice architectures. If the connection pool maximum size is configured at bootstrap time, we may end up using either too many or too few connections per microservice.
On the other hand, when using PgBouncer, the application nodes no longer need to use a local connection pool since the physical connections are cached at the PgBouncer level, and the only connection the application establishes is a simple TCP connection to the PgBouncer server.
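From the application's point of view, this just means pointing the JDBC URL at the PgBouncer host and port instead of the database server, as in the following sketch (the host, port, and credentials are placeholders; 6432 is merely PgBouncer's usual default listen port):

// The application connects to PgBouncer, which holds the pooled physical
// connections to the YugabyteDB server behind it
Connection connection = DriverManager.getConnection(
    "jdbc:postgresql://pgbouncer-host:6432/yugabyte",
    "yugabyte",
    "yugabyte"
);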
This research was funded by Yugabyte and conducted in accordance with the blog ethics policy.
While the article was written independently and reflects entirely my opinions and conclusions, the amount of work involved in making this article happen was compensated by Yugabyte.
