How to get started with CockroachDB

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

CockroachDB is a really interesting database system, getting the best of both RDBMS and NoSQL. It’s been developed by former Google developers, and it’s inspired by Google Spanner. However, unlike Google Spanner, which is offered as a service in Google Cloud, CockroachDB is an open-source database that can be installed on premise.

Also, CockroackDB allows you to use the PostgreSQL drivers as opposed to Spanner which only supports the gRPC protocol. So, you can practically reuse all the frameworks that have emerged in the Java ecosystem like connection pools, monitoring proxies (e.g. FlexyPool) or data access frameworks (e.g. Hibernate).

Unlike a typical RDBMS, CockroachDB is designed to be globally distributed and strongly resilient to disasters, hence its very unusual name. But what’s really exciting about CockroachDB is its non-locking timestamp ordering concurrency control mechanism which allows CockroachDB to run in Serializable isolation level, even if the database is massively distributed. Intriguing, right?

Installing CockroachDB

Using Docker is probably the easiest way to start working with CockroachDB.

So, you need to pull the latest image from Docker Hub:

> docker pull cockroachdb/cockroach:beta-20170406

Then, you need to create a new network bridge since you might want to run multiple nodes on your cluster:

> docker network create -d bridge roachnet

For our examples, we only need to run aa single node, so to create a new Docker container you need to run the following command:

docker run -d ^
 --name=roach1 ^
 --hostname=roach1 ^
 --net=roachnet ^
 -p 26257:26257 -p 8080:8080  ^
 cockroachdb/cockroach start --insecure

In the command above, I ussed the Windows-style ^ multi-line separator. For Linux, use \ instead.

If you run docker ps -a, you can now see the CockroachDB Docker container:

> docker ps -as

CONTAINER ID  IMAGE                         COMMAND                 CREATED         STATUS         PORTS                                             NAMES
753d8ebddd02  cockroachdb/cockroach         "/cockroach/cockro..."  28 seconds ago  Up 27 seconds  0.0.0.0:8080->8080/tcp, 0.0.0.0:26257->26257/tcp  roach1

No, we need to set up our database. For this, you need to start a bash process inside the Docker container we just created and started:

> docker exec -it roach1 /bin/bash

Once you get access to the CockroachDB Linux OS host, you can add a new database user as follows:

root@roach1:/cockroach# ./cockroach user set cockroach --password --insecure

Enter password:
Confirm password:
INSERT 1

You are prompted for a password after you type the ./cockroach user set cockroach --password command.

Now, you can see your newly created user as follows:

root@roach1:/cockroach# ./cockroach user ls --insecure

+-----------+
| username  |
+-----------+
| cockroach |
+-----------+
(1 row)

Now, you need to create a database like this:

root@roach1:/cockroach# ./cockroach sql -e 'CREATE DATABASE high_performance_java_persistence' --insecure

and grant all privileges to the cockroach user we previously created.

root@roach1:/cockroach# ./cockroach sql -e 'GRANT ALL ON DATABASE high_performance_java_persistence TO cockroach' --insecure

And we are done!

Using JDBC

Now, to make things really interesting, let me configure the High-Performance Java Persistence book GitHub repository to support CockroachDB.

Since it speaks the PostgreSQL protocol, we can use a PostgreSQL JDBC Driver with CockroachDB.

All I need to do is to create a new PGSimpleDataSource like this:

public DataSource dataSource() {
    PGSimpleDataSource dataSource = 
        new PGSimpleDataSource();

    dataSource.setDatabaseName( 
        "high_performance_java_persistence" 
    );
    dataSource.setServerName( host() );
    dataSource.setPortNumber( port() );
    dataSource.setUser( username() );
    dataSource.setPassword( password() );
    dataSource.setSsl( false );

    return dataSource;
}

I can even use HikariCP as a Connection Pooling solution for CockroachDB:

protected HikariDataSource connectionPoolDataSource(
    DataSource dataSource) {

    HikariConfig hikariConfig = new HikariConfig();
    int cpuCores = Runtime.getRuntime().availableProcessors();
    hikariConfig.setMaximumPoolSize(cpuCores * 4);
    hikariConfig.setDataSource(dataSource);
    return new HikariDataSource(hikariConfig);
}

Due to a known issue that will be fixed in the next Beta release, I had to set up the custom isolation level to make it work with HikariCP:

properties.put(
    "hibernate.hikari.transactionIsolation", 
    "TRANSACTION_SERIALIZABLE"
);

Using Hibernate

Not only that we can use JDBC with CockroachDB, but we can even use Hibernate with it. The PostgreSQL82Dialect is much closer to CockroachDB since it does not offer JSON types, SKIP_LOCKED, CTE (Common Table Expressions) or MATERIALIZED VIEWS to justify any newer PostgreSQL Dialect version.

In fact, even the PostgreSQL82Dialect is not a drop-in replacement since it comes with additional SQL functions (e.g. chr, str), and pessimistic locking support as well. However, for simplicity sake, we can just extend the PostgreSQL82Dialect and simply replace the most common data types:

public class CockroachDBDialect extends PostgreSQL82Dialect {

    public CockroachDBDialect() {
        super();
        registerColumnType( Types.SMALLINT, "smallint" );
        registerColumnType( Types.TINYINT, "smallint" );
        registerColumnType( Types.INTEGER, "integer" );

        registerColumnType( Types.FLOAT, "double precision" );
        registerColumnType( Types.DOUBLE, "double precision" );

        registerColumnType( Types.BLOB, "blob" );
        registerColumnType( Types.OTHER, "interval" );
    }
}

Even if the CockroachDBDialect will get you started, in reality, a new Dialect needs to be created from scratch since and define the exact set of features that CockroachDB offers.

Crafting an entity

Although there is a Hibernate tutorial on the CockroachDB website, the entity mapping they offered is using an assigned identifier, which is more of an exception than a rule.

So, our entity looks like this:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue(
        strategy = GenerationType.IDENTITY
    )
    private Long id;

    @Column
    @Temporal(TemporalType.DATE)
    private Date createdOn;

    private String title;

    //Getters and setters omitted for brevity
}

Now, while using a database SEQUENCE is the best identifier strategy when using Hibernate, there is no such construct in CockroachDB. However, CockroachDB offers a SERIAL type, just like PostgreSQL, which uses a globally distributed generation strategy.

Unfortunately, due to a limitation of how Hibernate manages identifiers, the use of the IDENTITY generator can disable batch inserts. Read this article for more details on this topic. However, at some point, we’d really need to provide a solution to this limitation.

The createdOn is a DATE type attribute which we will use to order entities in our queries.

For this entity, Hibernate generates the following database table:

CREATE TABLE post
(
  id bigserial NOT NULL,
  createdOn DATE,
  title VARCHAR(255),
  PRIMARY KEY ( id )
)

Adding some entities

Let’s now add 10 entities:

LocalDate startDate = LocalDate.of( 2016, 11, 2 );

for ( int offset = 0; offset < 10; offset++ ) {
    Post post = new Post();
    post.setTitle(
        String.format(
            "High-Performance Java Persistence, Review %d",
            offset
        )
    );
    post.setCreatedOn(
        Date.from( startDate
            .plusDays( offset )
            .atStartOfDay( ZoneId.of( "UTC" ) )
            .toInstant()
        )
    );
    entityManager.persist( post );
}

Which generate typical SQL INSERT statements:

INSERT INTO post ('2016-11-02', 'High-Performance Java Persistence, Review 0')
INSERT INTO post ('2016-11-03', 'High-Performance Java Persistence, Review 1')
INSERT INTO post ('2016-11-04', 'High-Performance Java Persistence, Review 2')
INSERT INTO post ('2016-11-05', 'High-Performance Java Persistence, Review 3')
INSERT INTO post ('2016-11-06', 'High-Performance Java Persistence, Review 4')
INSERT INTO post ('2016-11-07', 'High-Performance Java Persistence, Review 5')
INSERT INTO post ('2016-11-08', 'High-Performance Java Persistence, Review 6')
INSERT INTO post ('2016-11-09', 'High-Performance Java Persistence, Review 7')
INSERT INTO post ('2016-11-10', 'High-Performance Java Persistence, Review 8')
INSERT INTO post ('2016-11-11', 'High-Performance Java Persistence, Review 9')

Query entities

Assuming we run the following JPQL query:

List<Post> posts = entityManager.createQuery(
    "select p " +
    "from Post p " +
    "order by p.createdOn", Post.class )
.setMaxResults( 5 )
.getResultList();

assertEquals( 5, posts.size() );

Hibernate generates the proper PostgreSQL SELECT query:

SELECT p.id AS id1_0_,
       p.createdOn AS createdO2_0_,
       p.title AS title3_0_
FROM   post p
ORDER BY p.createdOn
LIMIT 5 

And CockroachDB provides the proper result set.

I'm running an online workshop on the 20-21 and 23-24 of November about High-Performance Java Persistence.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

Compared to a NoSQL database like MongoDB or Cassandra, or even a NewSQL database like VoltDB, CockroachDB allows you to reuse all the frameworks that have been developed in the Java ecosystem. This is very convenient both for developers and for the business since the adoption process is going to be smoother than when the data access paradigm shifts considerably.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.