# Introduction

In this post, we’ll uncover a sequence identifier generator combining identifier assignment efficiency and interoperability with other external systems (concurrently accessing the underlying database system).

Traditionally there have been two sequence identifier strategies to choose from.

• The sequence identifier, always hitting the database for every new value assignment. Even with database sequence preallocation we have a significant database round-trip cost.
• The seqhilo identifier, using the hi/lo algorithm. This generator calculates some identifier values in-memory, therefore reducing the database round-trip calls. The problem with this optimization technique is that the current database sequence value no longer reflects the current highest in-memory generated value. The database sequence is used as a bucket number, making it difficult for other systems to interoperate with the database table in question. Other applications must know the inner-workings of the hi/lo identifier strategy to properly generate non-clashing identifiers.

# The enhanced identifiers

Hibernate offers a new class of identifier generators, addressing many shortcomings of the original ones. The enhanced identifier generators don’t come with a fixed identifier allocation strategy. The optimization strategy is configurable and we can even supply our own optimization implementation. By default Hibernate comes with the following built-in optimizers:

• none: every identifier is fetched from the database, so it’s equivalent to the original sequence generator.
• hi/lo: it uses the hi/lo algorithm and it’s equivalent to the original seqhilo generator.
• pooled: This optimizer uses a hi/lo optimization strategy, but the current in-memory identifiers highest boundary is extracted from an actual database sequence value.
• pooled-lo: It’s similar to the pooled optimizer but the database sequence value is used as the current in-memory lowest boundary

# JPA identifier generators

JPA defines the following identifier strategies:

Strategy Description
AUTO The persistence provider picks the most appropriate identifier strategy supported by the underlying database
IDENTITY Identifiers are assigned by a database IDENTITY column
SEQUENCE The persistence provider uses a database sequence for generating identifiers
TABLE The persistence provider uses a separate database table to emulate a sequence object

In my previous post I exampled the pros and cons of all these surrogate identifier strategies.

# Introduction

In my previous post I talked about various database identifier strategies, you need to be aware of when designing the database model. We concluded that database sequences are very convenient because they are both flexible and efficient for most use cases.

But even with cached sequences, the application requires a database round-trip for every new the sequence value. If your applications demand a high number of insert operations per transaction, the sequence allocation may be optimized with a hi/lo algorithm.

# The hi/lo algorithm

The hi/lo algorithms split the sequences domain into “hi” groups. A “hi” value is assigned synchronously. Every “hi” group is given a maximum number of “lo” entries, that can by assigned off-line without worrying about concurrent duplicate entries.

1. The “hi” token is assigned by the database, and two concurrent calls are guaranteed to see unique consecutive values
2. Once a “hi” token is retrieved we only need the “incrementSize” (the number of “lo” entries)
3. The identifiers range is given by the following formula:
4. $[(hi -1) * incrementSize) + 1, (hi * incrementSize) + 1)$

and the “lo” value will be taken from:

$[0, incrementSize)$

starting from

$[(hi -1) * incrementSize) + 1)$

5. When all “lo” values are used, a new “hi” value is fetched and the cycle continues

Here you can have an example of two concurrent transactions, each one inserting multiple entities: