The hi/lo algorithm


In my previous post I talked about various database identifier strategies, you need to be aware of when designing the database model. We concluded that database sequences are very convenient because they are both flexible and efficient for most use cases.

But even with cached sequences, the application requires a database round-trip for every new the sequence value. If your applications demand a high number of insert operations per transaction, the sequence allocation may be optimized with a hi/lo algorithm.

The hi/lo algorithm

The hi/lo algorithms split the sequences domain into “hi” groups. A “hi” value is assigned synchronously. Every “hi” group is given a maximum number of “lo” entries, that can by assigned off-line without worrying about concurrent duplicate entries.

  1. The “hi” token is assigned by the database, and two concurrent calls are guaranteed to see unique consecutive values
  2. Once a “hi” token is retrieved we only need the “incrementSize” (the number of “lo” entries)
  3. The identifiers range is given by the following formula:
  4. [(hi -1) * incrementSize) + 1, (hi * incrementSize) + 1)

    and the “lo” value will be taken from:

    [0, incrementSize)

    starting from

    [(hi -1) * incrementSize) + 1)

  5. When all “lo” values are used, a new “hi” value is fetched and the cycle continues

Here you can have an example of two concurrent transactions, each one inserting multiple entities:


Testing the theory

If we have the following entity:

public class Hilo {

    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "hilo_sequence_generator")
            name = "hilo_sequence_generator",
            strategy = "",
            parameters = {
                    @Parameter(name = "sequence_name", value = "hilo_seqeunce"),
                    @Parameter(name = "initial_value", value = "1"),
                    @Parameter(name = "increment_size", value = "3"),
                    @Parameter(name = "optimizer", value = "hilo")
    private Long id;

We can check how many database sequence round-trips are issued when inserting multiple entities:

public void testHiloIdentifierGenerator() {
	doInTransaction(new TransactionCallable<Void>() {
		public Void execute(Session session) {
			for(int i = 0; i < 8; i++) {
				Hilo hilo = new Hilo();
			return null;

Which end-ups generating the following SQL queries:

Query:{[call next value for hilo_seqeunce][]} 
Query:{[insert into Hilo (id) values (?)][1]} 
Query:{[insert into Hilo (id) values (?)][2]} 
Query:{[insert into Hilo (id) values (?)][3]} 
Query:{[call next value for hilo_seqeunce][]} 
Query:{[insert into Hilo (id) values (?)][4]} 
Query:{[insert into Hilo (id) values (?)][5]} 
Query:{[insert into Hilo (id) values (?)][6]} 
Query:{[call next value for hilo_seqeunce][]} 
Query:{[insert into Hilo (id) values (?)][7]} 
Query:{[insert into Hilo (id) values (?)][8]} 

If you enjoyed this article, I bet you are going to love my book as well.

As you can see we have only 3 sequence calls for 8 inserted entities. The more entity inserts a transaction will we require the better the performance gain we’ll obtain from reducing the database sequence round-trips.

If you liked this article, you might want to subscribe to my newsletter too.


12 thoughts on “The hi/lo algorithm

      1. Thank You. Which githubrepo?
        I added your class Hilo to the chunk_CSV_database project of jee7 samples. Using wildfly 8.2.1final.

  1. We can (I hope) get it easier. According to Pro JPA 2 Mastering the JavaTM Persistence API:


    @SequenceGenerator(name=”Emp_Gen”, sequenceName=”Emp_Seq”)
    @Id @GeneratedValue(generator=”Emp_Gen”)
    private int getId;

    The initial value and allocation size can also be used in sequence generators and would need to be reflected in the SQL to create the sequence. We can see that the default allocation size is 50, just as it is with table generators. If schema generation is not being used, and the sequence is being manually created, the INCREMENT BY clause would need to be configured to match the setting or default value of the allocation size.

    I have not tested it yet, but according to the documentation we should not apply this additional hibernate-specific options to support allocation size with sequences, like we have with table generator.

    1. The table and the sequence generators take the same configurations and optimizer settings. In fact, the table generator tries to emulate a sequence, but it does it very poorly since it uses row-level locks and separate transactions which are heavyweight from a database performance perspective.

  2. Hello,

    Are you aware of this issue:


    It still occurs when using hilo in distributed setups. It seems that there is a fix but the issue remains open…

  3. Hi,

    This is my first feedback after trying version 5.2.10 on a distributed setup using Spring Batch remote partitioning. More specifically I end up having 3 different machines/JVMs bulk inserting into 2 tables having one to many relationship.

    I do not experience now any issue and regarding the performance I guess is… quite good: 10 million records in ~5 minutes (the time includes reading from a source able A with Spring batch reader, processing the items an action that involves querying another table B, and finally writing to destination tables C and D which are the ones having the parent-child one-to-many relationship). Of course all these after following your guidelines for proper batching, how to map properly one-to-many relationships etc.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s