Seize the deal!
Caching Best Practices
Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!
All database tables must have one primary key column. The primary key uniquely identifies a row within a table therefore it’s bound by the following constraints:
When choosing a primary key we must take into consideration the following aspects:
When choosing a primary key generator strategy the options are:
Natural key uniqueness is enforced by external factors (e.g. person unique identifiers, social security numbers, vehicle identification numbers).
Natural keys are convenient because they have an outside world equivalent and they don’t require any extra database processing. We can, therefore, know the primary key even before inserting the actual row into the database, which simplifies batch inserts.
If the natural key is a single numeric value the performance is comparable to that of surrogate keys.
Non-numerical keys are less efficient than numeric ones (integer, bigint), for both indexing and joining. A CHAR(17) natural key (e.g. vehicle identification number) occupies 17 bytes as opposed to 4 bytes (32 bit integer) or 8 bytes (64 bit bigint).
The initial schema design uniqueness assumptions may not forever hold true. Let’s say we’d used one specific country citizen numeric code for identifying all application users. If we now need to support other countries that don’t have such citizen numeric code or the code clashed with existing entries, then we can conclude that the schema evolution is possibly hindered.
If the natural key uniqueness constraints change it’s going to be very difficult to update both the primary keys (if we manage to drop the primary key constraints anyway) and all associated foreign key relationships.
Surrogate keys are generated independently of the current row data, so the other column constraints may freely evolve according to the application business requirements.
The database system may manage the surrogate key generation and most often the key is of a numeric type (e.g. integer or bigint), is incremented whenever there is a need for a new key.
If we want to control the surrogate key generation we can employ a 128-bit GUID or UUID. This simplifies batching and may improve the insert performance since the additional database key generation processing is no longer required. However, being larger than an autp-incrementing number, the choice of a UUID identifier is not without drawbacks. Also, for clustered indexes, which are the default on MySQL and SQL Server, a random-generated identifier will require more cluster index re-balancing , might work against pre-allocating index entries and can bloat the index.
When the database identifier generation responsibility falls to the database system, there are several strategies for auto-incrementing surrogate keys:
|Database engine||Auto incrementing strategy|
|Oracle||SEQUENCE, IDENTITY (Oracle 12c)|
|SQL Server||IDENTITY, SEQUENCE (SQL Server 2012)|
|PostgreSQL||SEQUENCE, SERIAL TYPE|
If you enjoyed this article, I bet you are going to love my upcoming Online Workshops!
- Caching Best Practices with JPA and Hibernate (2.5 hours) on the 30th of September
- High-Performance SQL (4 hours) on the 6th of October in collaboration with Voxxed Days Ticino
- High-Performance SQL (12 hours) starting on the 28th of October in collaboration with Bouvet
Because sequences may be called concurrently from different transactions they are usually transaction-less.
|Oracle||When a sequence number is generated, the sequence is incremented, independent of the transaction committing or rolling back|
|SQL Server||Sequence numbers are generated outside the scope of the current transaction. They are consumed whether the transaction using the sequence number is committed or rolled back|
|PostgreSQL||Because sequences are non-transactional, changes made by setval are not undone if the transaction rolls back|
Both the IDENTITY type and the SEQUENCE generator are defined by the SQL:2003 standard, so they’ve become the standard primary key generator strategies.
Some database engines allow you to choose between IDENTITY and SEQUENCE so you have to decide which one better suits your current schema requirements.
Hibernate disables JDBC insert batching when using the IDENTITY generator strategy.