How to map table rows to columns using SQL PIVOT or CASE expressions

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

While reading the wonderful SQL Antipatterns book by Bill Karwin, which is a great reference for any developer that needs to interact with a Relational Database System, I found an example where the SQL PIVOT clause would work like a charm.

In this post, I’m going to explain how to transpose a ResultSet using PIVOT so that rows become columns.

Domain Model

As an exercise, let’s imagine that our enterprise system is comprised of lots of Microservices which need to be configured on-demand. While each Microservice comes with its own configuration, we might need to configure them dynamically so that we increase the connection pool size, or make them switch from one database node to another.

For this purpose, we could have a dedicated database schema which serves configuration metadata to our Microservices. If a service reads a component property from the database, that value overrides the default value that was defined at built-time.

The metadata database schema looks as follows:

service_component_property

As you can see, the Property table is using an EAV model while using Foreign Keys for the Service and Component references.

Projecting the Component properties

When way to query a given component property across multiple services is to execute the following SQL query:

List<Object[]> componentProperties = entityManager
.createNativeQuery(
    "SELECT " +
    "   p.service_name AS serviceName, " +
    "   p.component_name AS componentName, " +
    "   p.property_name, " +
    "   p.property_value " +
    "FROM Property p " +
    "WHERE " +
    "   p.component_name = :name")
.setParameter("name", "dataSource")
.getResultList();

However, the ResultSet looks as follows:

componentProperties = {java.util.ArrayList@4968}  size = 8
 0 = {java.lang.Object[4]@4971} 
  0 = "Apollo"
  1 = "dataSource"
  2 = "databaseName"
  3 = "high_performance_java_persistence"
 1 = {java.lang.Object[4]@4972} 
  0 = "Artemis"
  1 = "dataSource"
  2 = "databaseName"
  3 = "high_performance_java_persistence"
 2 = {java.lang.Object[4]@4973} 
  0 = "Apollo"
  1 = "dataSource"
  2 = "password"
  3 = "admin"
 3 = {java.lang.Object[4]@4974} 
  0 = "Artemis"
  1 = "dataSource"
  2 = "password"
  3 = "admin"
 4 = {java.lang.Object[4]@4975} 
  0 = "Apollo"
  1 = "dataSource"
  2 = "serverName"
  3 = "192.168.0.5"
 5 = {java.lang.Object[4]@4976} 
  0 = "Artemis"
  1 = "dataSource"
  2 = "url"
  3 = "jdbc:oracle:thin:@192.169.0.6:1521/hpjp"
 6 = {java.lang.Object[4]@4977} 
  0 = "Apollo"
  1 = "dataSource"
  2 = "username"
  3 = "postgres"
 7 = {java.lang.Object[4]@4978} 
  0 = "Artemis"
  1 = "dataSource"
  2 = "username"
  3 = "oracle"

We don’t want to deal with an Object[] array, and we’d rather use a DataSourceConfiguration DTO to store all the DataSource information associated with a given service.

DataSourceConfiguration entity

If the sqlQuery is a String variable, then we can map its ResultSet to the DataSourceConfiguration DTO like this:

List<DataSourceConfiguration> dataSources = entityManager
.createNativeQuery(sqlQuery)
.setParameter("name", "dataSource")
.unwrap(Query.class)
.setResultTransformer(
    Transformers.aliasToBean(
        DataSourceConfiguration.class
    )
)
.getResultList();

You can find more details about fetching DTO projections with JPA and Hibernate in this article.

Bill Karwin’s query

In his book, Bill Karwin propose the following SQL query to transpose the ResultSet rows to columns:

SELECT DISTINCT 
   userName.service_name AS "serviceName", 
   c.name AS "componentName", 
   databaseName.property_value AS "databaseName", 
   url.property_value AS "url", 
   serverName.property_value AS "serverName", 
   userName.property_value AS "userName", 
   password.property_value AS "password" 
FROM Component c 
LEFT JOIN Property databaseName 
   ON databaseName.component_name = c.name AND 
      databaseName.property_name = 'databaseName' 
LEFT JOIN Property url 
   ON url.component_name = c.name AND 
      url.property_name = 'url' 
LEFT JOIN Property serverName 
   ON serverName.component_name = c.name AND 
      serverName.property_name = 'serverName' 
LEFT JOIN Property userName 
   ON userName.component_name = c.name AND 
      userName.property_name = 'username' 
LEFT JOIN Property password 
   ON password.component_name = c.name AND 
      password.property_name = 'password' 
WHERE 
   c.name = :name

Well, first of all, this query does not really render the expected result because properties get mixed between different services:

dataSources = {java.util.ArrayList@4990}  size = 2
 0 = {com.vladmihalcea.book.hpjp.hibernate.query.pivot.DataSourceConfiguration@4991} 
  serviceName = "Apollo"
  componentName = "dataSource"
  databaseName = "high_performance_java_persistence"
  url = "jdbc:oracle:thin:@192.169.0.6:1521/hpjp"
  serverName = "192.168.0.5"
  userName = "postgres"
  password = "admin"
 1 = {com.vladmihalcea.book.hpjp.hibernate.query.pivot.DataSourceConfiguration@4994} 
  serviceName = "Artemis"
  componentName = "dataSource"
  databaseName = "high_performance_java_persistence"
  url = "jdbc:oracle:thin:@192.169.0.6:1521/hpjp"
  serverName = "192.168.0.5"
  userName = "oracle"
  password = "admin"

Second, while the EAV model is more like a design smell than an Anti-Pattern, the aforementioned SQL query is surely an Anti-Pattern from a performance perspective.

We can do way better than this!

SQL PIVOT

Both Oracle and SQL Server support the PIVOT SQL clause, and so we can rewrite the previous query as follows:

SELECT * 
FROM ( 
   SELECT 
       p.service_name AS "serviceName", 
       p.component_name AS "componentName", 
       p.property_name , 
       p.property_value 
   FROM Property p 
   WHERE 
       p.component_name = :name
) 
PIVOT(
   MAX(property_value) 
   FOR property_name IN (
       'databaseName' AS "databaseName", 
       'url' AS "url", 
       'serverName' AS "serverName", 
       'username' AS "userName", 
       'password' AS "password") 
)

This time, the result looks much better:

dataSources = {java.util.ArrayList@4997}  size = 2
 0 = {com.vladmihalcea.book.hpjp.hibernate.query.pivot.DataSourceConfiguration@4998} 
  serviceName = "Apollo"
  componentName = "dataSource"
  databaseName = "high_performance_java_persistence"
  url = null
  serverName = "192.168.0.5"
  userName = "postgres"
  password = "admin"
 1 = {com.vladmihalcea.book.hpjp.hibernate.query.pivot.DataSourceConfiguration@5041} 
  serviceName = "Artemis"
  componentName = "dataSource"
  databaseName = "high_performance_java_persistence"
  url = "jdbc:oracle:thin:@192.169.0.6:1521/hpjp"
  serverName = null
  userName = "oracle"
  password = "admin"

Only the Artemis service has a url property defined. More, there’s no extra JOIN for each column that we need to transpose from a row value.

SQL CASE

If you’re not using Oracle or SQL Server, you can still transpose rows to columns using a CASE expression and a GROUP BY clause:

SELECT 
    p.service_name AS "serviceName", 
    p.component_name AS "componentName", 
    MAX(
        CASE WHEN property_name = 'databaseName' 
        THEN property_value END
    ) AS "databaseName", 
    MAX(
        CASE WHEN property_name = 'url' 
        THEN property_value END) 
    AS "url", 
    MAX(
        CASE WHEN property_name = 'serverName' 
        THEN property_value END
    ) AS "serverName", 
    MAX(
        CASE WHEN property_name = 'username' 
        THEN property_value END
    ) AS "userName", 
    MAX(
        CASE WHEN property_name = 'password' 
        THEN property_value END
    ) AS "password" 
FROM Property p 
WHERE 
   p.component_name = :name 
GROUP BY 
    p.service_name, 
    p.component_name

This query return the following result:

dataSources = {java.util.ArrayList@4992}  size = 2
 0 = {com.vladmihalcea.book.hpjp.hibernate.query.pivot.DataSourceConfiguration@4993} 
  serviceName = "Apollo"
  componentName = "dataSource"
  databaseName = "high_performance_java_persistence"
  url = null
  serverName = "192.168.0.5"
  userName = "postgres"
  password = "admin"
 1 = {com.vladmihalcea.book.hpjp.hibernate.query.pivot.DataSourceConfiguration@5177} 
  serviceName = "Artemis"
  componentName = "dataSource"
  databaseName = "high_performance_java_persistence"
  url = "jdbc:oracle:thin:@192.169.0.6:1521/hpjp"
  serverName = null
  userName = "oracle"
  password = "admin"
I'm running an online workshop on the 11th of October about High-Performance SQL.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

Pivoting tables is a very handy feature when working with reports, and there are multiple approaches to tackling this issue. Using PIVOT or CASE expressions is the right thing to do, while the JOIN approach is both suboptimal and may generate a wrong ResultSet.

For more about this topic, check out this article from Markus Winand.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.