SQL CTE – Common Table Expression
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
In this article, we are going to see what an SQL CTE (Common Table Expression) is and how you can use it to reuse subquery result sets or inline views.
Let’s assume we have the following
post_comment tables that form a one-to-many table relationship:
post table contains the following entries:
| id | title | |----|----------| | 1 | SQL:2016 | | 2 | SQL:2011 | | 3 | SQL:2008 | | 4 | JPA 3.0 | | 5 | JPA 2.2 | | 6 | JPA 2.1 | | 7 | JPA 2.0 | | 8 | JPA 1.0 |
post_comment child table contains the following data:
| id | review | post_id | |----|------------------------|---------| | 1 | SQL:2016 is great! | 1 | | 2 | SQL:2016 is excellent! | 1 | | 3 | SQL:2016 is awesome! | 1 | | 4 | SQL:2011 is great! | 2 | | 5 | SQL:2011 is excellent! | 2 | | 6 | SQL:2008 is great! | 3 |
SQL CTE (Common Table Expression)
We now want to get the first two most-commented posts that start with the
SQL prefix along with all their associated comments.
We cannot use a simple Top-N SQL query that defines a
FETCH FIRST 2 ROWS ONLY clause because that would bring the first 2 records of the joined
post_comment result set, and that’s not what we want.
In order to fetch the desired result set, we need to:
- JOIN the
post_commentand filter by the
- Use a COUNT Window Function to determine how many comments are associated with each post
- Use the
DENSE_RANKWindow Function to rank the posts by their number of comments
- Extract just the first two most-commented posts
Since there are multiple steps involved, we could either use Derived Tables (Inline Views) or a CTE (Common Table Expression) query.
The CTE query uses the
WITH clause to define an alias to a given subquery. For this reason, we can reference the subquery in a subsequent
WITH subquery or in the outer query that builds the final result set.
Our CTE query looks as follows:
WITH p_pc AS ( SELECT p.id AS post_id, p.title AS post_title, pc.id AS comment_id, pc.review AS comment_review, COUNT(post_id) OVER(PARTITION BY post_id) AS comment_count FROM post p LEFT JOIN post_comment pc ON p.id = pc.post_id WHERE p.title LIKE 'SQL%' ), p_pc_r AS ( SELECT post_id, post_title, comment_id, comment_review, DENSE_RANK() OVER (ORDER BY p_pc.comment_count DESC) AS ranking FROM p_pc ) SELECT * FROM p_pc_r WHERE p_pc_r.ranking <= 2 ORDER BY post_id, comment_id
The first subquery is
p_pc and defines the JOIN and the WHERE clause that filters the
post records. The
p_pc query projection includes a
COUNT Window Function that provides the number of comments for each post.
The second subquery is
p_pc_r and, as you can see, it references the
p_pc subquery we have previously defined. The goal of the
p_pc_r subquery is to add a new column that ranks the posts by the number of associated comments. For this purpose, we have used the
DENSE_RANK Window Function.
The last query filters the
p_pc_r query result set and extracts just the top 2 most-commented posts along with their associated comments.
The reason why the WHERE clause in the final query cannot be moved to the
p_pc_r query is that the WHERE clause evaluates before Window Functions. It’s the same argument we’ve been using for writing legacy Top-N queries using Oracle.
And the CTE query result set looks like this:
| post_id | post_title | comment_id | comment_review | ranking | |---------|------------|------------|------------------------|---------| | 1 | SQL:2016 | 1 | SQL:2016 is great! | 1 | | 1 | SQL:2016 | 2 | SQL:2016 is excellent! | 1 | | 1 | SQL:2016 | 3 | SQL:2016 is awesome! | 1 | | 2 | SQL:2011 | 4 | SQL:2011 is great! | 2 | | 2 | SQL:2011 | 5 | SQL:2011 is excellent! | 2 |
SQL CTE (Common Table Expression) Gotcha
Not all relational database systems are able to push down predicates from the outer query to the WITH queries. For instance, in PostgreSQL, the Common Table Expression queries are optimization fences.
Since a CTE query can be translated to a Derived Table Or Inline View query, it’s a good idea to compare the Execution Plans of the CTE and the Derived Table queries and see which one performs best.
The SQL CTE (Common Table Expression) allows us to decompose a complex query into a series of smaller subqueries that can be referenced afterward.
The CTE (Common Table Expression)
WITH clause is supported by all top relational database systems starting from the following versions:
- Oracle 9i R2
- SQL Server 2005
- PostgreSQL 8.4
- MySQL 8.0.1
- MariaDB 10.2.1