SQL Derived Table or Inline View
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
In this article, we are going to see what an SQL Derived Table or Inline View is and how you can use it to bypass the standard SQL operation order.
For this article, we are going to use the following
post_comment tables that form a one-to-many table relationship:
post table contains the following rows:
| id | title | |----|----------| | 1 | SQL:2016 | | 2 | SQL:2011 | | 3 | SQL:2008 | | 4 | JPA 3.0 | | 5 | JPA 2.2 | | 6 | JPA 2.1 | | 7 | JPA 2.0 | | 8 | JPA 1.0 |
post_comment child table contains these records:
| id | review | post_id | |----|------------------------|---------| | 1 | SQL:2016 is great! | 1 | | 2 | SQL:2016 is excellent! | 1 | | 3 | SQL:2016 is awesome! | 1 | | 4 | SQL:2011 is great! | 2 | | 5 | SQL:2011 is excellent! | 2 | | 6 | SQL:2008 is great! | 3 |
SQL Derived Table or Inline View
Let’s assume we want to get the first two most-commented posts that start with a given prefix (e.g.,
SQL) along with all their associated comments.
While the SQL standard provides the
FETCH FIRST N ROWS ONLY clause to write a Top-N SQL query, that would not help us implement our use case because we need the first 2
post rows along with all their associated
post_comment child records, no matter the size of the joined result set.
In order to fetch the expected result set, we would have to:
- Filter the
postrecords by the
- JOIN the
post_commentrecords and use the COUNT Window Function to check how many comments are associated with each post record
- Rank the posts by their number of comments using the
- Limit the result set to the first two most-commented posts
To write this query, we can use either a Derived Table (Inline Views) or a CTE (Common Table Expression).
The SQL Derived Table or Inline View is a subquery nested within a FROM clause of an outer query. The SQL syntax for a Derived Table query is basically equivalent to:
SELECT .. FROM ( SELECT .. FROM .. ) table_alias
For instance, in our case, we can generate our report using the following query that uses two nested Derived Tables:
SELECT * FROM ( SELECT post_id, post_title, comment_id, comment_review, DENSE_RANK() OVER (ORDER BY p_pc.comment_count DESC) AS ranking FROM ( SELECT p.id AS post_id, p.title AS post_title, pc.id AS comment_id, pc.review AS comment_review, COUNT(post_id) OVER(PARTITION BY post_id) AS comment_count FROM post p LEFT JOIN post_comment pc ON p.id = pc.post_id WHERE p.title LIKE 'SQL%' ) p_pc ) p_pc_r WHERE p_pc_r.ranking <= 2 ORDER BY post_id, comment_id
And the result set of the SQL Derived Table or Inline View query looks as follows:
| post_id | post_title | comment_id | comment_review | ranking | |---------|------------|------------|------------------------|---------| | 1 | SQL:2016 | 1 | SQL:2016 is great! | 1 | | 1 | SQL:2016 | 2 | SQL:2016 is excellent! | 1 | | 1 | SQL:2016 | 3 | SQL:2016 is awesome! | 1 | | 2 | SQL:2011 | 4 | SQL:2011 is great! | 2 | | 2 | SQL:2011 | 5 | SQL:2011 is excellent! | 2 |
The inner-most subquery,
p_pc, builds the JOIN between the
post_comment tables and filters the
post records by the provided prefix. To calculate the number of comments for each post, we don’t want to use
GROUP BY as that would reduce the result set to a single record per aggregation partition. Therefore, we will use the
COUNT Window Function instead.
The second subquery is
p_pc_r, and its FROM clause is the result set of the previous
p_pc subquery. The goal of the
p_pc_r subquery is to rank the posts by the number of associated comments, and, for this purpose, we use the
DENSE_RANK Window Function.
The outer-most query filters the
p_pc_r query result set and extracts the top 2 most-commented posts along with their associated comments.
The main advantage of a Derived Table or Inline View is that it allows us to override the default SQL operation order, which is:
- GROUP BY
- OVER Window Functions
- ORDER BY
If you take a look at the default SQL operation order, you will see why the outer-most query WHERE clause cannot be pushed to the
p_pc_r query because, if we did so, the SQL query would fail to compile since the
WHERE clause would reference the
DENSE_RANK Window Function column that hasn’t got the chance to be calculated.
If you’re using ORACLE, the legacy Top-N query uses a Derived Table to calculate the ROWCOUNT
I'm running an online workshop on the 11th of October about High-Performance SQL.
The SQL Derived Table or Inline View allows us to decompose a complex query into a series of smaller subqueries that can bypass the default SQL operation order.