SQL Derived Table or Inline View

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

In this article, we are going to see what an SQL Derived Table or Inline View is and how you can use it to bypass the standard SQL operation order.

While the SQL Standard uses the term Derived Table for this feature, some relational database systems call it Inline View.

Database tables

For this article, we are going to use the following post and post_comment tables that form a one-to-many table relationship:

SQL Derived Table or Inline View Tables

The parent post table contains the following rows:

| id | title    |
|----|----------|
| 1  | SQL:2016 |
| 2  | SQL:2011 |
| 3  | SQL:2008 |
| 4  | JPA 3.0  |
| 5  | JPA 2.2  |
| 6  | JPA 2.1  |
| 7  | JPA 2.0  |
| 8  | JPA 1.0  |

And the post_comment child table contains these records:

| id | review                 | post_id |
|----|------------------------|---------|
| 1  | SQL:2016 is great!     | 1       |
| 2  | SQL:2016 is excellent! | 1       |
| 3  | SQL:2016 is awesome!   | 1       |
| 4  | SQL:2011 is great!     | 2       |
| 5  | SQL:2011 is excellent! | 2       |
| 6  | SQL:2008 is great!     | 3       |

SQL Derived Table or Inline View

Let’s assume we want to get the first two most-commented posts that start with a given prefix (e.g., SQL) along with all their associated comments.

While the SQL standard provides the FETCH FIRST N ROWS ONLY clause to write a Top-N SQL query, that would not help us implement our use case because we need the first 2 post rows along with all their associated post_comment child records, no matter the size of the joined result set.

In order to fetch the expected result set, we would have to:

  • Filter the post records by the SQL prefix
  • JOIN the post and post_comment records and use the COUNT Window Function to check how many comments are associated with each post record
  • Rank the posts by their number of comments using the DENSE_RANK Window Function
  • Limit the result set to the first two most-commented posts

To write this query, we can use either a Derived Table (Inline Views) or a CTE (Common Table Expression).

The SQL Derived Table or Inline View is a subquery nested within a FROM clause of an outer query. The SQL syntax for a Derived Table query is basically equivalent to:

SELECT .. 
FROM ( 
  SELECT .. 
  FROM ..
) table_alias

For instance, in our case, we can generate our report using the following query that uses two nested Derived Tables:

SELECT *
FROM (
  SELECT
    post_id,
    post_title,
    comment_id,
    comment_review,
    DENSE_RANK() OVER (ORDER BY p_pc.comment_count DESC) AS ranking
  FROM (
    SELECT
      p.id AS post_id,
      p.title AS post_title,
      pc.id AS comment_id,
      pc.review AS comment_review,
      COUNT(post_id) OVER(PARTITION BY post_id) AS comment_count
    FROM post p
    LEFT JOIN post_comment pc ON p.id = pc.post_id
    WHERE p.title LIKE 'SQL%'
  ) p_pc
) p_pc_r
WHERE p_pc_r.ranking <= 2
ORDER BY post_id, comment_id

And the result set of the SQL Derived Table or Inline View query looks as follows:

| post_id | post_title | comment_id | comment_review         | ranking |
|---------|------------|------------|------------------------|---------|
| 1       | SQL:2016   | 1          | SQL:2016 is great!     | 1       |
| 1       | SQL:2016   | 2          | SQL:2016 is excellent! | 1       |
| 1       | SQL:2016   | 3          | SQL:2016 is awesome!   | 1       |
| 2       | SQL:2011   | 4          | SQL:2011 is great!     | 2       |
| 2       | SQL:2011   | 5          | SQL:2011 is excellent! | 2       |

The inner-most subquery, p_pc, builds the JOIN between the post and post_comment tables and filters the post records by the provided prefix. To calculate the number of comments for each post, we don’t want to use GROUP BY as that would reduce the result set to a single record per aggregation partition. Therefore, we will use the COUNT Window Function instead.

The second subquery is p_pc_r, and its FROM clause is the result set of the previous p_pc subquery. The goal of the p_pc_r subquery is to rank the posts by the number of associated comments, and, for this purpose, we use the DENSE_RANK Window Function.

The outer-most query filters the p_pc_r query result set and extracts the top 2 most-commented posts along with their associated comments.

The main advantage of a Derived Table or Inline View is that it allows us to override the default SQL operation order, which is:

  • FROM
  • JOIN
  • WHERE
  • GROUP BY
  • HAVING
  • OVER Window Functions
  • SELECT
  • ORDER BY
  • LIMIT

If you take a look at the default SQL operation order, you will see why the outer-most query WHERE clause cannot be pushed to the p_pc_r query because, if we did so, the SQL query would fail to compile since the WHERE clause would reference the DENSE_RANK Window Function column that hasn’t got the chance to be calculated.

If you’re using ORACLE, the legacy Top-N query uses a Derived Table to calculate the ROWCOUNT

I'm running an online workshop on the 20-21 and 23-24 of November about High-Performance Java Persistence.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

The SQL Derived Table or Inline View allows us to decompose a complex query into a series of smaller subqueries that can bypass the default SQL operation order.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.