SQL Operation Order

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

In this article, we are going to see what is the standard SQL operation order.

Once you understand the order in which SQL operations are executed, then it will be clear why the Oracle legacy pagination query required a Derived Table in order to make sure that the ROWNUM pseudocolumn is computed after executing the ORDER BY clause.

SQL Operations

The SQL standard defines many operations, such as:

  • SELECT
  • FROM
  • JOIN
  • GROUP BY and HAVING
  • Window Functions (e.g., DENSE_RANK)
  • UNION and UNION ALL
  • ORDER BY
  • FETCH FIRST/NEXT ROWS ONLY

But since SQL is a declarative language, the order of operations is not the one we defined, but the one that follows a logical flow that’s similar to processing Streams of data using a functional programming approach.

SQL Operations Order

In order to see the order of operations in a given SQL query, you just need to check its associated Execution Plan.

For instance, let’s consider the following query that includes the vast majority of standard SQL Operations:

SELECT
    p.id AS post_id,
    p.title AS post_title,
    COUNT(pc.*) AS comment_count,
    row_number() over () idx
FROM post p
LEFT JOIN post_comment pc ON p.id = pc.post_id
WHERE p.title LIKE 'SQL%'
GROUP BY p.id, p.title
HAVING COUNT(pc.*) > 1
UNION ALL
SELECT
    p.id AS post_id,
    p.title AS post_title,
    COUNT(pc.*) AS comment_count,
    row_number() over () idx
FROM post p
LEFT JOIN post_comment pc ON p.id = pc.post_id
WHERE p.title LIKE 'JPA%'
GROUP BY p.id, p.title
ORDER BY idx
OFFSET 1 ROWS
FETCH NEXT 5 ROWS ONLY

When extracting the Execution Plan with EXPLAIN ANALYZE, we get the following list of operations that got executed:

Limit  
(cost=47.21..47.21 rows=1 width=536) 
(actual time=0.139..0.141 rows=5 loops=1)
  ->  Sort  
      (cost=47.21..47.21 rows=2 width=536) 
      (actual time=0.138..0.139 rows=6 loops=1)
        Sort Key: (row_number() OVER (?))
        Sort Method: quicksort  Memory: 25kB
        ->  Append  
           (cost=23.55..47.20 rows=2 width=536) 
           (actual time=0.075..0.131 rows=7 loops=1)
              ->  WindowAgg  
                  (cost=23.55..23.59 rows=1 width=536) 
                  (actual time=0.075..0.079 rows=2 loops=1)
                    ->  GroupAggregate  
                        (cost=23.55..23.57 rows=1 width=528) 
                        (actual time=0.069..0.072 rows=2 loops=1)
                          Group Key: p.id
                          Filter: (count(pc.*) > 1)
                          Rows Removed by Filter: 1
                          ->  Sort  
                              (cost=23.55..23.56 rows=1 width=1068) 
                              (actual time=0.065..0.066 rows=6 loops=1)
                                Sort Key: p.id
                                Sort Method: quicksort  Memory: 25kB
                                ->  Hash Right Join  
                                    (cost=11.76..23.54 rows=1 width=1068) 
                                    (actual time=0.052..0.059 rows=6 loops=1)
                                      Hash Cond: (pc.post_id = p.id)
                                      ->  Seq Scan on post_comment pc  
                                          (cost=0.00..11.40 rows=140 width=552) 
                                          (actual time=0.013..0.014 rows=6 loops=1)
                                      ->  Hash  
                                          (cost=11.75..11.75 rows=1 width=520) 
                                          (actual time=0.027..0.027 rows=3 loops=1)
                                            Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                            ->  Seq Scan on post p  
                                                (cost=0.00..11.75 rows=1 width=520) 
                                                (actual time=0.018..0.019 rows=3 loops=1)
                                                  Filter: ((title)::text ~~ 'SQL%'::text)
                                                  Rows Removed by Filter: 5
              ->  WindowAgg  
                  (cost=23.55..23.58 rows=1 width=536) 
                  (actual time=0.046..0.051 rows=5 loops=1)
                    ->  GroupAggregate  
                        (cost=23.55..23.57 rows=1 width=528) 
                        (actual time=0.045..0.047 rows=5 loops=1)
                          Group Key: p_1.id
                          ->  Sort  
                              (cost=23.55..23.56 rows=1 width=1068) 
                              (actual time=0.043..0.044 rows=5 loops=1)
                                Sort Key: p_1.id
                                Sort Method: quicksort  Memory: 25kB
                                ->  Hash Right Join  
                                    (cost=11.76..23.54 rows=1 width=1068) 
                                    (actual time=0.036..0.038 rows=5 loops=1)
                                      Hash Cond: (pc_1.post_id = p_1.id)
                                      ->  Seq Scan on post_comment pc_1  
                                          (cost=0.00..11.40 rows=140 width=552) 
                                          (actual time=0.010..0.011 rows=6 loops=1)
                                      ->  Hash  
                                          (cost=11.75..11.75 rows=1 width=520) 
                                          (actual time=0.015..0.015 rows=5 loops=1)
                                            Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                            ->  Seq Scan on post p_1  
                                                (cost=0.00..11.75 rows=1 width=520) 
                                                (actual time=0.013..0.014 rows=5 loops=1)
                                                  Filter: ((title)::text ~~ 'JPA%'::text)
                                                  Rows Removed by Filter: 3

To better visualize the flow of operations, we can use this online tool, and we will get the following SQL Operation diagram:

SQL Operation Order

So, based on this diagram, we can see that the SQL Operation Order:

  • FROM and JOIN
  • WHERE
  • GROUP BY
  • ROLLUP, CUBE, GROUPING SETS
  • HAVING
  • OVER (e.g., Window Functions)
  • SELECT
  • DISTINCT
  • UNION, INTERSECT, EXCEPT
  • ORDER BY
  • OFFSET
  • FETCH FIRST/NEXT ROWS ONLY, LIMIT, TOP

That’s it!

I'm running an online workshop on the 20-21 and 23-24 of November about High-Performance Java Persistence.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

Knowing the SQL operation order not only makes it easier for you to write SQL queries, but you will also be better prepared when having to read the associated Execution Plan.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.