SQL Operation Order

Last modified:

Are you struggling with performance issues in your Spring, Jakarta EE, or Java EE application?

What if there were a tool that could automatically detect what caused performance issues in your JPA and Hibernate data access layer?

Wouldn’t it be awesome to have such a tool to watch your application and prevent performance issues during development, long before they affect production systems?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, Micronaut, or Play Framework.

So, rather than fixing performance issues in your production system on a Saturday night, you are better off using Hypersistence Optimizer to help you prevent those issues so that you can spend your time on the things that you love!

Introduction

In this article, we are going to see what is the standard SQL operation order.

Once you understand the order in which SQL operations are executed, then it will be clear why the Oracle legacy pagination query required a Derived Table in order to make sure that the ROWNUM pseudocolumn is computed after executing the ORDER BY clause.

SQL Operations

The SQL standard defines many operations, such as:

SELECT
FROM
JOIN
GROUP BY and HAVING
Window Functions (e.g., DENSE_RANK)
UNION and UNION ALL
ORDER BY
FETCH FIRST/NEXT ROWS ONLY

But since SQL is a declarative language, the order of operations is not the one we defined, but the one that follows a logical flow that’s similar to processing Streams of data using a functional programming approach.

SQL Operations Order

In order to see the order of operations in a given SQL query, you just need to check its associated Execution Plan.

For instance, let’s consider the following query that includes the vast majority of standard SQL Operations:

SELECT
    p.id AS post_id,
    p.title AS post_title,
    COUNT(pc.*) AS comment_count,
    row_number() over () idx
FROM post p
LEFT JOIN post_comment pc ON p.id = pc.post_id
WHERE p.title LIKE 'SQL%'
GROUP BY p.id, p.title
HAVING COUNT(pc.*) > 1
UNION ALL
SELECT
    p.id AS post_id,
    p.title AS post_title,
    COUNT(pc.*) AS comment_count,
    row_number() over () idx
FROM post p
LEFT JOIN post_comment pc ON p.id = pc.post_id
WHERE p.title LIKE 'JPA%'
GROUP BY p.id, p.title
ORDER BY idx
OFFSET 1 ROWS
FETCH NEXT 5 ROWS ONLY

When extracting the Execution Plan with EXPLAIN ANALYZE, we get the following list of operations that got executed:

Limit  
(cost=47.21..47.21 rows=1 width=536) 
(actual time=0.139..0.141 rows=5 loops=1)
  ->  Sort  
      (cost=47.21..47.21 rows=2 width=536) 
      (actual time=0.138..0.139 rows=6 loops=1)
        Sort Key: (row_number() OVER (?))
        Sort Method: quicksort  Memory: 25kB
        ->  Append  
           (cost=23.55..47.20 rows=2 width=536) 
           (actual time=0.075..0.131 rows=7 loops=1)
              ->  WindowAgg  
                  (cost=23.55..23.59 rows=1 width=536) 
                  (actual time=0.075..0.079 rows=2 loops=1)
                    ->  GroupAggregate  
                        (cost=23.55..23.57 rows=1 width=528) 
                        (actual time=0.069..0.072 rows=2 loops=1)
                          Group Key: p.id
                          Filter: (count(pc.*) > 1)
                          Rows Removed by Filter: 1
                          ->  Sort  
                              (cost=23.55..23.56 rows=1 width=1068) 
                              (actual time=0.065..0.066 rows=6 loops=1)
                                Sort Key: p.id
                                Sort Method: quicksort  Memory: 25kB
                                ->  Hash Right Join  
                                    (cost=11.76..23.54 rows=1 width=1068) 
                                    (actual time=0.052..0.059 rows=6 loops=1)
                                      Hash Cond: (pc.post_id = p.id)
                                      ->  Seq Scan on post_comment pc  
                                          (cost=0.00..11.40 rows=140 width=552) 
                                          (actual time=0.013..0.014 rows=6 loops=1)
                                      ->  Hash  
                                          (cost=11.75..11.75 rows=1 width=520) 
                                          (actual time=0.027..0.027 rows=3 loops=1)
                                            Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                            ->  Seq Scan on post p  
                                                (cost=0.00..11.75 rows=1 width=520) 
                                                (actual time=0.018..0.019 rows=3 loops=1)
                                                  Filter: ((title)::text ~~ 'SQL%'::text)
                                                  Rows Removed by Filter: 5
              ->  WindowAgg  
                  (cost=23.55..23.58 rows=1 width=536) 
                  (actual time=0.046..0.051 rows=5 loops=1)
                    ->  GroupAggregate  
                        (cost=23.55..23.57 rows=1 width=528) 
                        (actual time=0.045..0.047 rows=5 loops=1)
                          Group Key: p_1.id
                          ->  Sort  
                              (cost=23.55..23.56 rows=1 width=1068) 
                              (actual time=0.043..0.044 rows=5 loops=1)
                                Sort Key: p_1.id
                                Sort Method: quicksort  Memory: 25kB
                                ->  Hash Right Join  
                                    (cost=11.76..23.54 rows=1 width=1068) 
                                    (actual time=0.036..0.038 rows=5 loops=1)
                                      Hash Cond: (pc_1.post_id = p_1.id)
                                      ->  Seq Scan on post_comment pc_1  
                                          (cost=0.00..11.40 rows=140 width=552) 
                                          (actual time=0.010..0.011 rows=6 loops=1)
                                      ->  Hash  
                                          (cost=11.75..11.75 rows=1 width=520) 
                                          (actual time=0.015..0.015 rows=5 loops=1)
                                            Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                            ->  Seq Scan on post p_1  
                                                (cost=0.00..11.75 rows=1 width=520) 
                                                (actual time=0.013..0.014 rows=5 loops=1)
                                                  Filter: ((title)::text ~~ 'JPA%'::text)
                                                  Rows Removed by Filter: 3

To better visualize the flow of operations, we can use this online tool, and we will get the following SQL Operation diagram:

SQL Operation Order

So, based on this diagram, we can see that the SQL Operation Order:

FROM and JOIN
WHERE
GROUP BY
ROLLUP, CUBE, GROUPING SETS
HAVING
OVER (e.g., Window Functions)
SELECT
DISTINCT
UNION, INTERSECT, EXCEPT
ORDER BY
OFFSET
FETCH FIRST/NEXT ROWS ONLY, LIMIT, TOP

That’s it!

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

Knowing the SQL operation order not only makes it easier for you to write SQL queries, but you will also be better prepared when having to read the associated Execution Plan.

Follow @vlad_mihalcea