PostgreSQL Auto Explain
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In this article, we are going to see how the PostgreSQL Auto Explain feature works and why you should use it to gather the actual execution plan for SQL statements that execute on a production system.
SQL Execution Plan
As I explained in this article, when you send a SQL statement to PostgreSQL, the statement is executed as illustrated by the following diagram:
First, the text-based SQL statement is parsed into an Abstract Syntax Tree (e.g., Query Tree) that can be traversed programmatically by the database server.
Second, the Optimizer uses the Query Tree to generate an optimal Execution Plan that consumes the least amount of resources to generate the desired outcome.
Third, the Executor runs the Execution Plan, and the output is returned to the client as the query Result Set.
Estimated vs. Actual Execution Plan
When you are using the EXPLAIN
command. PostgreSQL only returns the Estimated Execution Plan, the one that the Optimizer thinks that’s the most efficient for the provided SQL statement. However, the statement is not really executed when you run the EXPLAIN
command.
On the other hand, if we run EXPLAIN ANALYZE
, PostgreSQL runs the statement, so we will get the Actual Execution Plan, which also contains the timing information for each operation in the execution plan.
When investigating a slow query on a production system, there are several challenges that we may face.
First, for security reasons, we may not be allowed to run queries on the production system, so, in that case, we cannot simply run the EXPLAIN ANALYZE
command to get the Actual Execution Plan.
Second, even if we have the privilege to run the EXPLAIN ANALYZE
command, we may observe a different plan than the one the client complained about. This may be due to several reasons.
For instance, PostgreSQL has a prepareThreshold
setting, which has a default value of 5
. This value tells PostgreSQL how many times it can emulate a Prepared Statement on the client side before switching to a server-side Prepared Statement that employs a generic plan.
If the slow query was using a generic plan, you might not get the same plan even if you run EXPLAIN ANALYZE
, which generates an execution plan on the fly.
So, a much better solution to analyzing slow queries is if we get the Actual Execution Plan that was used by PostgreSQL when running the query in question.
PostgreSQL Auto Explain module
PostgreSQL is very customizable, and it provides several extensions that we can activate explicitly.
One such extension is auto_explain
, which allows us to capture the Actual Execution Plan of an SQL query that took longer than a pre-defined threshold value.
If you want to activate the auto_explain
extension on a PostgreSQL database server, then you can enable it in the postgresql.conf
configuration file via the session_preload_libraries
setting:
session_preload_libraries = 'auto_explain'
Afterward, you can define the slow query threshold. For instance, if we want to capture the Actual Execution Plan of all queries that take more than 100
milliseconds, then we need to provide the following PostgreSQL setting:
auto_explain.log_analyze = 'on' auto_explain.log_min_duration = '100ms'
The PostgreSQL settings can also be set at runtime using the
SET
command, in which case it will be applied only to the statements executed using the database connection on which we executed theSET
command.
After we enable the PostgreSQL Auto Explain feature and we run some SQL queries that take more than 100
milliseconds, we will get the following entries into the PostgreSQL log:
LOG: duration: 216.952 ms plan: Query Text: SELECT university_name FROM scholarship_criteria WHERE admission_score <= ANY ( SELECT AVG(student_grade.grade) FROM student_grade GROUP BY student_id ) ORDER BY id Nested Loop Semi Join (cost=0.55..20075.80 rows=1 width=17) (actual time=216.944..216.946 rows=1 loops=1) Join Filter: (scholarship_criteria.admission_score <= (avg(student_grade.grade))) Rows Removed by Join Filter: 121266 -> Index Scan using scholarship_criteria_pkey on scholarship_criteria (cost=0.13..12.16 rows=2 width=25) (actual time=0.038..0.039 rows=2 loops=1) -> GroupAggregate (cost=0.42..17798.42 rows=100640 width=16) (actual time=0.046..104.982 rows=60634 loops=2) Group Key: student_grade.student_id -> Index Scan using idx_student_grade_student_id on student_grade (cost=0.42..14040.42 rows=500000 width=16) (actual time=0.040..51.788 rows=303168 loops=2)
Awesome, right?
If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
The PostgreSQL Auto Explain extension is a very useful feature when inspecting slow queries on a production system.
Not only that it can indicate which queries took longer than expected, but it allows us to capture the Actual Execution Plan that PostgreSQL used when running the query in question.
