Summary

This blog is a complete guide to pg_stat_statements, showing how to use its metrics to find slow queries and optimize PostgreSQL performance.

Introduction

Performance tuning in PostgreSQL becomes much easier when you have visibility into how your queries behave over time. One of the most powerful tools for this purpose is pg_stat_statements. It provides a clear picture of query execution patterns, resource consumption, and performance bottlenecks. This extension is part of the contrib module and is widely used by DBAs and application teams.

This blog explains what pg_stat_statements is, how it works, how to enable it, and how to use it in real-world scenarios. It also covers tuning parameters, internal mechanics, limitations, and advanced tips.

1. What is pg_stat_statements

pg_stat_statements is a PostgreSQL extension that tracks statistics for all SQL statements executed by the server. It normalizes queries by removing literal values and aggregates execution statistics for each unique normalized query.

It helps you answer questions such as:

  • Which queries are slow
  • Which queries are executed most frequently
  • Which queries consume the most CPU time
  • Which queries cause the most I/O
  • Which queries need indexing
  • How performance changes after a code release or schema change

2. How pg_stat_statements Works Internally

Understanding how it collects and aggregates data is crucial.

2.1 Query Normalization

Literal values are replaced with placeholders.

Example:

SELECT * FROM orders WHERE id = 101
SELECT * FROM orders WHERE id = 202

Both are stored as:

SELECT * FROM orders WHERE id = $1
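
You can confirm the aggregation by running the two statements above and then looking up the normalized entry. This sketch assumes an orders table exists and the extension is already installed:

SELECT queryid, query, calls
FROM pg_stat_statements
WHERE query LIKE 'SELECT * FROM orders%';

Both executions appear as a single row, grouped under one queryid, with calls incremented once per execution.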

2.2 Tracking Metrics

For every normalized query, the extension tracks metrics such as:

  • Total execution time
  • Average execution time
  • Number of calls
  • Min and max time
  • Rows returned or affected
  • Shared and local block hits or reads
  • Temp reads or writes
  • I/O time breakdown

2.3 Storage

Statistics are kept in shared memory while the server is running. If pg_stat_statements.save is on (the default), they are written to disk at a clean shutdown and reloaded at startup, so they survive normal restarts. They are discarded after a crash or an immediate shutdown.

3. How to Enable pg_stat_statements

This extension must be listed in shared_preload_libraries because it needs to be loaded at server start.

3.1 Modify postgresql.conf

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = all
pg_stat_statements.track_utility = off
pg_stat_statements.save = on
pg_stat_statements.track_planning = on

3.2 Restart PostgreSQL

You must restart the service because changing shared_preload_libraries requires a full server restart; a configuration reload is not enough.

3.3 Create the Extension

CREATE EXTENSION pg_stat_statements;

You can now query the pg_stat_statements view.
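
To verify the setup, check that the library is loaded and the view returns rows:

SHOW shared_preload_libraries;

SELECT count(*) FROM pg_stat_statements;

If shared_preload_libraries does not include pg_stat_statements, the view exists after CREATE EXTENSION but querying it raises an error saying the module must be loaded via shared_preload_libraries.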

4. Exploring pg_stat_statements Columns

Important columns include the following. Note that since PostgreSQL 13 the timing columns are named total_exec_time, mean_exec_time, min_exec_time, and max_exec_time; in earlier versions they were total_time, mean_time, min_time, and max_time.

  • query - Normalized statement text
  • calls - Number of times the query was executed
  • total_exec_time - Sum of execution time in milliseconds
  • mean_exec_time - Average time per call
  • min_exec_time and max_exec_time - Fastest and slowest execution
  • rows - Rows returned or affected
  • shared_blks_hit - Blocks found in shared buffers
  • shared_blks_read - Blocks read from disk
  • temp_blks_written - Temp file writes from sorts or hashes
  • local_blks_read and local_blks_hit - Local buffer activity (temporary tables)
  • wal_bytes - WAL data generated by the query (PostgreSQL 13 and later)

These metrics help pinpoint whether a query is slow due to CPU, I/O, or plan issues.
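
For example, the block counters can be combined into a per-query cache hit ratio, a quick way to separate CPU-bound from I/O-bound statements:

SELECT query,
       calls,
       shared_blks_hit,
       shared_blks_read,
       round(100.0 * shared_blks_hit /
             nullif(shared_blks_hit + shared_blks_read, 0), 2) AS hit_pct
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;

A low hit_pct on a frequently called query suggests the working set does not fit in shared buffers or that an index is missing.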

5. Real World Use Cases

5.1 Identify Slow Queries

To find your top slowest queries based on average execution time:

SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 5;

This helps detect queries that take long per call.

5.2 Identify Queries with High Total Time

These queries consume the most server resources overall. They may run frequently or be inherently slow.

SELECT query, total_exec_time, calls
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;

This gives you the highest impact queries to tune.

5.3 Find Queries Causing Heavy I/O

If your server has high disk activity, use this:

SELECT query,
       shared_blks_read AS disk_reads,
       shared_blks_hit,
       calls
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 5;

These queries often need improvements in indexing or caching.

5.4 Find Queries with High WAL Writes

Useful during performance degradation or when WAL volume spikes.

SELECT query, wal_bytes
FROM pg_stat_statements
ORDER BY wal_bytes DESC
LIMIT 5;

This helps identify heavy write operations such as large updates, deletes, or bulk inserts.

5.5 Detect N Plus One Query Issues

If the same simple query is executed an extremely large number of times, it may indicate an N+1 pattern, where the application issues one query per row instead of a single batched query.

SELECT query, calls
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 5;

If one simple query is called millions of times, it may require batching.

5.6 Compare Before and After Changes

After creating indexes or tuning configuration, reset stats:

SELECT pg_stat_statements_reset();

Rerun the workload and compare results. This is very useful for performance testing.
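
Since PostgreSQL 12, pg_stat_statements_reset() also accepts userid, dbid, and queryid arguments, so you can clear a single entry instead of everything. The queryid below is a placeholder; take the real value from the queryid column:

-- Reset statistics for one normalized query only (0 matches all for the other fields)
SELECT pg_stat_statements_reset(0, 0, 123456789);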

5.7 Real World Example from a Client Engagement

A recent engagement clearly demonstrated the value of pg_stat_statements during active performance troubleshooting. During the assessment, we generated a list of the most CPU-intensive queries using the following statement.

SELECT
  query,
  calls,
  rows,
  total_exec_time,
  mean_exec_time,
  to_char((total_exec_time / 1000) * interval '1 second', 'HH24:MI:SS') AS total_time_hms,
  to_char((mean_exec_time / 1000) * interval '1 second', 'MI:SS.MS') AS avg_time_mss,
  round((total_exec_time / sum(total_exec_time) OVER ())::numeric * 100, 2) || '% of total load' AS pct_load
FROM
  pg_stat_statements
WHERE
  query NOT LIKE 'BEGIN%' AND query NOT LIKE 'COMMIT%' AND query NOT LIKE 'ROLLBACK%'
ORDER BY
  total_exec_time DESC
LIMIT 15;

This helped us quickly identify several queries that were consuming a very high percentage of the total CPU load. Many of these statements were executed frequently and scanned large tables because the required indexes were missing. After reviewing the execution plans, we created the appropriate indexes and optimized a few filter conditions.

Once these changes were deployed, the improvement was significant. CPU usage dropped, overall execution time decreased, and the application became noticeably faster for end users.

This is a practical example of how pg_stat_statements provides clear, actionable insights and helps teams resolve real-world performance issues efficiently.

6. Advanced Features

6.1 Tracking Planning Time

With track_planning enabled, you get planning time in addition to execution time. This is extremely helpful when:

  • Your query takes long to plan
  • The planner misestimates cardinality
  • Statistics are outdated

The relevant columns are total_plan_time, mean_plan_time, min_plan_time, and max_plan_time, alongside the corresponding *_exec_time columns.
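
Queries with a high planning share can then be listed directly. This assumes track_planning is on and PostgreSQL 13 or later:

SELECT query,
       mean_plan_time,
       mean_exec_time,
       round((total_plan_time /
              nullif(total_plan_time + total_exec_time, 0))::numeric * 100, 2) AS plan_pct
FROM pg_stat_statements
ORDER BY total_plan_time DESC
LIMIT 5;

A large plan_pct on an otherwise fast query is a hint that prepared statements or generic plans could help.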

6.2 Track Utility Commands

Utility commands include CREATE TABLE, DROP, VACUUM, and COPY.

By default, track_utility is off. You can enable it to audit heavy operations.
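
Because track_utility can be changed without a restart, enabling it for a temporary audit window is straightforward:

ALTER SYSTEM SET pg_stat_statements.track_utility = on;
SELECT pg_reload_conf();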

6.3 Identifying High Memory Sorts

Look for temp_blks_read and temp_blks_written. These indicate disk sorts or hashes.

SELECT query, temp_blks_read, temp_blks_written
FROM pg_stat_statements
ORDER BY temp_blks_written DESC;

If temp blocks are very high, you may need to increase work_mem, rewrite the query, or adjust indexing.
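
Before raising work_mem globally, you can test its effect for a single session. The 64MB value and the orders query are only illustrative examples:

SET work_mem = '64MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders ORDER BY created_at;

If the plan switches from an external merge sort on disk to an in-memory quicksort, the temp block counters for that query should stop growing.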

7. When Not to Use pg_stat_statements

Although powerful, pg_stat_statements should be used with care in some situations.

  • Extremely high-volume OLTP systems may see measurable overhead
  • A very large pg_stat_statements.max increases shared memory usage
  • Because literals are removed, different parameter values are aggregated into one entry, which can hide value-specific slowness
  • In PostgreSQL versions before 9.4, query text is truncated to track_activity_query_size; newer versions store full query text in an external file

8. Tuning Parameters

8.1 pg_stat_statements.max

Defines the maximum number of distinct normalized statements tracked. When the limit is reached, the least frequently executed entries are evicted. Increase it if your workload has many unique queries.
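
On PostgreSQL 14 and later, you can check how often entries have been evicted, which tells you whether the limit is too low:

SELECT dealloc FROM pg_stat_statements_info;

A steadily growing dealloc count means pg_stat_statements.max is discarding entries and should be raised.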

8.2 pg_stat_statements.track

Options are top (the default, which tracks only top-level statements), all (which also tracks statements executed inside functions), and none. Use all for maximum visibility.

8.3 pg_stat_statements.save

Keeps stats across restarts. This is useful for long-term continuous monitoring.

9. Best Practices

  • Reset stats before running benchmarking tools such as pgbench
  • Periodically export results for trend analysis
  • Combine pg_stat_statements with analysis of autovacuum logs for deeper tuning
  • Do not keep pg_stat_statements.max too low
  • Use track_planning for diagnosing planner bottlenecks
  • Always analyze slow queries with EXPLAIN and EXPLAIN ANALYZE
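
As a final step in any investigation, confirm the findings with an actual plan. The orders query here is a hypothetical example:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

pg_stat_statements tells you which query to look at; EXPLAIN (ANALYZE, BUFFERS) tells you why it is slow.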

10. Conclusion

pg_stat_statements is one of the most valuable performance tuning tools in PostgreSQL. It allows DBAs and developers to identify problematic queries, compare performance changes, reduce I/O, and diagnose inefficiencies. By understanding its metrics and applying them to real-world use cases, you can optimize workloads and maintain high performance across environments.

For any questions, please contact us.

Get 24/7 PostgreSQL Support and Managed Services with Certified Experts from XTIVIA Virtual-DBA.