How to Monitor a PostgreSQL Cluster: Metrics, Alerts, and Dashboards
Learn how to monitor a PostgreSQL cluster in production. Covers key metrics, Prometheus and postgres_exporter setup, Grafana dashboards, and alerting for replication lag, connections, and disk usage.
A PostgreSQL cluster that you cannot observe is a cluster that will surprise you. High availability with Patroni gives you automatic failover, but failover is a last resort -- it means something already went wrong. Monitoring lets you catch degradation before it becomes an outage: replication lag creeping up, connections approaching limits, disks filling, or queries slowing down.
This guide covers the metrics that matter for a production PostgreSQL cluster, how to collect them with postgres_exporter and Prometheus, how to visualize them in Grafana, and what alerting rules to set so you get paged before your users notice a problem.
Why Monitoring Matters for HA PostgreSQL Clusters
A standalone PostgreSQL instance is relatively simple to monitor. You watch CPU, disk, and connections. With a Patroni-based HA cluster, the surface area expands significantly:
- Multiple nodes -- each PostgreSQL instance has its own metrics. A problem on one replica (disk full, high I/O wait) can cascade if that replica gets promoted during failover.
- Replication health -- streaming replication lag determines how much data you could lose during an unplanned failover. If a replica falls behind by minutes, promoting it means minutes of data loss.
- Patroni state -- leader elections, failover events, and cluster membership changes all need visibility. A Patroni node stuck in a restart loop or an etcd quorum loss can silently degrade your cluster's fault tolerance.
- Connection pooling -- if PgBouncer is in the stack, you need metrics on pool saturation, wait times, and client queuing separately from PostgreSQL's own connection metrics.
Without monitoring, you learn about these problems from your application's error logs or, worse, from users reporting failures. With monitoring, you learn about them from a dashboard or a PagerDuty alert at 2 AM -- before users are affected.
Key PostgreSQL Metrics to Track
Not every metric PostgreSQL exposes is worth alerting on. The following table covers the metrics that matter most for production clusters, what they mean, and what values should concern you.
| Metric | Source | What it tells you | Warning threshold |
|---|---|---|---|
| Active connections | pg_stat_activity | Number of connections currently executing queries | > 80% of max_connections |
| Idle connections | pg_stat_activity | Connections open but not doing work; wastes resources | > 50% of max_connections |
| Replication lag (bytes) | pg_stat_replication | How far behind each replica is from the primary | > 10 MB |
| Replication lag (seconds) | pg_stat_replication | Time-based estimate of replica delay | > 30 seconds |
| Transaction rate (commits/s) | pg_stat_database | Throughput indicator; sudden drops signal problems | Baseline-dependent |
| Rollback rate | pg_stat_database | High rollback rates indicate application errors or lock contention | > 5% of total transactions |
| Cache hit ratio | pg_stat_database | Percentage of reads served from shared_buffers vs disk | < 95% |
| Disk usage | Node exporter | Free space on the PostgreSQL data volume | < 20% free |
| Dead tuples | pg_stat_user_tables | Rows deleted/updated but not yet vacuumed; bloat indicator | Growing consistently |
| Slow queries | pg_stat_statements | Queries exceeding a duration threshold | Application-dependent |
| Lock waits | pg_stat_activity | Queries blocked waiting for locks | Any sustained lock waits |
| WAL generation rate | pg_stat_wal | Rate of write-ahead log production; affects disk I/O and replication | Baseline-dependent |
Connections
PostgreSQL handles each connection as a separate OS process, and each process consumes roughly 5-10 MB of memory. When connections approach max_connections, new connection attempts are refused, and the per-process overhead degrades query performance well before that limit.
Track both active and idle connections. A large number of idle connections suggests missing or misconfigured connection pooling. Connection counts climbing without corresponding traffic increases indicate leaks in your application.
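To make the utilization math concrete, here is a minimal Python sketch. The sample counts are hypothetical, standing in for a query against pg_stat_activity and the max_connections setting:

```python
# Sketch: compute connection utilization from sampled counts.
# The numbers below are hypothetical; in practice they come from
# pg_stat_activity (grouped by state) and max_connections.

def connection_utilization(counts_by_state: dict, max_connections: int) -> float:
    """Return total connections as a fraction of max_connections."""
    total = sum(counts_by_state.values())
    return total / max_connections

sample = {"active": 52, "idle": 110, "idle in transaction": 8}
util = connection_utilization(sample, max_connections=200)
print(f"utilization: {util:.0%}")
if util > 0.8:
    print("warning: connections above 80% of max_connections")
```

This mirrors the 80% warning threshold from the table above; the same ratio is what the PromQL connection alerts later in this guide express.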
Replication Lag
Replication lag is arguably the most important metric for an HA cluster. In a Patroni setup with asynchronous replication (the default), replicas stream WAL from the primary but can fall behind under heavy write load, network congestion, or slow I/O on the replica.
The consequences of high replication lag:
- Data loss on failover -- Patroni's maximum_lag_on_failover setting (default 1 MB) prevents promoting a replica that is too far behind. If all replicas exceed this threshold, automatic failover is blocked entirely.
- Stale reads -- applications reading from replicas through HAProxy's reader endpoint see outdated data.
- Backup gaps -- replicas used for pgBackRest backups produce backups that are behind the primary.
Monitor both byte-based lag (pg_wal_lsn_diff between the primary's current WAL position and each replica's replay_lsn) and time-based lag (pg_stat_replication.replay_lag). Byte lag tells you volume; time lag tells you impact.
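To illustrate what the byte-based calculation does, here is a small Python sketch that converts textual LSNs (the `high/low` hex format PostgreSQL prints) into absolute byte positions and subtracts them, mirroring what pg_wal_lsn_diff() computes server-side. The LSN values are made up:

```python
# Sketch: byte-based replication lag from two WAL LSNs.
# An LSN like '16/B374D848' is a 64-bit position: high 32 bits before
# the slash, low 32 bits after, both hexadecimal.

def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual LSN to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)

primary_lsn = "16/B374D848"   # e.g. pg_current_wal_lsn() on the primary
replica_lsn = "16/B3740000"   # e.g. replay_lsn from pg_stat_replication
lag_bytes = lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn)
print(f"replication lag: {lag_bytes} bytes")  # 55368 bytes
```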
Cache Hit Ratio
PostgreSQL's shared_buffers caches frequently accessed data pages in memory. Production databases should maintain a cache hit ratio above 95%. Below 90%, a significant portion of reads hit disk, which is orders of magnitude slower.
Common causes of a dropping hit ratio: shared_buffers too small for the working set, new query patterns scanning large tables, or VACUUM not running frequently enough causing bloat.
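One subtlety worth showing: blks_hit and blks_read in pg_stat_database are cumulative counters, so a meaningful hit ratio compares deltas between two samples rather than lifetime totals (the same reason PromQL uses rate()). A minimal sketch with hypothetical counter values:

```python
# Sketch: cache hit ratio over a sampling window. Subtract two samples
# of the cumulative blks_hit / blks_read counters taken some interval
# apart; lifetime totals hide recent regressions.

def hit_ratio(hit_prev, hit_now, read_prev, read_now):
    hits = hit_now - hit_prev
    reads = read_now - read_prev
    if hits + reads == 0:
        return None  # no block activity in the window
    return hits / (hits + reads)

ratio = hit_ratio(hit_prev=1_000_000, hit_now=1_098_000,
                  read_prev=50_000, read_now=52_000)
print(f"cache hit ratio: {ratio:.1%}")  # 98000 / 100000 = 98.0%
```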
Transaction Rates and Rollbacks
Commits per second and rollbacks per second together give you a throughput and health signal. A sudden drop in commit rate without a corresponding drop in traffic means queries are taking longer (check slow queries and lock waits). A spike in rollback rate indicates application errors, deadlocks, or serialization failures.
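The 5% rollback threshold from the table can be computed the same way, from deltas of the xact_commit and xact_rollback counters over a sampling interval. The deltas below are hypothetical:

```python
# Sketch: rollback fraction from pg_stat_database counter deltas.
# xact_commit and xact_rollback are cumulative, so use per-interval
# deltas (the equivalent of PromQL rate()).

def rollback_fraction(commits_delta: int, rollbacks_delta: int) -> float:
    total = commits_delta + rollbacks_delta
    return rollbacks_delta / total if total else 0.0

frac = rollback_fraction(commits_delta=9_400, rollbacks_delta=600)
print(f"rollback rate: {frac:.1%}")  # 600 / 10000 = 6.0%
if frac > 0.05:
    print("warning: rollback rate above 5%")
```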
Disk Usage and WAL
Running out of disk space is catastrophic -- PostgreSQL crashes and may not restart cleanly. Monitor disk usage at the OS level via node_exporter and alert at 80% utilization, with a critical alert at 90%.
WAL generation rate affects both disk consumption and replication bandwidth. Under heavy write load, WAL files can accumulate faster than they are archived, filling the pg_wal directory. Monitor WAL directory size separately from data directory size.
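As a minimal local stand-in for the node_exporter-based checks, the same 80%/90% thresholds can be sketched with Python's standard library. The path is an assumption; point it at your PostgreSQL data volume:

```python
# Sketch: local disk-usage check mirroring the 80% warning / 90% critical
# thresholds. In production this signal comes from node_exporter's
# node_filesystem_* metrics scraped by Prometheus, not a local script.
import shutil

def disk_usage_fraction(path: str) -> float:
    """Fraction of the filesystem containing `path` that is used."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total

used = disk_usage_fraction("/")  # assumption: data volume mounted at /
if used > 0.90:
    print(f"CRITICAL: {used:.0%} used")
elif used > 0.80:
    print(f"WARNING: {used:.0%} used")
else:
    print(f"OK: {used:.0%} used")
```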
Setting Up postgres_exporter with Prometheus
postgres_exporter is the standard Prometheus exporter for PostgreSQL metrics. It connects to PostgreSQL, runs queries against the statistics views (pg_stat_activity, pg_stat_replication, pg_stat_database, pg_stat_statements, etc.), and exposes the results as Prometheus metrics on port 9187.
Installation
Run postgres_exporter as a Docker container alongside your PostgreSQL instance. It needs a connection string to the PostgreSQL server:
docker run -d \
--name postgres-exporter \
--restart unless-stopped \
-p 9187:9187 \
-e DATA_SOURCE_NAME="postgresql://postgres:<password>@localhost:5432/postgres?sslmode=disable" \
prometheuscommunity/postgres-exporter:v0.17.0 \
--collector.stat_statements
The --collector.stat_statements flag enables collection from pg_stat_statements, which requires the extension to be loaded in PostgreSQL. Ensure your PostgreSQL configuration includes:
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 5000
If you deployed your PostgreSQL cluster with sshploy's Patroni recipe, pg_stat_statements is already enabled and postgres_exporter is deployed automatically on every node.
Prometheus Configuration
Add a scrape job to your prometheus.yml targeting each postgres_exporter instance:
scrape_configs:
  - job_name: 'postgres-exporter'
    scrape_interval: 15s
    static_configs:
      - targets:
          - '192.168.1.1:9187'
          - '192.168.1.2:9187'
          - '192.168.1.3:9187'
After restarting Prometheus, verify the targets appear as "UP" under Status > Targets. If a target shows "DOWN", check that the postgres_exporter container is running and that port 9187 is reachable from Prometheus.
Key Exported Metrics
Once scraped, the following Prometheus metrics become available:
- pg_stat_activity_count -- connections grouped by state (active, idle, idle in transaction)
- pg_stat_replication_pg_wal_lsn_diff -- replication lag in bytes per replica
- pg_stat_database_xact_commit -- total committed transactions (use rate() for commits/sec)
- pg_stat_database_xact_rollback -- total rolled-back transactions
- pg_stat_database_blks_hit and pg_stat_database_blks_read -- cache hit ratio components
- pg_stat_database_deadlocks -- deadlock count
- pg_stat_user_tables_n_dead_tup -- dead tuples per table
- pg_stat_statements_mean_time_seconds -- average execution time per query
Grafana Dashboard Setup
Grafana provides the visualization layer for your PostgreSQL metrics. A well-built dashboard gives you at-a-glance cluster health and drill-down capability for debugging.
Data Source Configuration
Add Prometheus as a data source in Grafana:
- Navigate to Configuration > Data Sources > Add data source
- Select Prometheus
- Set the URL to your Prometheus instance (e.g., http://192.168.1.10:9090)
- Click Save & Test to verify connectivity
Dashboard Panels
Build your PostgreSQL monitoring dashboard with these panels:
Cluster overview row:
- PostgreSQL version (stat panel)
- Current primary node (stat panel using Patroni metrics or a label query)
- Total active connections across all nodes (stat panel)
- Cluster uptime (stat panel)
Connections row:
- Active connections per node (time series, grouped by instance)
- Idle connections per node (time series)
- Connection utilization percentage: sum by (instance) (pg_stat_activity_count) / on (instance) pg_settings_max_connections * 100
Replication row:
- Replication lag in bytes per replica (time series)
- Replication lag in seconds per replica (time series)
- Replica state (stat panel: streaming, catchup, or disconnected)
Performance row:
- Transactions per second: rate(pg_stat_database_xact_commit[5m])
- Rollbacks per second: rate(pg_stat_database_xact_rollback[5m])
- Cache hit ratio: rate(pg_stat_database_blks_hit[5m]) / (rate(pg_stat_database_blks_hit[5m]) + rate(pg_stat_database_blks_read[5m])) * 100 (rate-based, because blks_hit and blks_read are cumulative counters)
- Deadlocks per second: rate(pg_stat_database_deadlocks[5m])
Storage row:
- Database size per instance (gauge)
- Dead tuples by table (table panel, sorted descending)
- Disk usage from node_exporter (time series)
You can also import community dashboards. The PostgreSQL Database dashboard (ID 9628) and the postgres_exporter dashboard (ID 455) are solid starting points that you can customize.
Alerting Rules
Dashboards are for humans looking at screens. Alerts are for when nobody is looking. Define Prometheus alerting rules for conditions that require immediate attention.
Create an alerting rules file (postgres_alerts.yml) and reference it in your Prometheus configuration:
rule_files:
  - 'postgres_alerts.yml'
Prometheus Alerting Rules
groups:
  - name: postgresql_alerts
    rules:
      - alert: PostgresqlReplicationLagHigh
        expr: pg_stat_replication_pg_wal_lsn_diff > 10485760
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL replication lag above 10 MB"
          description: "Replica {{ $labels.instance }} has {{ $value | humanize1024 }}B replication lag for more than 5 minutes."
      - alert: PostgresqlReplicationLagCritical
        expr: pg_stat_replication_pg_wal_lsn_diff > 104857600
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "PostgreSQL replication lag above 100 MB"
          description: "Replica {{ $labels.instance }} has {{ $value | humanize1024 }}B replication lag. Failover may be blocked if lag exceeds maximum_lag_on_failover."
      - alert: PostgresqlConnectionsHigh
        expr: sum by (instance) (pg_stat_activity_count) > on (instance) (pg_settings_max_connections * 0.8)
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL connections above 80% of max"
          description: "Instance {{ $labels.instance }} has {{ $value }} connections, above 80% of max_connections."
      - alert: PostgresqlConnectionsCritical
        expr: sum by (instance) (pg_stat_activity_count) > on (instance) (pg_settings_max_connections * 0.95)
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "PostgreSQL connections above 95% of max"
          description: "Instance {{ $labels.instance }} is nearly at max_connections. New connections will be refused."
      - alert: PostgresqlCacheHitRatioLow
        expr: |
          (
            rate(pg_stat_database_blks_hit{datname!~"template.*"}[5m])
            /
            (rate(pg_stat_database_blks_hit{datname!~"template.*"}[5m]) + rate(pg_stat_database_blks_read{datname!~"template.*"}[5m]))
          ) < 0.90
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL cache hit ratio below 90%"
          description: "Database {{ $labels.datname }} on {{ $labels.instance }} has a cache hit ratio of {{ $value | humanizePercentage }}. Consider increasing shared_buffers or investigating query patterns."
      - alert: PostgresqlDeadlocks
        expr: rate(pg_stat_database_deadlocks[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL deadlocks detected"
          description: "Database {{ $labels.datname }} on {{ $labels.instance }} is experiencing {{ $value }} deadlocks/sec."
      - alert: PostgresqlHighRollbackRate
        expr: |
          rate(pg_stat_database_xact_rollback[5m])
          /
          (rate(pg_stat_database_xact_commit[5m]) + rate(pg_stat_database_xact_rollback[5m]))
          > 0.05
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL rollback rate above 5%"
          description: "Database {{ $labels.datname }} on {{ $labels.instance }} has a rollback rate of {{ $value | humanizePercentage }}."
      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) < 0.20
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Disk space below 20%"
          description: "Instance {{ $labels.instance }} has {{ $value | humanizePercentage }} disk space remaining."
      - alert: DiskSpaceCritical
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) < 0.10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Disk space below 10%"
          description: "Instance {{ $labels.instance }} has {{ $value | humanizePercentage }} disk space remaining. PostgreSQL may crash if disk fills completely."
      - alert: PostgresqlExporterDown
        expr: up{job="postgres-exporter"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "postgres_exporter is down"
          description: "Cannot scrape PostgreSQL metrics from {{ $labels.instance }}. The exporter may have crashed or PostgreSQL may be unreachable."
Alert Routing
Connect Prometheus alerts to a notification channel using Alertmanager. Configure routes for different severity levels:
- critical -- PagerDuty, Opsgenie, or phone call. These are wake-someone-up-at-3-AM alerts.
- warning -- Slack channel or email. These need attention during business hours but are not immediately dangerous.
Keep alerting rules manageable. Ten well-tuned alerts are better than fifty noisy ones. Every alert should have a clear action -- if you cannot articulate what to do when it fires, it should be a dashboard panel, not an alert.
Patroni-Specific Monitoring
Beyond PostgreSQL metrics, a Patroni cluster has its own operational state that needs monitoring.
Patroni REST API
Patroni exposes a REST API on port 8008 that reports cluster state. Key endpoints:
- GET /primary -- returns 200 if the node is the current primary, 503 otherwise
- GET /replica -- returns 200 if the node is a healthy replica, 503 otherwise
- GET /cluster -- returns full cluster state as JSON, including all members, their roles, lag, and timeline
You can scrape the Patroni REST API directly or use a blackbox exporter to probe the endpoints. A simple approach is a custom Prometheus metric from a script that queries /cluster and exposes:
- patroni_cluster_leader -- which node is the current leader
- patroni_cluster_members -- number of healthy members
- patroni_node_role -- role of each node (primary, replica, or unknown)
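A minimal sketch of such a script is below. The JSON is a hand-written example of the /cluster response shape (in practice you would fetch it from http://&lt;node&gt;:8008/cluster), and the metric names are simplified variants of those listed above:

```python
# Sketch: turn a Patroni GET /cluster response into Prometheus
# text-format metrics. The embedded JSON is a hypothetical example of
# the response shape; fetch the real one from the Patroni REST API.
import json

sample_response = json.loads("""
{"members": [
  {"name": "node1", "role": "leader",  "state": "running", "lag": 0},
  {"name": "node2", "role": "replica", "state": "running", "lag": 1024},
  {"name": "node3", "role": "replica", "state": "running", "lag": 0}
]}
""")

def to_prometheus(cluster: dict) -> str:
    lines = []
    healthy = sum(1 for m in cluster["members"] if m["state"] == "running")
    lines.append(f"patroni_cluster_members {healthy}")
    for m in cluster["members"]:
        is_leader = 1 if m["role"] == "leader" else 0
        lines.append(f'patroni_node_is_leader{{member="{m["name"]}"}} {is_leader}')
    return "\n".join(lines)

print(to_prometheus(sample_response))
```

Serve this output over HTTP (or write it to a textfile-collector directory for node_exporter) and Prometheus can scrape it like any other target.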
Failover Events
Monitor Patroni logs for failover events. Key log patterns to watch for:
- "promoted self to leader" -- a replica was promoted
- "demoted self" -- a primary stepped down
- "Lock owner changed" -- leadership changed in etcd
- "reinitialize" -- a node is rebuilding from scratch (data divergence detected)
Forward Patroni container logs to a log aggregation system (Loki, Elasticsearch, or Datadog Logs) and set up alerts on these patterns. A failover event is not necessarily a problem, but an unexpected one needs investigation.
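If your log pipeline supports regex rules, the patterns above translate directly. As a minimal illustration (the sample log lines are hypothetical), a Python sketch of the same matching logic:

```python
# Sketch: scan Patroni log lines for failover-related patterns.
# In practice this matching happens in your log pipeline (e.g. a Loki
# ruler rule); the sample lines below are made up.
import re

FAILOVER_PATTERNS = [
    r"promoted self to leader",
    r"demoted self",
    r"Lock owner changed",
    r"reinitialize",
]
pattern = re.compile("|".join(FAILOVER_PATTERNS))

def failover_events(lines):
    """Return log lines matching any failover-related pattern."""
    return [line for line in lines if pattern.search(line)]

sample_logs = [
    "2025-01-10 02:14:07 INFO: no action. I am (node2), a secondary",
    "2025-01-10 02:14:37 INFO: promoted self to leader by acquiring session lock",
]
print(failover_events(sample_logs))
```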
etcd Health
If etcd loses quorum (fewer than a majority of nodes are healthy), Patroni cannot perform failovers or leader elections. Monitor:
- etcd_server_has_leader -- 1 if the node sees a leader, 0 otherwise
- etcd_disk_wal_fsync_duration_seconds -- high fsync latency causes slow elections and timeouts
- etcd_server_proposals_failed_total -- failed consensus proposals indicate cluster instability
etcd exposes these metrics on port 2379 at /metrics. Add an etcd scrape job to Prometheus alongside your postgres_exporter job.
How sshploy's Grafana Stack Helps
Setting up Prometheus, Grafana, postgres_exporter, and node_exporter across multiple servers -- each with proper network connectivity, firewall rules, and service discovery -- takes significant effort when done manually. sshploy's Grafana stack recipe deploys the entire monitoring stack in a single operation: Grafana with pre-provisioned data sources, Prometheus with scrape targets auto-configured for your cluster nodes, node_exporter on every server, and a PostgreSQL dashboard pre-loaded and ready to use. When deployed alongside the Patroni recipe, postgres_exporter is automatically installed on every PostgreSQL node and registered as a Prometheus scrape target through custom scraper configuration, so metrics start flowing immediately with no manual wiring.
FAQ
What is the best Grafana dashboard for PostgreSQL monitoring?
The PostgreSQL Database community dashboard (Grafana ID 9628) is a solid general-purpose starting point. It covers connections, transaction rates, cache hit ratios, replication lag, and table-level statistics. For Patroni-specific metrics, you will likely need to build custom panels or extend an existing dashboard. sshploy's Grafana stack includes a pre-configured PostgreSQL dashboard that covers the most important metrics out of the box.
How often should Prometheus scrape postgres_exporter?
A 15-second scrape interval is the standard default and works well for most clusters. For high-traffic databases where you need finer-grained resolution, you can reduce it to 10 seconds. Going below 10 seconds adds meaningful load to both postgres_exporter (which runs SQL queries on every scrape) and Prometheus (more data points to store). If you need sub-second metrics for specific queries, consider application-level instrumentation instead.
How do I monitor replication lag in a Patroni cluster?
postgres_exporter collects replication lag from pg_stat_replication on the primary node and exposes it as pg_stat_replication_pg_wal_lsn_diff (bytes) and pg_stat_replication_replay_lag (seconds). In Prometheus, query these metrics filtered by the primary instance. You can also query the Patroni REST API at /cluster, which returns each member's lag value directly. For alerting, the byte-based metric with a threshold of 10 MB as a warning and 100 MB as critical is a reasonable starting point. Adjust based on your write volume.
Should I run monitoring on the same servers as the database?
Run node_exporter and postgres_exporter on the database servers -- they are lightweight and need local access. Run Prometheus and Grafana on a separate server. If Prometheus runs on a database node and that node fails, you lose monitoring at the exact moment you need it most. A small dedicated monitoring server (2 vCPU, 4 GB RAM) is sufficient for clusters of up to 20-30 nodes and several weeks of metric retention.
What is the difference between pg_stat_activity and PgBouncer metrics?
pg_stat_activity shows connections at the PostgreSQL level -- these are actual server processes. PgBouncer metrics (available via SHOW POOLS or a PgBouncer exporter) show connections at the pooler level: client connections waiting, server connections in use, and pool utilization. In a PgBouncer setup, you can have 5,000 client connections mapped to 100 PostgreSQL connections. Monitoring only PostgreSQL misses client-side queuing and wait times. Monitor both layers: PgBouncer for client experience, PostgreSQL for server resource usage.
Ready to deploy?
Skip the manual setup. sshploy handles the entire deployment for you.
Deploy PostgreSQL with Patroni
Related guides
PostgreSQL High Availability with Patroni: Step-by-Step Guide
Set up a production-ready PostgreSQL HA cluster with Patroni, etcd, PgBouncer, and HAProxy. Covers architecture, automatic failover, connection pooling, and backup strategies.
How to Deploy a Database in Production: The Complete Guide
Everything you need to know about deploying databases in production. Covers architecture decisions, high availability, backups, monitoring, security, and choosing between managed and self-hosted.