background

PostgreSQL Replication Monitoring

Join Over 1,000 Engineers & Get New Episodes Weekly!

Learn how to monitor native PostgreSQL replication using the system views and functions PostgreSQL provides. Specific queries are also provided.

This tutorial assumes you have already setup 1) streaming replication between a primary and a replica PostgreSQL database cluster, and 2) are using replication slots. If you need to set these up use the following links to setup: streaming replication and replication slots.

--# PostgreSQL 10 installed on Ubuntu
--# Primary DB cluster called "main" on port 5432
--# Replica DB cluster called "replica" on port 5433

-- PostgreSQL >= 10 then use "wal_lsn"
-- PostgreSQL <  10 then use "xlog_location"

-- on main
select * from pg_stat_replication;
select * from pg_replication_slots;

-- on replica check if in standby_mode
select pg_is_in_recovery();

-- stop replica and check change in status
sudo systemctl stop postgresql@10-replica

-- on main
select * from pg_stat_replication;
select * from pg_replication_slots;

-- start replica again
sudo systemctl start postgresql@10-replica

-- on main compare WAL status using lsn diff
select pg_wal_lsn_diff('0/23000738','0/230001B0');

-- on main query to track lag in bytes
-- sending_lag could indicate heavy load on primary
-- receiving_lag could indicate network issues or replica under heavy load
-- replaying_lag could indicate replica under heavy load
select
  pid,
  application_name,
  pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) sending_lag,
  pg_wal_lsn_diff(sent_lsn, flush_lsn) receiving_lag,
  pg_wal_lsn_diff(flush_lsn, replay_lsn) replaying_lag,
  pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) total_lag
from pg_stat_replication;

-- on replica can check replica locations or timing
--   pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp()
SELECT
  CASE
    WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn()
    THEN 0
    ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())
    END AS log_delay;

-- Want to know what file a lsn refers to?
select pg_walfile_name(pg_current_wal_lsn());

Postgres documentation

https://www.postgresql.org/docs/10/static/monitoring-stats.html#PG-STAT-REPLICATION-VIEW https://www.postgresql.org/docs/10/static/functions-admin.html#FUNCTIONS-ADMIN-BACKUP-TABLE https://www.postgresql.org/docs/10/static/functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE