Improve PostgreSQL replication lag detection in low traffic PostgreSQL cluster #534

Open
wants to merge 1 commit into
base: master

v0112358

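When the pg_replication query runs on a slave in a cluster that currently has no database traffic (for example, in the middle of the night), the exporter reports an incorrect lag value. With no new transactions to replay, pg_last_xact_replay_timestamp() stops advancing, so now() - pg_last_xact_replay_timestamp() keeps growing even though the slave has replayed everything it received.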

Example from idle PostgreSQL cluster:

  • Current WAL lsn (from master)
# select pg_current_wal_lsn();
 pg_current_wal_lsn
--------------------
 1/3E029AE8
(1 row)
  • Current WAL receive lsn (from slave)
# select pg_last_wal_receive_lsn();
 pg_last_wal_receive_lsn
-------------------------
 1/3E029AE8
(1 row)
  • Current WAL replay lsn (from slave)
# select pg_last_wal_replay_lsn();
 pg_last_wal_replay_lsn
------------------------
 1/3E029AE8
(1 row)
  • Current replication lag (from slave)
# SELECT CASE WHEN NOT pg_is_in_recovery() THEN 0 ELSE GREATEST (0, EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))) END AS lag;
     lag
-------------
 7466.495493
(1 row)
  • Improved query (from slave)
# SELECT
    CASE WHEN NOT pg_is_in_recovery() THEN
        0
    WHEN pg_last_wal_receive_lsn () = pg_last_wal_replay_lsn () THEN
        0
    ELSE
        GREATEST (0, EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())))
    END AS lag;
 lag
-----
   0
(1 row)
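
For reviewers who want to reason about the branches without a live standby, here is a minimal Python sketch of the decision logic in the improved query. It is only a model, not exporter code: `parse_lsn` and `replication_lag` are hypothetical names, and the arguments stand in for the values the SQL functions would return on the slave.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '1/3E029AE8' into a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag(
    in_recovery: bool,
    receive_lsn: Optional[str],
    replay_lsn: Optional[str],
    last_replay_ts: Optional[datetime],
    now: datetime,
) -> float:
    """Model of the improved CASE expression (hypothetical helper)."""
    # Branch 1: not a standby, so there is no lag to report.
    if not in_recovery:
        return 0.0
    # Branch 2: everything received has been replayed, so the slave is
    # caught up regardless of how old the last replayed transaction is.
    # A NULL (None) LSN never compares equal in SQL, so it falls through.
    if receive_lsn is not None and replay_lsn is not None:
        if parse_lsn(receive_lsn) == parse_lsn(replay_lsn):
            return 0.0
    # Branch 3: fall back to the timestamp-based estimate.
    # GREATEST ignores NULL arguments, so a missing timestamp yields 0.
    if last_replay_ts is None:
        return 0.0
    return max(0.0, (now - last_replay_ts).total_seconds())
```

Feeding it the idle-cluster values above (equal receive and replay LSNs of 1/3E029AE8, last replayed transaction about 7466 seconds old) returns 0, whereas skipping the LSN check reproduces the misleading timestamp-based value.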

@danpoltawski

#385 (comment)
