You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Deploy postgresql 14/edge on rev 239 with three units.
Model Controller Cloud/Region Version SLA Timestamp
prod-repo-policy-compliance juju-controller-35-production-ps6 k8s-prod-general/default 3.5.3 unsupported 08:07:59Z
App Version Status Scale Charm Channel Rev Address Exposed Message
postgresql 14.12 active 3 postgresql-k8s 14/edge 239 10.87.6.161 no Primary
repo-policy-compliance active 1 repo-policy-compliance latest/edge 17 10.87.26.0 no
Unit Workload Agent Address Ports Message
postgresql/0 active idle 192.168.102.57 Primary
postgresql/1* active idle 192.168.100.212
postgresql/2 active idle 192.168.103.49
repo-policy-compliance/0* active idle 192.168.102.43
Expected behavior
Postgreql is not going down.
Actual behavior
We received the Postgresql down alert on a production deployment with three units.
Logs indicate that postgresql-1 was shutdown due to
2024-08-07 06:58:22 UTC [8758]: user=,db=,app=,client=,line=12 LOG: received fast shutdown request
2024-08-07 06:58:22 UTC [8758]: user=,db=,app=,client=,line=13 LOG: aborting any active transactions
2024-08-07 06:58:22 UTC [8888]: user=monitoring,db=postgres,app=[unknown],client=[local],line=1 FATAL: terminating connection due to administrator command
2024-08-07 06:58:22 UTC [8867]: user=monitoring,db=postgres,app=[unknown],client=[local],line=1 FATAL: terminating connection due to administrator command
2024-08-07 06:58:22 UTC [8769]: user=operator,db=postgres,app=Patroni,client=127.0.0.1,line=1 FATAL: terminating connection due to administrator command
2024-08-07 06:58:22 UTC [8758]: user=,db=,app=,client=,line=14 LOG: background worker "logical replication launcher" (PID 8783) exited with exit code 1
2024-08-07 06:58:22 UTC [8765]: user=,db=,app=,client=,line=5 LOG: shutting down
root@postgresql-1:/# cat /var/log/postgresql/postgresql-3_0658.log
2024-08-07 06:58:22 UTC [8758]: user=,db=,app=,client=,line=12 LOG: received fast shutdown request
2024-08-07 06:58:22 UTC [8758]: user=,db=,app=,client=,line=13 LOG: aborting any active transactions
2024-08-07 06:58:22 UTC [8888]: user=monitoring,db=postgres,app=[unknown],client=[local],line=1 FATAL: terminating connection due to administrator command
2024-08-07 06:58:22 UTC [8867]: user=monitoring,db=postgres,app=[unknown],client=[local],line=1 FATAL: terminating connection due to administrator command
2024-08-07 06:58:22 UTC [8769]: user=operator,db=postgres,app=Patroni,client=127.0.0.1,line=1 FATAL: terminating connection due to administrator command
2024-08-07 06:58:22 UTC [8758]: user=,db=,app=,client=,line=14 LOG: background worker "logical replication launcher" (PID 8783) exited with exit code 1
2024-08-07 06:58:22 UTC [8765]: user=,db=,app=,client=,line=5 LOG: shutting down
2024-08-07 06:58:22 UTC [8765]: user=,db=,app=,client=,line=6 LOG: checkpoint starting: shutdown immediate
2024-08-07 06:58:23 UTC [8765]: user=,db=,app=,client=,line=7 LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 2 recycled; write=0.015 s, sync=0.001 s, total=0.242 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16382 kB, estimate=16382 kB
2024-08-07 06:58:23 UTC [17415]: user=replication,db=[unknown],app=[unknown],client=192.168.102.57,line=1 FATAL: the database system is shutting down
2024-08-07 06:58:33 UTC [17419]: user=operator,db=postgres,app=[unknown],client=127.0.0.1,line=1 FATAL: the database system is shutting down
2024-08-07 06:58:33 UTC [17421]: user=operator,db=postgres,app=[unknown],client=127.0.0.1,line=1 FATAL: the database system is shutting down
2024-08-07 06:58:34 UTC [8758]: user=,db=,app=,client=,line=15 LOG: database system is shut down
2024-08-07 06:58:35 UTC [17577]: user=,db=,app=,client=,line=3 LOG: starting PostgreSQL 14.12 (Ubuntu 14.12-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2024-08-07 06:58:35 UTC [17577]: user=,db=,app=,client=,line=4 LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-08-07 06:58:37 UTC [17577]: user=,db=,app=,client=,line=5 LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=1 LOG: database system was shut down at 2024-08-07 06:58:23 UTC
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=2 WARNING: specified neither primary_conninfo nor restore_command
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=3 HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=4 LOG: entering standby mode
2024-08-07 06:58:37 UTC [17583]: user=rewind,db=postgres,app=[unknown],client=192.168.102.57,line=1 FATAL: the database system is starting up
2024-08-07 06:58:37 UTC [17584]: user=operator,db=postgres,app=[unknown],client=127.0.0.1,line=1 FATAL: the database system is starting up
2024-08-07 06:58:37 UTC [17586]: user=operator,db=postgres,app=[unknown],client=127.0.0.1,line=1 FATAL: the database system is starting up
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=5 LOG: consistent recovery state reached at 0/2C0000A0
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=6 LOG: invalid record length at 0/2C0000A0: wanted 24, got 0
2024-08-07 06:58:37 UTC [17577]: user=,db=,app=,client=,line=6 LOG: database system is ready to accept read-only connections
2024-08-07 06:58:38 UTC [17590]: user=replication,db=[unknown],app=[unknown],client=192.168.102.57,line=1 FATAL: no pg_hba.conf entry for replication connection from host "192.168.102.57", user "replication", no encryption
2024-08-07 06:58:38 UTC [17590]: user=replication,db=[unknown],app=[unknown],client=192.168.102.57,line=2 DETAIL: Client IP address resolved to "192-168-102-57.postgresql-replicas.prod-repo-policy-compliance.svc.cluster.local", forward lookup not checked.
2024-08-07 06:58:39 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:39 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:39 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=3 LOG: received replication command: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 65
2024-08-07 06:58:39 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=4 STATEMENT: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 65
2024-08-07 06:58:43 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:43 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:43 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=3 LOG: received replication command: START_REPLICATION SLOT "postgresql_0" 0/2C000000 TIMELINE 65
2024-08-07 06:58:43 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=4 STATEMENT: START_REPLICATION SLOT "postgresql_0" 0/2C000000 TIMELINE 65
2024-08-07 06:58:45 UTC [17597]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:45 UTC [17597]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17599]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17599]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17582]: user=,db=,app=,client=,line=7 LOG: received promote request
2024-08-07 06:58:46 UTC [17582]: user=,db=,app=,client=,line=8 LOG: redo is not required
2024-08-07 06:58:46 UTC [17582]: user=,db=,app=,client=,line=9 LOG: selected new timeline ID: 66
2024-08-07 06:58:46 UTC [17582]: user=,db=,app=,client=,line=10 LOG: archive recovery complete
2024-08-07 06:58:46 UTC [17587]: user=,db=,app=,client=,line=1 LOG: checkpoint starting: force
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=5 LOG: received replication command: TIMELINE_HISTORY 66
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=6 STATEMENT: TIMELINE_HISTORY 66
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=5 LOG: received replication command: TIMELINE_HISTORY 66
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=6 STATEMENT: TIMELINE_HISTORY 66
2024-08-07 06:58:46 UTC [17577]: user=,db=,app=,client=,line=7 LOG: database system is ready to accept connections
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=7 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=8 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=7 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=8 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=9 LOG: received replication command: START_REPLICATION SLOT "postgresql_0" 0/2C000000 TIMELINE 66
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=10 STATEMENT: START_REPLICATION SLOT "postgresql_0" 0/2C000000 TIMELINE 66
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=9 LOG: received replication command: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 66
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=10 STATEMENT: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 66
2024-08-07 06:58:46 UTC [17587]: user=,db=,app=,client=,line=2 LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.020 s, sync=0.003 s, total=0.067 s; sync files=2, longest=0.002 s, average=0.002 s; distance=0 kB, estimate=0 kB
2024-08-07 06:58:47 UTC [17577]: user=,db=,app=,client=,line=8 LOG: received SIGHUP, reloading configuration files
2024-08-07 06:58:47 UTC [17577]: user=,db=,app=,client=,line=9 LOG: parameter "timezone" changed to "Etc/UTC"
2024-08-07 06:58:47 UTC [17577]: user=,db=,app=,client=,line=10 LOG: parameter "TimeZone" changed to "UTC"
2024-08-07 06:58:47 UTC [17577]: user=,db=,app=,client=,line=11 LOG: parameter "synchronous_standby_names" changed to ""postgresql-2""
2024-08-07 06:58:47 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=11 LOG: standby "postgresql-2" is now a synchronous standby with priority 1
2024-08-07 06:58:47 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=12 STATEMENT: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 66
2024-08-07 06:58:56 UTC [17577]: user=,db=,app=,client=,line=12 LOG: received immediate shutdown request
2024-08-07 06:58:56 UTC [17577]: user=,db=,app=,client=,line=13 LOG: database system is shut down
2024-08-07 06:58:59 UTC [17771]: user=,db=,app=,client=,line=3 LOG: starting PostgreSQL 14.12 (Ubuntu 14.12-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2024-08-07 06:58:59 UTC [17771]: user=,db=,app=,client=,line=4 LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-08-07 06:58:59 UTC [17771]: user=,db=,app=,client=,line=5 LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=1 LOG: database system was interrupted; last known up at 2024-08-07 06:58:46 UTC
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=2 WARNING: specified neither primary_conninfo nor restore_command
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=3 HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=4 LOG: entering standby mode
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=5 LOG: database system was not properly shut down; automatic recovery in progress
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=6 LOG: redo starts at 0/2C0000D0
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=7 LOG: invalid record length at 0/2C000218: wanted 24, got 0
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=8 LOG: consistent recovery state reached at 0/2C000218
2024-08-07 06:58:59 UTC [17771]: user=,db=,app=,client=,line=6 LOG: database system is ready to accept read-only connections
2024-08-07 06:58:59 UTC [17784]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:59 UTC [17784]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:59:00 UTC [17771]: user=,db=,app=,client=,line=7 LOG: received SIGHUP, reloading configuration files
root@postgresql-1:/# cat /var/log/postgresql/postgresql-3_0656.log
root@postgresql-1:/# cat /var/log/postgresql/postgresql-3_0655.log
root@postgresql-1:/# cat /var/log/postgresql/postgresql-3_0658.log
2024-08-07 06:58:22 UTC [8758]: user=,db=,app=,client=,line=12 LOG: received fast shutdown request
2024-08-07 06:58:22 UTC [8758]: user=,db=,app=,client=,line=13 LOG: aborting any active transactions
2024-08-07 06:58:22 UTC [8888]: user=monitoring,db=postgres,app=[unknown],client=[local],line=1 FATAL: terminating connection due to administrator command
2024-08-07 06:58:22 UTC [8867]: user=monitoring,db=postgres,app=[unknown],client=[local],line=1 FATAL: terminating connection due to administrator command
2024-08-07 06:58:22 UTC [8769]: user=operator,db=postgres,app=Patroni,client=127.0.0.1,line=1 FATAL: terminating connection due to administrator command
2024-08-07 06:58:22 UTC [8758]: user=,db=,app=,client=,line=14 LOG: background worker "logical replication launcher" (PID 8783) exited with exit code 1
2024-08-07 06:58:22 UTC [8765]: user=,db=,app=,client=,line=5 LOG: shutting down
2024-08-07 06:58:22 UTC [8765]: user=,db=,app=,client=,line=6 LOG: checkpoint starting: shutdown immediate
2024-08-07 06:58:23 UTC [8765]: user=,db=,app=,client=,line=7 LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 2 recycled; write=0.015 s, sync=0.001 s, total=0.242 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16382 kB, estimate=16382 kB
2024-08-07 06:58:23 UTC [17415]: user=replication,db=[unknown],app=[unknown],client=192.168.102.57,line=1 FATAL: the database system is shutting down
2024-08-07 06:58:33 UTC [17419]: user=operator,db=postgres,app=[unknown],client=127.0.0.1,line=1 FATAL: the database system is shutting down
2024-08-07 06:58:33 UTC [17421]: user=operator,db=postgres,app=[unknown],client=127.0.0.1,line=1 FATAL: the database system is shutting down
2024-08-07 06:58:34 UTC [8758]: user=,db=,app=,client=,line=15 LOG: database system is shut down
2024-08-07 06:58:35 UTC [17577]: user=,db=,app=,client=,line=3 LOG: starting PostgreSQL 14.12 (Ubuntu 14.12-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2024-08-07 06:58:35 UTC [17577]: user=,db=,app=,client=,line=4 LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-08-07 06:58:37 UTC [17577]: user=,db=,app=,client=,line=5 LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=1 LOG: database system was shut down at 2024-08-07 06:58:23 UTC
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=2 WARNING: specified neither primary_conninfo nor restore_command
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=3 HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=4 LOG: entering standby mode
2024-08-07 06:58:37 UTC [17583]: user=rewind,db=postgres,app=[unknown],client=192.168.102.57,line=1 FATAL: the database system is starting up
2024-08-07 06:58:37 UTC [17584]: user=operator,db=postgres,app=[unknown],client=127.0.0.1,line=1 FATAL: the database system is starting up
2024-08-07 06:58:37 UTC [17586]: user=operator,db=postgres,app=[unknown],client=127.0.0.1,line=1 FATAL: the database system is starting up
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=5 LOG: consistent recovery state reached at 0/2C0000A0
2024-08-07 06:58:37 UTC [17582]: user=,db=,app=,client=,line=6 LOG: invalid record length at 0/2C0000A0: wanted 24, got 0
2024-08-07 06:58:37 UTC [17577]: user=,db=,app=,client=,line=6 LOG: database system is ready to accept read-only connections
2024-08-07 06:58:38 UTC [17590]: user=replication,db=[unknown],app=[unknown],client=192.168.102.57,line=1 FATAL: no pg_hba.conf entry for replication connection from host "192.168.102.57", user "replication", no encryption
2024-08-07 06:58:38 UTC [17590]: user=replication,db=[unknown],app=[unknown],client=192.168.102.57,line=2 DETAIL: Client IP address resolved to "192-168-102-57.postgresql-replicas.prod-repo-policy-compliance.svc.cluster.local", forward lookup not checked.
2024-08-07 06:58:39 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:39 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:39 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=3 LOG: received replication command: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 65
2024-08-07 06:58:39 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=4 STATEMENT: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 65
2024-08-07 06:58:43 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:43 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:43 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=3 LOG: received replication command: START_REPLICATION SLOT "postgresql_0" 0/2C000000 TIMELINE 65
2024-08-07 06:58:43 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=4 STATEMENT: START_REPLICATION SLOT "postgresql_0" 0/2C000000 TIMELINE 65
2024-08-07 06:58:45 UTC [17597]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:45 UTC [17597]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17599]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17599]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17582]: user=,db=,app=,client=,line=7 LOG: received promote request
2024-08-07 06:58:46 UTC [17582]: user=,db=,app=,client=,line=8 LOG: redo is not required
2024-08-07 06:58:46 UTC [17582]: user=,db=,app=,client=,line=9 LOG: selected new timeline ID: 66
2024-08-07 06:58:46 UTC [17582]: user=,db=,app=,client=,line=10 LOG: archive recovery complete
2024-08-07 06:58:46 UTC [17587]: user=,db=,app=,client=,line=1 LOG: checkpoint starting: force
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=5 LOG: received replication command: TIMELINE_HISTORY 66
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=6 STATEMENT: TIMELINE_HISTORY 66
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=5 LOG: received replication command: TIMELINE_HISTORY 66
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=6 STATEMENT: TIMELINE_HISTORY 66
2024-08-07 06:58:46 UTC [17577]: user=,db=,app=,client=,line=7 LOG: database system is ready to accept connections
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=7 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=8 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=7 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=8 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=9 LOG: received replication command: START_REPLICATION SLOT "postgresql_0" 0/2C000000 TIMELINE 66
2024-08-07 06:58:46 UTC [17594]: user=replication,db=[unknown],app=postgresql-0,client=192.168.102.57,line=10 STATEMENT: START_REPLICATION SLOT "postgresql_0" 0/2C000000 TIMELINE 66
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=9 LOG: received replication command: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 66
2024-08-07 06:58:46 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=10 STATEMENT: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 66
2024-08-07 06:58:46 UTC [17587]: user=,db=,app=,client=,line=2 LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.020 s, sync=0.003 s, total=0.067 s; sync files=2, longest=0.002 s, average=0.002 s; distance=0 kB, estimate=0 kB
2024-08-07 06:58:47 UTC [17577]: user=,db=,app=,client=,line=8 LOG: received SIGHUP, reloading configuration files
2024-08-07 06:58:47 UTC [17577]: user=,db=,app=,client=,line=9 LOG: parameter "timezone" changed to "Etc/UTC"
2024-08-07 06:58:47 UTC [17577]: user=,db=,app=,client=,line=10 LOG: parameter "TimeZone" changed to "UTC"
2024-08-07 06:58:47 UTC [17577]: user=,db=,app=,client=,line=11 LOG: parameter "synchronous_standby_names" changed to ""postgresql-2""
2024-08-07 06:58:47 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=11 LOG: standby "postgresql-2" is now a synchronous standby with priority 1
2024-08-07 06:58:47 UTC [17593]: user=replication,db=[unknown],app=postgresql-2,client=192.168.103.49,line=12 STATEMENT: START_REPLICATION SLOT "postgresql_2" 0/2C000000 TIMELINE 66
2024-08-07 06:58:56 UTC [17577]: user=,db=,app=,client=,line=12 LOG: received immediate shutdown request
2024-08-07 06:58:56 UTC [17577]: user=,db=,app=,client=,line=13 LOG: database system is shut down
2024-08-07 06:58:59 UTC [17771]: user=,db=,app=,client=,line=3 LOG: starting PostgreSQL 14.12 (Ubuntu 14.12-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2024-08-07 06:58:59 UTC [17771]: user=,db=,app=,client=,line=4 LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-08-07 06:58:59 UTC [17771]: user=,db=,app=,client=,line=5 LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=1 LOG: database system was interrupted; last known up at 2024-08-07 06:58:46 UTC
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=2 WARNING: specified neither primary_conninfo nor restore_command
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=3 HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=4 LOG: entering standby mode
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=5 LOG: database system was not properly shut down; automatic recovery in progress
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=6 LOG: redo starts at 0/2C0000D0
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=7 LOG: invalid record length at 0/2C000218: wanted 24, got 0
2024-08-07 06:58:59 UTC [17775]: user=,db=,app=,client=,line=8 LOG: consistent recovery state reached at 0/2C000218
2024-08-07 06:58:59 UTC [17771]: user=,db=,app=,client=,line=6 LOG: database system is ready to accept read-only connections
2024-08-07 06:58:59 UTC [17784]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=1 LOG: received replication command: IDENTIFY_SYSTEM
2024-08-07 06:58:59 UTC [17784]: user=replication,db=[unknown],app=[unknown],client=127.0.0.1,line=2 STATEMENT: IDENTIFY_SYSTEM
2024-08-07 06:59:00 UTC [17771]: user=,db=,app=,client=,line=7 LOG: received SIGHUP, reloading configuration files
patroni log
root@postgresql-1:/# cat /var/log/postgresql/patroni.log.110
2024-08-07 07:02:09 UTC [16]: INFO: no action. I am (postgresql-1), a secondary, and following a leader (postgresql-0)
2024-08-07 07:02:19 UTC [16]: INFO: no action. I am (postgresql-1), a secondary, and following a leader (postgresql-0)
2024-08-07 07:02:29 UTC [16]: INFO: no action. I am (postgresql-1), a secondary, and following a leader (postgresql-0)
2024-08-07 07:02:39 UTC [16]: INFO: no action. I am (postgresql-1), a secondary, and following a leader (postgresql-0)
root@postgresql-1:/# cat /var/log/postgresql/patroni.log.120
2024-08-07 06:58:58 UTC [16]: WARNING: Retry got exception: connection problems
2024-08-07 06:58:58 UTC [16]: INFO: Lock owner: postgresql-0; I am postgresql-1
2024-08-07 06:58:58 UTC [16]: INFO: starting after demotion in progress
2024-08-07 06:58:58 UTC [16]: INFO: Lock owner: postgresql-0; I am postgresql-1
2024-08-07 06:58:58 UTC [16]: INFO: starting after demotion in progress
2024-08-07 06:58:58 UTC [16]: WARNING: Retry got exception: connection problems
2024-08-07 06:58:59 UTC [16]: INFO: Lock owner: postgresql-0; I am postgresql-1
root@postgresql-1:/# cat /var/log/postgresql/patroni.log.121
2024-08-07 06:58:56 UTC [16]: ERROR: failed to update leader lock
2024-08-07 06:58:56 UTC [16]: INFO: Demoting self (immediate-nolock)
2024-08-07 06:58:56 UTC [16]: INFO: demoted self because failed to update leader lock in DCS
2024-08-07 06:58:56 UTC [16]: INFO: Lock owner: None; I am postgresql-1
2024-08-07 06:58:56 UTC [16]: INFO: not healthy enough for leader race
2024-08-07 06:58:56 UTC [16]: INFO: starting after demotion in progress
2024-08-07 06:58:56 UTC [16]: INFO: closed patroni connection to the postgresql cluster
2024-08-07 06:58:57 UTC [16]: INFO: postmaster pid=17771
root@postgresql-1:/# cat /var/log/postgresql/patroni.log.122
2024-08-07 06:58:56 UTC [16]: ERROR: Unexpected error from Kubernetes API
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 1159, in _update_leader_with_retry
return bool(self._patch_or_create(self.leader_path, annotations, resource_version, ips=ips, retry=_retry))
File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 1090, in _patch_or_create
ret = retry(func, self._namespace, body) if retry else func(self._namespace, body)
File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 1156, in _retry
return retry(*args, **kwargs)
File "/usr/lib/python3/dist-packages/patroni/utils.py", line 599, in __call__
return func(*args, **kwargs)
File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 558, in wrapper
return getattr(self._core_v1_api, func)(*args, **kwargs)
File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 479, in wrapper
return self._api_client.call_api(method, path, headers, body, **kwargs)
File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 447, in call_api
return self._handle_server_response(response, _preload_content)
File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 266, in _handle_server_response
raise k8s_client.rest.ApiException(http_resp=response)
patroni.dcs.kubernetes.K8sClient.rest.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'da0b6ecb-8d42-4964-9819-41a2d385b2dd', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '37606d0c-8ddf-4cc4-9faa-563142648f97', 'X-Kubernetes-Pf-Prioritylevel-Uid': '296b2e7b-e274-4aed-9e4a-019ddda1c429', 'Date': 'Wed, 07 Aug 2024 06:58:56 GMT', 'Content-Length': '214'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"endpoints \\"patroni-postgresql\\" not found","reason":"NotFound","details":{"name":"patroni-postgresql","kind":"endpoints"},"code":404}\n'
root@postgresql-1:/# cat /var/log/postgresql/patroni.log.123
2024-08-07 06:58:56 UTC [16]: INFO: Assigning synchronous standby status to ['postgresql-2']
2024-08-07 06:58:56 UTC [16]: INFO: Synchronous standby status assigned to ['postgresql-2']
2024-08-07 06:58:56 UTC [16]: INFO: acquired session lock as a leader
2024-08-07 06:58:56 UTC [16]: INFO: Lock owner: postgresql-1; I am postgresql-1
Additional context
The text was updated successfully, but these errors were encountered:
Looks like Patroni was unable to access the K8s API to update the leader lock. We can potentially enable Patroni's DCS failsafe mode to try to mitigate this.
Steps to reproduce
Deploy postgresql 14/edge on rev 239 with three units.
Expected behavior
Postgreql is not going down.
Actual behavior
We received the Postgresql down alert on a production deployment with three units.
Logs indicate that postgresql-1 was shutdown due to
It healed itself quite quickly.
Versions
Operating system:
Juju CLI: 3.5.3
Juju agent: 3.5.3
Charm revision: 239
microk8s:
Log output
juju debug log
postgresql log
patroni log
Additional context
The text was updated successfully, but these errors were encountered: