Discussion: max_slot_wal_keep_size

max_slot_wal_keep_size

From
Scott Ribe
Date:
If I use max_slot_wal_keep_size to limit disk impact of a down replica, and subsequently a down replica causes PG to
hit this limit, is there a particular message that will be logged when the limit is crossed and PG starts to purge WAL?

Context is: trying to debug a failure to bring up a replica, where the failure happened in the middle of a moderately
complex chain of events that likely started with a bad disk. (Patroni is involved, FWIW)

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/






Re: max_slot_wal_keep_size

From
Alvaro Herrera
Date:
On 2021-Aug-16, Scott Ribe wrote:

> If I use max_slot_wal_keep_size to limit disk impact of a down
> replica, and subsequently a down replica causes PG to hit this limit,
> is there a particular message that will be logged when the limit is
> crossed and PG starts to purge WAL?
> 
> Context is: trying to debug a failure to bring up a replica, where the
> failure happened in the middle of a moderately complex chain of events
> that likely started with a bad disk. (Patroni is involved, FWIW)

Yes, you should see
  invalidating slot "..." because its restart_lsn ... exceeds max_slot_wal_keep_size

However, there was a bug fixed recently in that area, whereby the slot
would be invalidated but the space would not be freed; the fix was on
July 16th and it was released together with last week's minors:

Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Branch: master [ead9e51e8] 2021-07-16 12:07:30 -0400
Branch: REL_14_STABLE [e5bcbb107] 2021-07-16 12:07:30 -0400
Branch: REL_13_STABLE Release: REL_13_4 [866237a6f] 2021-07-16 12:07:30 -0400

    Advance old-segment horizon properly after slot invalidation
    
    When some slots are invalidated due to the max_slot_wal_keep_size limit,
    the old segment horizon should move forward to stay within the limit.
    However, in commit c6550776394e we forgot to call KeepLogSeg again to
    recompute the horizon after invalidating replication slots.  In cases
    where other slots remained, the limits would be recomputed eventually
    for other reasons, but if all slots were invalidated, the limits would
    not move at all afterwards.  Repair.
    
    Backpatch to 13 where the feature was introduced.
    
    Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
    Reported-by: Marcin Krupowicz <mk@071.ovh>
    Discussion: https://postgr.es/m/17103-004130e8f27782c9@postgresql.org


-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
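
For anyone checking whether a given server could be hit by the bug described above, here is a minimal Python sketch. It assumes you already have the integer reported by `SHOW server_version_num` (for example 130004 for 13.4). The function name is hypothetical. The logic relies only on what the commit message states: the feature was introduced in 13 and the fix shipped in REL_13_4. Treating every 14.x release as fixed is an assumption based on the REL_14_STABLE commit listed above.

```python
def affected_by_horizon_bug(server_version_num: int) -> bool:
    """Return True for 13.x servers older than 13.4.

    Hypothetical helper: 13.0 through 13.3 have max_slot_wal_keep_size
    but lack the fix released in 13.4. Versions before 13 do not have
    the setting at all. 14 and later are assumed to include the fix.
    """
    return 130000 <= server_version_num < 130004


if __name__ == "__main__":
    for version in (120008, 130003, 130004, 140000):
        print(version, affected_by_horizon_bug(version))
```

On an affected version, a slot can be invalidated without the WAL space being freed, so the log message alone does not prove that segments were removed.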



Re: max_slot_wal_keep_size

From
Scott Ribe
Date:
> On Aug 16, 2021, at 8:08 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> Yes, you should see
>  invalidating slot "..." because its restart_lsn ... exceeds max_slot_wal_keep_size

Thank you; that's exactly what I need to cut through the log noise in Splunk and see if this happened.
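
If the logs are also available as plain text, the same search can be sketched in Python. This is only an illustration: the pattern is built from the message fragments quoted in this thread, the exact wording may differ between PostgreSQL versions, and the function name and sample log lines are made up for the example.

```python
import re
from typing import Iterable, Iterator, Tuple

# Pattern assembled from the message quoted in the thread; the slot name
# and restart_lsn are captured so they can be reported.
INVALIDATION_RE = re.compile(
    r'invalidating slot "(?P<slot>[^"]+)" because its restart_lsn '
    r"(?P<lsn>\S+) exceeds max_slot_wal_keep_size"
)


def find_slot_invalidations(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (slot_name, restart_lsn) for each matching log line."""
    for line in lines:
        match = INVALIDATION_RE.search(line)
        if match:
            yield match.group("slot"), match.group("lsn")


if __name__ == "__main__":
    # Hypothetical log lines, not taken from a real server.
    sample = [
        "2021-08-15 03:12:01 UTC LOG:  checkpoint starting: time",
        '2021-08-15 03:12:02 UTC LOG:  invalidating slot "replica1" because '
        "its restart_lsn 0/5000000 exceeds max_slot_wal_keep_size",
    ]
    for slot, lsn in find_slot_invalidations(sample):
        print(slot, lsn)
```

To scan a real file, pass an open file object instead of the sample list, since iterating over a file yields its lines.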