Re: Standby trying "restore_command" before local WAL

From: Tomas Vondra
Subject: Re: Standby trying "restore_command" before local WAL
Msg-id: 8ec2279e-9295-3cd2-a71f-a8ef97e7f4c7@2ndquadrant.com
In response to: Re: Standby trying "restore_command" before local WAL  (Stephen Frost <sfrost@snowman.net>)
Responses: Re: Standby trying "restore_command" before local WAL
List: pgsql-hackers
On 08/06/2018 05:19 PM, Stephen Frost wrote:
> Greetings,
> 
> * David Steele (david@pgmasters.net) wrote:
>> I think for the stated scenario (known good standby that has been
>> shut down gracefully) it makes perfect sense to trust the contents of
>> pg_wal.  Call this scenario #1.
>>
>> An alternate scenario (#2) is that the data directory was copied using a
>> basic copy tool and the pg_wal directory was not excluded from the copy.
>> This means the contents of pg_wal will be in an inconsistent state.
>> The files that are there might be partials (not with the extension,
>> though) and you can easily have multiple partials.  You will almost
>> certainly not have everything you need to get to consistency.
>>

Yeah. But as Simon said, we do have fairly strong protections against
applying corrupted WAL: every record is CRC-checked. So why not fall
back to restore_command only if the locally available WAL is not
fully consistent?
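
To make the CRC argument concrete, here is a minimal sketch of per-record
checksum validation. PostgreSQL's WAL records carry a CRC-32C, but the
framing below (a 4-byte little-endian CRC followed by the payload) and the
names crc32c, make_record and record_is_valid are purely illustrative
assumptions; they do not reproduce the real WAL record layout.

```python
import struct

CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C: slow, but dependency-free."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def make_record(payload: bytes) -> bytes:
    """Hypothetical framing: CRC of the payload, then the payload."""
    return struct.pack("<I", crc32c(payload)) + payload


def record_is_valid(record: bytes) -> bool:
    """Return True only if the stored CRC matches the payload."""
    if len(record) < 4:
        return False
    (stored,) = struct.unpack("<I", record[:4])
    return stored == crc32c(record[4:])
```

A record with even a single flipped bit in the payload fails this check,
which is the property the argument above relies on.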

>> But there's another good scenario (#3): where the pg_wal directory was
>> preloaded with all the WAL required to make the cluster consistent or
>> all the WAL that was available at restore time.  In this case, it would
>> make sense to prefer the contents of pg_wal and only switch to
>> restore_command after that has been exhausted.
>>
>> So, the choice of whether to prefer locally-stored or
>> restore_command-fetched WAL is context-dependent, in my mind.
> 
> Agreed.
> 

Maybe, not sure.

>> Ideally we could have a default that is safe in each scenario with
>> perhaps an override if the user knows better.  Scenario #1 would allow
>> WAL to be read from pg_wal by default, scenario #2 would prefer fetched
>> WAL, and scenario #3 could use a GUC to override the default fetch behavior.
> 
> Not sure how we'd be able to automatically realize which scenario we're
> in though..?
> 

But do we need to know it? I mean, can't we try the local WAL first, use
it if it passes the CRC checks (and possibly some other checks), and
only fall back to the remote WAL if the local copy is identified as broken?
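
The ordering proposed here could be sketched roughly as follows. Everything
in it is hypothetical: choose_wal_segment, is_valid and fetch_remote are
illustrative names, fetch_remote stands in for whatever restore_command
does, and actual recovery validates WAL record by record during replay
rather than one whole file at a time.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def choose_wal_segment(
    name: str,
    local_dir: Path,
    is_valid: Callable[[bytes], bool],
    fetch_remote: Callable[[str], Optional[bytes]],
) -> Tuple[str, Optional[bytes]]:
    """Prefer a valid local segment; otherwise ask the remote source."""
    local_path = local_dir / name
    if local_path.is_file():
        data = local_path.read_bytes()
        if is_valid(data):
            return "local", data
    # Local copy is missing or failed validation: fall back to the archive.
    remote = fetch_remote(name)
    if remote is not None:
        return "remote", remote
    return "missing", None
```

With this ordering the caller does not need to know which of the three
scenarios it is in: a broken or absent local segment simply triggers the
fallback.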


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

