Re: VM corruption on standby

Поиск
Список
Период
Сортировка
От Yura Sokolov
Тема Re: VM corruption on standby
Дата
Msg-id dc5d7533-c254-4ad7-8ec2-c96497c30cbe@postgrespro.ru
обсуждение исходный текст
Ответ на Re: VM corruption on standby  (Kirill Reshke <reshkekirill@gmail.com>)
Список pgsql-hackers
10.08.2025 08:45, Kirill Reshke пишет:
> On Sun, 10 Aug 2025 at 01:55, Aleksander Alekseev
> <aleksander@tigerdata.com> wrote:
>> For this reason we have PageHeaderData.pd_lsn for instance - to make sure
>> pages are evicted only *after* the record that changed it is written
>> to disk (because WAL records can't be applied to pages from the
>> future).
> 
> We don't bump the LSN of the heap page when setting the visibility
> map bit.

And that is very bad!!!
In fact, I believe "incremental backup" is not safe because of that.

>> I guess the intent here could be to do an optimization of some sort
>> but the facts that 1. the instance can be killed at any time and 2.
>> there might be replicas - were not considered.
> 
>> IMHO: logging the changes first, then allowing to evict the page.
> 
> Clearing the vm before the logging changes was intentional [0].
> So I assume we should not change the approach, but rather just tweak
> things a bit to make the whole thing work.
> 
> [0] https://www.postgresql.org/message-id/BANLkTimuLk4RHXSQHEEiYGbxiXp2mh5KCA%40mail.gmail.com

Lets cite that message:

On Fri, May 6, 2011 at 5:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
wrote:
> On Wed, Mar 30, 2011 at 8:52 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> Another question:
>>> To address the problem in
>>> http://archives.postgresql.org/pgsql-hackers/2010-02/msg02097.php
>>> , should we just clear the vm before the log of insert/update/delete?
>>> This may reduce the performance, is there another solution?
>>
>> Yeah, that's a straightforward way to fix it. I don't think the performance
>> hit will be too bad. But we need to be careful not to hold locks while doing
>> I/O, which might require some rearrangement of the code. We might want to do
>> a similar dance that we do in vacuum, and call visibilitymap_pin first, then
>> lock and update the heap page, and then set the VM bit while holding the
>> lock on the heap page.
>
> Here's an attempt at implementing the necessary gymnastics.

So we see: Heikki told "lock and update the heap page, and then set the VM
bit while holding lock on the heap page". He didn't say "set the VM bit
before log heap update". It is not clear why Robert put visibilitymap_clear
before logging of update, so your point "was intentional" is not valid.

Call of visibilitymap_clear after logging but before exit from critical
section (as Aleksander Alekseev did in suggested patch) is correct way to
do the thing.

-- 
regards
Yura Sokolov aka funny-falcon



В списке pgsql-hackers по дате отправления: