Re: Failure in contrib test _int on loach

From Andrey Lepikhov
Subject Re: Failure in contrib test _int on loach
Date
Msg-id 89f1015d-a6c3-e341-235b-b60de2343fe6@postgrespro.ru
In reply to Re: Failure in contrib test _int on loach  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers

On 11/04/2019 13:14, Heikki Linnakangas wrote:
> On 11/04/2019 09:10, Andrey Lepikhov wrote:
>> On 10/04/2019 20:25, Heikki Linnakangas wrote:
>>> On 09/04/2019 19:11, Anastasia Lubennikova wrote:
>>>> After introducing GistBuildNSN this code path became unreachable.
>>>> To fix it, I added a new flag to detect such splits during index build.
>>>
>>> Isn't it possible that the grandparent page is also split, so that we'd
>>> need to climb further up?
>>
>> Based on Anastasia's idea, I prepared an alternative solution to fix the bug
>> (see attachment).
>> It uses the idea of a linearly incrementing LSN/NSN. The WAL write process
>> is used to change the NSN value to 1 for each block of the index relation.
>> I hope this can be a fairly clear and safe solution.
> 
> That's basically the same idea as always using the "fake LSN" during 
> index build, like the original version of this patch did. It's got the 
> problem that I mentioned at 
> https://www.postgresql.org/message-id/090fb3cb-1ca4-e173-ecf7-47d41ebac620@iki.fi: 
> 
> 
>> * Using "fake" unlogged LSNs for GiST index build seemed fishy. I 
>> could not convince myself that it was safe in all corner cases. In a 
>> recently initdb'd cluster, it's theoretically possible that the fake 
>> LSN counter overtakes the real LSN value, and that could lead to 
>> strange behavior. For example, how would the buffer manager behave, if 
>> there was a dirty page in the buffer cache with an LSN value that's 
>> greater than the current WAL flush pointer? I think you'd get "ERROR: 
>> xlog flush request %X/%X is not satisfied --- flushed only to %X/%X".
> 
> Perhaps the risk is theoretical; the real WAL begins at XLOG_SEG_SIZE, 
> so with the default WAL segment size, the index build would have to do 
> about 16 million page splits. The index would have to be at least 150 GB 
> for that. But it seems possible, and with non-default segment and page 
> size settings more so.
As far as I can see in bufmgr.c, XLogFlush() can't be called during index
build. In the log_newpage_range() call we can use a mask to set the value of
NSN (and LSN) to 1.
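
For reference, the estimate quoted above can be reproduced with a little arithmetic. This is only a sketch under stated assumptions: the default 16 MB WAL segment and 8 kB block size, a counter that starts at 1 and advances once per page split, and one new page per split. The macro and function names are made up for illustration and are not PostgreSQL symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed defaults: 16 MB WAL segment, 8 kB block size. */
#define DEFAULT_WAL_SEGMENT_BYTES ((uint64_t) 16 * 1024 * 1024)
#define DEFAULT_BLOCK_BYTES       ((uint64_t) 8192)

/*
 * Number of increments a counter starting at 1 needs before it reaches
 * the first real WAL position (assumed to be one segment size).
 */
uint64_t
splits_until_reached(uint64_t wal_start_lsn)
{
    assert(wal_start_lsn >= 1);
    return wal_start_lsn - 1;
}

/* Rough lower bound on index size if every split adds one new page. */
uint64_t
min_index_bytes(uint64_t splits, uint64_t block_bytes)
{
    return splits * block_bytes;
}
```

Under these assumptions the counter needs 16,777,215 increments, and that many 8 kB pages come to roughly 128 GiB, the same order of magnitude as the figure of about 150 GB quoted above.
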
> 
> Perhaps we could start at 1, but instead of using a global counter, 
> whenever a page is split, we take the parent's LSN value and increment 
> it by one. So different branches of the tree could use the same values, 
> which would reduce the consumption of the counter values.
> 
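The per-branch numbering suggested above can be modelled roughly as follows. This is a toy model, not PostgreSQL code: ToyPage, toy_stamp_split() and toy_split_detected() are hypothetical names, and the real page header and split logic are more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ToyLsn;

typedef struct ToyPage
{
    ToyLsn lsn;   /* stand-in for the page LSN */
    ToyLsn nsn;   /* stand-in for the GiST NSN */
} ToyPage;

/*
 * Stamp a page that is being split with "parent's LSN + 1" instead of
 * drawing a value from a global counter.
 */
void
toy_stamp_split(const ToyPage *parent, ToyPage *child)
{
    assert(parent != NULL && child != NULL);
    child->nsn = parent->lsn + 1;
    child->lsn = child->nsn;
}

/*
 * A descent that remembered the parent's LSN treats a larger child NSN
 * as "this child was split after the parent was read".
 */
int
toy_split_detected(ToyLsn parent_lsn_seen, const ToyPage *child)
{
    return child->nsn > parent_lsn_seen;
}
```

Two siblings split under the same parent receive the same value here, whereas a global counter would have consumed two values.
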
> Yet another idea would be to start the counter at 1, but check that it 
> doesn't overtake the WAL insert pointer. If it's about to overtake it, 
> just generate some dummy WAL.
> 
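A minimal sketch of that guard, assuming a build-local counter and a simplified WAL insert pointer. ToyWal and toy_next_build_lsn() are hypothetical, and "generating dummy WAL" is modelled simply as advancing the pointer by a fixed record size.

```c
#include <assert.h>
#include <stdint.h>

typedef struct ToyWal
{
    uint64_t insert_ptr;   /* stand-in for the WAL insert pointer */
    uint64_t dummy_bytes;  /* how much dummy WAL the model has emitted */
} ToyWal;

/*
 * Advance the build counter by one. If the new value would reach the
 * insert pointer, emit dummy WAL until the pointer is ahead again.
 */
uint64_t
toy_next_build_lsn(uint64_t *counter, ToyWal *wal, uint64_t dummy_record_bytes)
{
    uint64_t next;

    assert(dummy_record_bytes > 0);
    next = *counter + 1;
    while (next >= wal->insert_ptr)
    {
        wal->insert_ptr += dummy_record_bytes;
        wal->dummy_bytes += dummy_record_bytes;
    }
    *counter = next;
    return next;
}
```

The invariant the model keeps is that every value handed out is strictly below the insert pointer.
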
> But it seems best to deal with this in gistdoinsert(). I think 
> Anastasia's approach of adding a flag to GISTInsertStack can be made to 
> work, if we set the flag somewhere in gistinserttuples() or 
> gistplacetopage(), whenever a page is split. That way, if it needs to 
> split multiple levels, the flag is set on all of the corresponding 
> GISTInsertStack entries.
> 
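The flag approach can be illustrated with a toy insert stack. This is a sketch only: ToyInsertStack and its was_split field are hypothetical stand-ins for GISTInsertStack and the proposed flag, and the helper functions do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyInsertStack
{
    unsigned    blkno;
    bool        was_split;              /* hypothetical "page was split" flag */
    struct ToyInsertStack *parent;      /* NULL for the root entry */
} ToyInsertStack;

/* Flag the entry where the split started and nlevels - 1 ancestors. */
void
toy_mark_split(ToyInsertStack *from, int nlevels)
{
    ToyInsertStack *s;

    for (s = from; s != NULL && nlevels > 0; s = s->parent, nlevels--)
        s->was_split = true;
}

/*
 * Climb past every flagged entry; the first unflagged ancestor (or the
 * root) is where the descent would have to resume.
 */
ToyInsertStack *
toy_resume_point(ToyInsertStack *leaf)
{
    ToyInsertStack *s = leaf;

    assert(leaf != NULL);
    while (s->was_split && s->parent != NULL)
        s = s->parent;
    return s;
}
```

Because the flag is set on every level that split, the climb continues past a split grandparent as well, which is the case raised earlier in the thread.
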
> Yet another trivial fix would be to just always start the tree descent from 
> the root in gistdoinsert(), if a page is split. Not as efficient, but 
> probably negligible in practice.
Agreed.

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company


