Re: Failure in contrib test _int on loach
From | Andrey Lepikhov |
---|---|
Subject | Re: Failure in contrib test _int on loach |
Date | |
Msg-id | 89f1015d-a6c3-e341-235b-b60de2343fe6@postgrespro.ru |
In response to | Re: Failure in contrib test _int on loach (Heikki Linnakangas <hlinnaka@iki.fi>) |
List | pgsql-hackers |
On 11/04/2019 13:14, Heikki Linnakangas wrote:
> On 11/04/2019 09:10, Andrey Lepikhov wrote:
>> On 10/04/2019 20:25, Heikki Linnakangas wrote:
>>> On 09/04/2019 19:11, Anastasia Lubennikova wrote:
>>>> After introducing GistBuildNSN this code path became unreachable.
>>>> To fix it, I added a new flag to detect such splits during index build.
>>>
>>> Isn't it possible that the grandparent page is also split, so that we'd
>>> need to climb further up?
>>
>> Based on Anastasia's idea, I prepared an alternative solution to fix the bug
>> (see attachment).
>> It utilizes the idea of linear increment of LSN/NSN. The WAL write process
>> is used to change the NSN value to 1 for each block of the index relation.
>> I hope this can be a fairly clear and safe solution.
>
> That's basically the same idea as always using the "fake LSN" during
> index build, like the original version of this patch did. It's got the
> problem that I mentioned at
> https://www.postgresql.org/message-id/090fb3cb-1ca4-e173-ecf7-47d41ebac620@iki.fi:
>
>> * Using "fake" unlogged LSNs for GiST index build seemed fishy. I
>> could not convince myself that it was safe in all corner cases. In a
>> recently initdb'd cluster, it's theoretically possible that the fake
>> LSN counter overtakes the real LSN value, and that could lead to
>> strange behavior. For example, how would the buffer manager behave, if
>> there was a dirty page in the buffer cache with an LSN value that's
>> greater than the current WAL flush pointer? I think you'd get "ERROR:
>> xlog flush request %X/%X is not satisfied --- flushed only to %X/%X".
>
> Perhaps the risk is theoretical; the real WAL begins at XLOG_SEG_SIZE,
> so with the default WAL segment size, the index build would have to do
> about 16 million page splits. The index would have to be at least 150 GB
> for that. But it seems possible, and with non-default segment and page
> size settings more so.

As I see in bufmgr.c, XLogFlush() can't be called during index build.
In the log_newpage_range() call we can use a mask to set the value of NSN
(and LSN) to 1.

> Perhaps we could start at 1, but instead of using a global counter,
> whenever a page is split, we take the parent's LSN value and increment
> it by one. So different branches of the tree could use the same values,
> which would reduce the consumption of the counter values.
>
> Yet another idea would be to start the counter at 1, but check that it
> doesn't overtake the WAL insert pointer. If it's about to overtake it,
> just generate some dummy WAL.
>
> But it seems best to deal with this in gistdoinsert(). I think
> Anastasia's approach of adding a flag to GISTInsertStack can be made to
> work, if we set the flag somewhere in gistinserttuples() or
> gistplacetopage(), whenever a page is split. That way, if it needs to
> split multiple levels, the flag is set on all of the corresponding
> GISTInsertStack entries.
>
> Yet another trivial fix would be to just always start the tree descent from
> the root in gistdoinsert(), if a page is split. Not as efficient, but
> probably negligible in practice.

Agree

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company
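
The "about 16 million page splits" estimate quoted in the message can be checked with a back-of-the-envelope sketch. It is not PostgreSQL code. It assumes, for illustration only, that the fake LSN counter starts at 1 and advances once per page split, that each split adds one new 8 kB page, and that the first real WAL position equals the default 16 MB segment size. The function names are hypothetical.

```python
# Rough check of the "about 16 million page splits" estimate.
# Assumptions (illustrative, not taken from PostgreSQL source):
#   - the fake LSN counter starts at 1 and grows by 1 per page split
#   - each split adds exactly one new page to the index
#   - defaults: 16 MB WAL segment, 8 kB page

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size, bytes
BLOCK_SIZE = 8192                    # default page size, bytes


def splits_until_overtake(first_real_lsn: int, start: int = 1) -> int:
    """Increments needed for a counter starting at `start` to reach `first_real_lsn`."""
    return max(first_real_lsn - start, 0)


def min_index_bytes(splits: int, block_size: int = BLOCK_SIZE) -> int:
    """Lower bound on index size if every split adds one new page."""
    return splits * block_size


splits = splits_until_overtake(WAL_SEGMENT_SIZE)
size = min_index_bytes(splits)
print(f"splits needed: {splits:,}")
print(f"index size lower bound: {size / 1024**3:.1f} GiB")
```

Under these assumptions the counter needs roughly 16.8 million splits, and the pages created by those splits alone come to about 128 GiB, the same order of magnitude as the "at least 150 GB" figure in the quoted text. A smaller segment size or a smaller page size lowers this bound proportionally, which matches the remark that non-default settings make the overtake more plausible.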