Обсуждение: pgsql: Fix BRIN to use SnapshotAny during summarization
Fix BRIN to use SnapshotAny during summarization For correctness of summarization results, it is critical that the snapshot used during the summarization scan is able to see all tuples that are live to all transactions -- including tuples inserted or deleted by in-progress transactions. Otherwise, it would be possible for a transaction to insert a tuple, then idle for a long time while a concurrent transaction executes summarization of the range: this would result in the inserted value not being considered in the summary. Previously we were trying to use a MVCC snapshot in conjunction with adding a "placeholder" tuple in the index: the snapshot would see all committed tuples, and the placeholder tuple would catch insertions by any new inserters. The hole is that prior insertions by transactions that are still in progress by the time the MVCC snapshot was taken were ignored. Kevin Grittner reported this as a bogus error message during vacuum with default transaction isolation mode set to repeatable read (because the error report mentioned a function name not being invoked during), but the problem is larger than that. To fix, tweak IndexBuildHeapRangeScan to have a new mode that behaves the way we need using SnapshotAny visibility rules. This change simplifies the BRIN code a bit, mainly by removing large comments that were mistaken. Instead, rely on the SnapshotAny semantics to provide what it needs. (The business about a placeholder tuple needs to remain: that covers the case that a transaction inserts a a tuple in a page that summarization already scanned.) Discussion: https://www.postgresql.org/message-id/20150731175700.GX2441@postgresql.org In passing, remove a couple of unused declarations from brin.h and reword a comment to be proper English. This part submitted by Kevin Grittner. Backpatch to 9.5, where BRIN was introduced. Branch ------ master Details ------- http://git.postgresql.org/pg/commitdiff/2834855cb9fde734ce12f59694522c10bf0c0205 Modified Files -------------- src/backend/access/brin/brin.c | 41 +++++++---------------------- src/backend/catalog/index.c | 36 +++++++++++++++++++++++++- src/include/access/brin.h | 2 -- src/include/catalog/index.h | 1 + src/test/isolation/expected/brin-1.out | 39 ++++++++++++++++++++++++++++ src/test/isolation/isolation_schedule | 1 + src/test/isolation/specs/brin-1.spec | 44 ++++++++++++++++++++++++++++++++ 7 files changed, 129 insertions(+), 35 deletions(-)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > Fix BRIN to use SnapshotAny during summarization This patch added an isolation test that fails unless contrib/pageinspect has been built and installed. I do not find that acceptable. It causes "make check-world" to fail ... and no, installing the extension during make check-world isn't going to make me happier. I don't really think we need this isolation test at all, but if we do, please fix it to not rely on any extensions. Perhaps looking at pg_relation_size or some such would do? Or you could just issue a query that should use the index, and see if it finds the rows it ought to. regards, tom lane
... btw, I believe the reason for the failure here: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguarundi&dt=2015-08-06%2011%3A52%3A17 is that brin_page_items() is unsafe. It's accessing state->bdesc->bd_tupdesc, which is pointing at the tupdesc of an index Relation that it no longer has open; not only directly, but via brin_deform_tuple(). All you need is a relcache flush to be accessing garbage. I haven't dug down thoroughly, but I'd not be surprised if there were also some dereferences of bd_index, which is equally a dangling pointer. regards, tom lane
Using contrib modules in check (Re: pgsql: Fix BRIN to use SnapshotAny during summarization)
От
Christoph Berg
Дата:
Re: Tom Lane 2015-08-07 <928.1438900846@sss.pgh.pa.us> > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > Fix BRIN to use SnapshotAny during summarization > > This patch added an isolation test that fails unless contrib/pageinspect > has been built and installed. I do not find that acceptable. It causes > "make check-world" to fail ... and no, installing the extension during > make check-world isn't going to make me happier. > > I don't really think we need this isolation test at all, but if we do, > please fix it to not rely on any extensions. Perhaps looking at > pg_relation_size or some such would do? Or you could just issue > a query that should use the index, and see if it finds the rows it > ought to. Hi, this breaks the Debian package builds as well because we run check-world as a build step. Any chance for a fix/workaround so the nightly master/head builds will succeed again? Christoph -- cb@df7cb.de | http://www.df7cb.de/
Christoph Berg <myon@debian.org> writes: > Re: Tom Lane 2015-08-07 <928.1438900846@sss.pgh.pa.us> >> I don't really think we need this isolation test at all, but if we do, >> please fix it to not rely on any extensions. Perhaps looking at >> pg_relation_size or some such would do? Or you could just issue >> a query that should use the index, and see if it finds the rows it >> ought to. > this breaks the Debian package builds as well because we run > check-world as a build step. > Any chance for a fix/workaround so the nightly master/head builds will > succeed again? I was waiting for Alvaro to deal with this, but perhaps he's on summer vacation or something. I will remove the isolation test until he has time to address it more fully. However, we did learn something valuable from the fact that all the -DCLOBBER_CACHE_ALWAYS critters failed on it: per my earlier message, brin_page_items() is unsafe against a relcache flush on the index. I'll put that on the 9.5 open items list. (If I were tasked with fixing it, I'd be tempted to rewrite it to do all the work in one call and return a tuplestore; the alternative seems to be to try to keep the index open across multiple calls, which would be a mess.) regards, tom lane
Re: Using contrib modules in check (Re: pgsql: Fix BRIN to use SnapshotAny during summarization)
От
Christoph Berg
Дата:
Re: Tom Lane 2015-08-10 <2713.1439215712@sss.pgh.pa.us> > I was waiting for Alvaro to deal with this, but perhaps he's on summer > vacation or something. I will remove the isolation test until he has > time to address it more fully. Thanks, the build worked now. Wouldn't a possible fix be to promote this/some module to core? It doesn't have any SQL- or OS-side dependencies. Christoph -- cb@df7cb.de | http://www.df7cb.de/
Christoph Berg <myon@debian.org> writes: > Thanks, the build worked now. > Wouldn't a possible fix be to promote this/some module to core? It > doesn't have any SQL- or OS-side dependencies. Meh. I think we'd agreed in another nearby thread that pageinspect is exactly the sort of thing we don't want in core. Even if it were in core, I'm dubious that it's a good way to implement the desired testing here: the risks of platform-dependent results (endianness, varying numbers of tuples per page, etc etc) seem far too high. regards, tom lane
On 2015-08-10 11:19:56 -0400, Tom Lane wrote: > Meh. I think we'd agreed in another nearby thread that pageinspect > is exactly the sort of thing we don't want in core. I think we agreed that it shouldn't be included in an initdb'ed database, but that doesn't preclude pageinspect being an extension in src/extension or something. > Even if it were > in core, I'm dubious that it's a good way to implement the desired > testing here: the risks of platform-dependent results (endianness, > varying numbers of tuples per page, etc etc) seem far too high. That on the other hand I can agree with being a danger. Greetings, Andres Freund