Re: "page is not marked all-visible" warning in regression tests

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: "page is not marked all-visible" warning in regression tests
Дата
Msg-id 201206062107.29300.andres@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: "page is not marked all-visible" warning in regression tests  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: "page is not marked all-visible" warning in regression tests  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Wednesday, June 06, 2012 08:19:15 PM Robert Haas wrote:
> On Wed, Jun 6, 2012 at 1:46 PM, Andres Freund <andres@2ndquadrant.com> 
wrote:
> > On a cursory lock it might just be a race condition in
> > vacuumlazy.c:lazy_scan_heap. If scan_all is set, which it has to be for
> > the warning to be visible, all_visible_according_to_vm is determined
> > before we loop over all blocks. At the point where one specific heap
> > block is actually read and locked that knowledge might be completely
> > outdated by any concurrent backend. Am I missing something?
> 
> No, I think you're right.  I think that warning is bogus.  I added it
> in place of some older warning which no longer made sense, but I think
> this one doesn't make sense either.
Agreed.

It might be interesting to recheck the visibility and warn if its wrong. That 
should be infrequent enough to bearable and it does check for an actually 
dangerous case in a new code path.

> > I have to say the whole visibilitymap correctness and crash-safety seems
> > to be quite under documented, especially as it seems to be somewhat
> > intricate (to me). E.g. not having any note why visibilitymap_test
> > doesn't need locking. (I guess the theory is that a 1 byte read will
> > always be consistent. But how does that ensure other backends see an
> > up2date value?).
> 
> It's definitely intricate, and it's very possible that we should have
> some more documentation.  I am not sure exactly what and where, but
> feel free to suggest something.
I think some addition to the LOCKING part of visibilitymap.c's header 
explaining some of what you wrote in your email might be a good start.
I would also suggest explictly mentioning that its ok to have the 
visibilitymap and the page disagreeing about the visibility if its the 
visibility map that think that the page contains invisible data but not the 
other way round (I think that can currently happen).

visibilitymap_test() should explain that its results can be outdated if youre 
not holding a buffer lock.

> visibilitymap_test() does have a comment saying that:
> 
>         /*
>          * We don't need to lock the page, as we're only looking at a
> single bit.
>          */
Oh. I conveniently skipped that comment in my brain ;)

> But that's a bit unsatisfying, because, as you say, it doesn't address
> the question of memory-ordering issues.  I think that there's no
> situation in which it causes a problem to see the visibility map bit
> as unset when in reality it has just recently been set by some other
> back-end.  It would be bad if someone did something like:
> 
> if (visibilitymap_test(...))
>     visibilitymap_clear();
> 
> ...because then memory-ordering issues could cause us to accidentally
> fail to clear the bit.   No one should be doing that, though; the
> relevant locking and conditional logic is built directly into
> visibilitymap_clear().
Then _test should document that... I don't think its impossible that we will 
grow more uses of the visibilitymap logic.

> On the flip side, if someone sees the visibility map bit as set when
> it's actually just been cleared, that could cause a problem - most
> seriously, index-only scans could return wrong answers.  For that to
> happen, someone would have to insert a heap tuple onto a previously
> all-visible page, clearing the visibility map bit, and then insert an
> index tuple; concurrently, some other backend would need to see the
> index tuple but not the fact that the visibility map bit had been
> cleared.  I don't think that can happen: after inserting the heap
> tuple, the inserting backend would release buffer content lock, which
> acts as a full memory barrier; before reading the index tuple, the
> index-only-scanning backend would have to take the content lock on the
> index buffer, which also acts as a full memory barrier.  So the
> inserter can't do the writes out of order, and the index-only-scanner
> can't do the reads out of order, so I think it's safe.... but we
> probably do need to explain that somewhere.
Hm. For a short while I thought there would be an issue with heap_delete and 
IOS because the deleting transaction can commit without any barriers happening 
on the IOS side. But that only seems to be possible with non MVCC snapshots 
which are currently not allowed with index only scans.

Andres
-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ants Aasma
Дата:
Сообщение: Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
Следующее
От: Robert Haas
Дата:
Сообщение: Re: 9.3: load path to mitigate load penalty for checksums