Re: new heapcheck contrib module

From Mark Dilger
Subject Re: new heapcheck contrib module
Date
Msg-id 142B4FB1-C7ED-43E4-95E3-FF80DC26F3A6@enterprisedb.com
In response to Re: new heapcheck contrib module  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: new heapcheck contrib module  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

> On Jul 30, 2020, at 5:53 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jul 30, 2020 at 6:10 PM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>> No, that wasn't my concern.  I was thinking about CLOG entries
>> disappearing during the scan as a consequence of concurrent vacuums,
>> and the effect that would have on the validity of the cached
>> [relfrozenxid..next_valid_xid] range.  In the absence of corruption, I
>> don't immediately see how this would cause any problems.  But for a
>> corrupt table, I'm less certain how it would play out.
>
> Oh, hmm. I wasn't thinking about that problem. I think the only way
> this can happen is if we read a page and then, before we try to look
> up the CID, vacuum zooms past, finishes the whole table, and truncates
> clog. But if that's possible, then it seems like it would be an issue
> for SELECT as well, and it apparently isn't, or we would've done
> something about it by now. I think the reason it's not possible is
> because of the locking rules described in
> src/backend/storage/buffer/README, which require that you hold a
> buffer lock until you've determined that the tuple is visible. Since
> you hold a share lock on the buffer, a VACUUM that hasn't already
> processed that buffer can't freeze the tuples in that buffer; it would need an
> exclusive lock on the buffer to do that. Therefore it can't finish and
> truncate clog either.
>
> Now, you raise the question of whether this is still true if the table
> is corrupt, but I don't really see why that makes any difference.
> VACUUM is supposed to freeze each page it encounters, to the extent
> that such freezing is necessary, and with Andres's changes, it's
> supposed to ERROR out if things are messed up. We can postulate a bug
> in that logic, but inserting a VACUUM-blocking lock into this tool to
> guard against a hypothetical vacuum bug seems strange to me. Why would
> the right solution not be to fix such a bug if and when we find that
> there is one?

Since I can't think of a plausible concrete example of corruption which would elicit the problem I was worrying about, I'll withdraw the argument.  But that leaves me wondering about a comment that Andres made upthread:

> On Apr 20, 2020, at 12:42 PM, Andres Freund <andres@anarazel.de> wrote:

> I don't think random interspersed uses of CLogTruncationLock are a good
> idea. If you move to only checking visibility after tuple fits into
> [relfrozenxid, nextXid), then you don't need to take any locks here, as
> long as a lock against vacuum is taken (which I think this should do
> anyway).
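
For reference, the range test Andres describes can be sketched roughly as below. This is only an illustration, not code from the patch: xid_t, xid_precedes and xid_in_checkable_range are hypothetical names, and the comparison assumes the usual modulo-2^32 ordering of normal transaction IDs. The idea is that a tuple's xid is looked up in clog only if it first passes this check against the cached bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t xid_t;

/* In PostgreSQL, xids 0..2 are reserved; normal xids start at 3. */
#define FIRST_NORMAL_XID ((xid_t) 3)

/* Wraparound-aware "a is older than b", valid for normal xids only. */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * True if xid lies in the half-open range [relfrozenxid, next_xid).
 * Reserved xids are rejected here; a real checker would handle them
 * separately (an assumption of this sketch).
 */
static bool
xid_in_checkable_range(xid_t xid, xid_t relfrozenxid, xid_t next_xid)
{
    assert(relfrozenxid >= FIRST_NORMAL_XID && next_xid >= FIRST_NORMAL_XID);

    if (xid < FIRST_NORMAL_XID)
        return false;

    return !xid_precedes(xid, relfrozenxid) && xid_precedes(xid, next_xid);
}
```

An xid that fails this test would be reported as corrupt without consulting clog at all, which is why the cached bounds have to remain valid for the duration of the scan.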

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





