Re: Online enabling of checksums

From: Andres Freund
Subject: Re: Online enabling of checksums
Msg-id: 20180406172206.hplj42fwzoas6xi3@alap3.anarazel.de
In response to: Re: Online enabling of checksums  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: Online enabling of checksums  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers

Hi,

On 2018-04-06 14:34:43 +0200, Tomas Vondra wrote:
> > Oh, that's not my intention either -- I just wanted to make sure I
> > was thinking about the same issue you were.

> I agree we shouldn't rely on chance here - if we might read a stale
> value, we need to fix that of course.

It's perfectly possible that some side-conditions mitigate this. What
concerns me is that
a) Nobody appears to have raised this issue beforehand, even though an
   unlocked read of a critical variable is a fairly obvious
   issue. This kind of thing needs to be carefully thought about.
b) If there's some "side channel" interlock, it's not documented.

I noticed the issue because of an IM question about the general feature,
and I did a three minute skim and saw the read without a comment.


> I'm not quite sure I fully understand the issue, though. I assume both
> LockBufHdr and UnlockBufHdr are memory barriers, so for bad things to
> happen the process would need to be already past LockBufHdr when the
> checksum version is updated. In which case it can use a stale version
> when writing the buffer out. Correct?

Yes, they are memory barriers.


> I wonder if that's actually a problem, considering the checksum worker
> will then overwrite all data with correct checksums anyway. So the other
> process would have to overwrite the buffer after checksum worker, at
> which point it'll have to go through LockBufHdr.

Again, I'm not sure if there's some combination of issues that make this
not a problem in practice. I just *asked* if there's something
preventing this from being a problem.

The really problematic case would be if it is possible for some process
to wait long enough, without executing a barrier-implying operation,
that it'd try to write out a page that the checksum worker has already
passed over.
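
To make that interleaving concrete, here is a minimal single-threaded
sketch that simply replays the steps in the problematic order. Every
name in it (SketchPage, sketch_checksum_version, the backend_* and
worker_* functions) is a hypothetical stand-in, not PostgreSQL's actual
code, and it models only the ordering, not real concurrency or real
barriers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are PostgreSQL definitions. */
typedef struct
{
    bool has_checksum;  /* was the page last written with a checksum? */
} SketchPage;

uint32_t sketch_checksum_version = 0;  /* 0 means checksums are off */

/* Backend, step 1: snapshot the version with no lock and no barrier. */
uint32_t backend_read_version_unlocked(void)
{
    return sketch_checksum_version;
}

/* Backend, step 2: write the page using whatever version it was handed. */
void backend_write_page(SketchPage *page, uint32_t version)
{
    page->has_checksum = (version != 0);
}

/* Checksum worker: turn checksums on, then rewrite the page with one. */
void worker_enable_and_pass(SketchPage *page)
{
    sketch_checksum_version = 1;
    page->has_checksum = true;
}

/* Replay the bad ordering: stale read, worker pass, late write. */
bool replay_stale_write(void)
{
    SketchPage page = { false };
    uint32_t stale;

    sketch_checksum_version = 0;
    stale = backend_read_version_unlocked();  /* observes 0 */
    worker_enable_and_pass(&page);            /* worker has passed the page */
    backend_write_page(&page, stale);         /* late write with stale value */
    return page.has_checksum;
}

/* Same ordering, but the version is re-read right before the write,
 * as it would be if a barrier sat between the read and the write. */
bool replay_fresh_write(void)
{
    SketchPage page = { false };

    sketch_checksum_version = 0;
    (void) backend_read_version_unlocked();
    worker_enable_and_pass(&page);
    backend_write_page(&page, backend_read_version_unlocked());
    return page.has_checksum;
}
```

In this model replay_stale_write() leaves the page without a checksum
even though checksums are enabled, while replay_fresh_write() does not.
Whether the stale ordering can actually occur in the patch is exactly
the open question.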

Greetings,

Andres Freund

