Re: WAL record CRC calculated incorrectly because of underlying buffer modification

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: WAL record CRC calculated incorrectly because of underlying buffer modification
Дата
Msg-id 895723aba95f108207a744c799588ec3f2fef819.camel@j-davis.com
обсуждение исходный текст
Ответ на Re: WAL record CRC calculated incorrectly because of underlying buffer modification  (Thomas Munro <thomas.munro@gmail.com>)
Ответы Re: WAL record CRC calculated incorrectly because of underlying buffer modification
Список pgsql-hackers
On Mon, 2024-05-13 at 11:15 +1200, Thomas Munro wrote:
> > > > > Perhaps a no-image, no-change registered buffer should not be
> > > > > including an image, even for XLR_CHECK_CONSISTENCY?  It's
> > > > > actually
> > > > > useless for consistency checking too I guess, this issue
> > > > > aside,
> > > > > because it doesn't change anything so there is nothing to
> > > > > check.

I'm not convinced by that reasoning. Can't it check that nothing has
changed?

> > >
> > > Does it reproduce if you do this?
> > >
> > > -               include_image = needs_backup || (info &
> > > XLR_CHECK_CONSISTENCY) != 0;
> > > +               include_image = needs_backup ||
> > > +                       ((info & XLR_CHECK_CONSISTENCY) != 0 &&
> > > +                        (regbuf->flags & REGBUF_NO_CHANGE) ==
> > > 0);
> >
> > No, it doesn't (at least with the latter, more targeted
> > reproducer).
>
> OK so that seems like a candidate fix, but ...

...

> ... we'd need to figure out how to fix this in the back-branches too.
> One idea would be to back-patch REGBUF_NO_CHANGE, and another might
> be
> to deduce that case from other variables.  Let me CC a couple more
> people from this thread, which most recently hacked on this stuff, to
> see if they have insights:


Starting from the beginning, XLogRecordAssemble() calculates the CRC of
the record (technically just the non-header portions, but that's not
important here), including any backup blocks. Later,
CopyXLogRecordToWAL() copies that data into the actual xlog buffers. If
the data changes between those two steps, the CRC will be bad.

For most callers, the contents are exclusive-locked, so there's no
problem. For checksums, the data is copied out of shared memory into a
stack variable first, so no concurrent activity can change it. For hash
indexes, it tries to protect itself by passing REGBUF_NO_IMAGE.

There are two problems:

1. That implies another invariant that we aren't checking for: that
REGBUF_NO_CHANGE must be accompanied by REGBUF_NO_IMAGE. That doesn't
seem to be true for all callers, see XLogRegisterBuffer(1, wbuf,
wbuf_flags) in _hash_freeovflpage().

2. As you point out, REGBUF_NO_IMAGE still allows an image to be taken
if XLR_CHECK_CONSISTENCY is set, so we need to figure out what to do
there.

Can we take a step back and think harder about what hash indexes are
doing and if there's a better way? Maybe hash indexes need to take a
copy of the page, like in XLogSaveBufferForHint()?

I'd prefer that we find a way to get rid of REGBUF_NO_CHANGE and make
all callers follow the rules than to find new special cases that depend
on REGBUF_NO_CHANGE. See these messages here:

https://www.postgresql.org/message-id/b1f2d0f230d60fa8df33bb0b2af3236fde9d750d.camel%40j-davis.com

https://www.postgresql.org/message-id/CA%2BTgmoY%2BdagCyrMKau7UQeQU6w4LuVEu%2ByjsmJBoXKAo6XbUUA%40mail.gmail.com

In other words, we added REGBUF_NO_CHANGE for the call sites (only hash
indexes) that don't follow the rules and where it wasn't easy to make
them follow the rules. Now that we know of a concrete problem with the
design, there's more upside to fixing it properly.

Regards,
    Jeff Davis




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: Requiring LLVM 14+ in PostgreSQL 18
Следующее
От: Peter Smith
Дата:
Сообщение: Re: Docs: Always use true|false instead of sometimes on|off for the subscription options