Re: [HACKERS] [PATCH] Assert that the correct locks are held whencalling PageGetLSN()

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: [HACKERS] [PATCH] Assert that the correct locks are held whencalling PageGetLSN()
Дата
Msg-id CAB7nPqQq7FHyfAM3dARint5vaPnb5ZXo=OvHexnWivRC5m1cjA@mail.gmail.com
обсуждение исходный текст
Ответ на [HACKERS] [PATCH] Assert that the correct locks are held when calling PageGetLSN()  (Jacob Champion <pchampion@pivotal.io>)
Ответы Re: [HACKERS] [PATCH] Assert that the correct locks are held whencalling PageGetLSN()  (Jacob Champion <pchampion@pivotal.io>)
Re: [HACKERS] [PATCH] Assert that the correct locks are held whencalling PageGetLSN()  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Fri, Sep 1, 2017 at 3:53 AM, Jacob Champion <pchampion@pivotal.io> wrote:
> The patch is really two pieces: add the assertion, and fix the callers
> that would trip it. The latter part is still in progress, because I'm
> running into some places where I'm not sure what the correct way
> forward is.

I looked at your patch.

> While working on checksum support for GPDB, we noticed that several
> callers of PageGetLSN() didn't follow the correct locking procedure.

Well, PageGetLSN can be used in some hot code paths, xloginsert.c
being one, so it does not seem wise to me to switch it to something
more complicated than a macro, and also it is not bounded to any
locking contracts now. For example, pageinspect works on a copy of the
buffer, so that's one case where we just don't care. So we should it
be tightened in Postgres? That's the portion of the proposal I don't
get. You could introduce a custom API for this purpose, isn't the
origin of your problems with GPDB for checksums that data is
replicated across many nodes so things need to be locked uniquely?

> - This change introduces a subtle source incompatibility: whereas
> previously both Pages and PageHeaders could be passed to PageGetLSN(),
> now callers must explicitly cast to Page if they don't already have
> one.

That would produce warnings for extensions, so people would likely
catch that when compiling with a newer version, if they use -Werror...

> - Is my use of FRONTEND here correct? The documentation made it seem
> like some compilers could try to link against the
> AssertPageIsLockedForLSN function, even if PageGetLSN was never
> called.

+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND)
+void
+AssertPageIsLockedForLSN(Page page)
+{
From a design point of view, bufmgr.c is an API interface for buffer
management on the backend-size, so just using FRONTEND there looks
wrong to me (bufmgr.h is already bad enough IMO now).

> - Use of PageGetLSN() in heapam.c seems to be incorrect, but I don't
> really know what the best way is to fix it. It explicitly unlocks the
> buffer, with the comment that visibilitymap_set() needs to handle
> locking itself, but then calls PageGetLSN() to determine whether
> visibilitymap_set() needs to be called.

This bit in src/backend/access/transam/README may interest you:
Note that we must only use PageSetLSN/PageGetLSN() when we know the action
is serialised. Only Startup process may modify data blocks during recovery,
so Startup process may execute PageGetLSN() without fear of serialisation
problems. All other processes must only call PageSet/GetLSN when holding
either an exclusive buffer lock or a shared lock plus buffer header lock,
or be writing the data block directly rather than through shared buffers
while holding AccessExclusiveLock on the relation.

So I see no problem with the exiting caller.
-- 
Michael



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tatsuro Yamada
Дата:
Сообщение: Re: [HACKERS] CLUSTER command progress monitor
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: [HACKERS] POC: Sharing record typmods between backends