pgsql: Avoid improbable PANIC during heap_update.

Поиск
Список
Период
Сортировка
От Tom Lane
Тема pgsql: Avoid improbable PANIC during heap_update.
Дата
Msg-id E1lWLjP-00006Y-Ml@gemulon.postgresql.org
обсуждение исходный текст
Ответы Re: pgsql: Avoid improbable PANIC during heap_update.  (Jaime Casanova <jcasanov@systemguards.com.ec>)
Список pgsql-committers
Avoid improbable PANIC during heap_update.

heap_update needs to clear any existing "all visible" flag on
the old tuple's page (and on the new page too, if different).
Per coding rules, to do this it must acquire pin on the appropriate
visibility-map page while not holding exclusive buffer lock;
which creates a race condition since someone else could set the
flag whenever we're not holding the buffer lock.  The code is
supposed to handle that by re-checking the flag after acquiring
buffer lock and retrying if it became set.  However, one code
path through heap_update itself, as well as one in its subroutine
RelationGetBufferForTuple, failed to do this.  The end result,
in the unlikely event that a concurrent VACUUM did set the flag
while we're transiently not holding lock, is a non-recurring
"PANIC: wrong buffer passed to visibilitymap_clear" failure.

This has been seen a few times in the buildfarm since recent VACUUM
changes that added code paths that could set the all-visible flag
while holding only exclusive buffer lock.  Previously, the flag
was (usually?) set only after doing LockBufferForCleanup, which
would insist on buffer pin count zero, thus preventing the flag
from becoming set partway through heap_update.  However, it's
clear that it's heap_update not VACUUM that's at fault here.

What's less clear is whether there is any hazard from these bugs
in released branches.  heap_update is certainly violating API
expectations, but if there is no code path that can set all-visible
without a cleanup lock then it's only a latent bug.  That's not
100% certain though, besides which we should worry about extensions
or future back-patch fixes that could introduce such code paths.

I chose to back-patch to v12.  Fixing RelationGetBufferForTuple
before that would require also back-patching portions of older
fixes (notably 0d1fe9f74), which is more code churn than seems
prudent to fix a hypothetical issue.

Discussion: https://postgr.es/m/2247102.1618008027@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/34f581c39e97e2ea237255cf75cccebccc02d477

Modified Files
--------------
src/backend/access/heap/heapam.c | 44 ++++++++++++++++++++++++----------------
src/backend/access/heap/hio.c    | 24 ++++++++++++++++------
2 files changed, 45 insertions(+), 23 deletions(-)


В списке pgsql-committers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: pgsql: doc: Fix typo in logicaldecoding.sgml.
Следующее
От: Tom Lane
Дата:
Сообщение: pgsql: Redesign the caching done by get_cached_rowtype().