Inval reliability, especially for inplace updates

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Inval reliability, especially for inplace updates
Дата
Msg-id 20240523000548.58.nmisch@google.com
обсуждение исходный текст
Ответы Re: Inval reliability, especially for inplace updates
Список pgsql-hackers
https://postgr.es/m/20240512232923.aa.nmisch@google.com wrote:
> Separable, nontrivial things not fixed in the attached patch stack:
> 
> - Inplace update uses transactional CacheInvalidateHeapTuple().  ROLLBACK of
>   CREATE INDEX wrongly discards the inval, leading to the relhasindex=t loss
>   still seen in inplace-inval.spec.  CacheInvalidateRelmap() does this right.

I plan to fix that like CacheInvalidateRelmap(): send the inval immediately,
inside the critical section.  Send it in heap_xlog_inplace(), too.  The
interesting decision is how to handle RelationCacheInitFilePreInvalidate(),
which has an unlink_initfile() that can fail with e.g. EIO.  Options:

1. Unlink during critical section, and accept that EIO becomes PANIC.  Replay
   may reach the same EIO, and the system won't reopen to connections until
   the storage starts cooperating.a  Interaction with checkpoints is not ideal.
   If we checkpoint and then crash between inplace XLogInsert() and inval,
   we'd be relying on StartupXLOG() -> RelationCacheInitFileRemove().  That
   uses elevel==LOG, so replay would neglect to PANIC on EIO.

2. Unlink before critical section, so normal xact abort suffices.  This would
   hold RelCacheInitLock and a buffer content lock at the same time.  In
   RecordTransactionCommit(), it would hold RelCacheInitLock and e.g. slru
   locks at the same time.

The PANIC risk of (1) seems similar to the risk of PANIC at
RecordTransactionCommit() -> XLogFlush(), which hasn't been a problem.  The
checkpoint-related risk bothers me more, and (1) generally makes it harder to
reason about checkpoint interactions.  The lock order risk of (2) feels
tolerable.  I'm leaning toward (2), but that might change.  Other preferences?

Another decision is what to do about LogLogicalInvalidations().  Currently,
inplace update invalidations do reach WAL via LogLogicalInvalidations() at the
next CCI.  Options:

a. Within logical decoding, cease processing invalidations for inplace
   updates.  Inplace updates don't affect storage or interpretation of table
   rows, so they don't affect logicalrep_write_tuple() outcomes.  If they did,
   invalidations wouldn't make it work.  Decoding has no way to retrieve a
   snapshot-appropriate version of the inplace-updated value.

b. Make heap_decode() of XLOG_HEAP_INPLACE recreate the invalidation.  This
   would be, essentially, cheap insurance against invalidations having a
   benefit I missed in (a).

I plan to pick (a).

> - AtEOXact_Inval(true) is outside the RecordTransactionCommit() critical
>   section, but it is critical.  We must not commit transactional DDL without
>   other backends receiving an inval.  (When the inplace inval becomes
>   nontransactional, it will face the same threat.)

This faces the same RelationCacheInitFilePreInvalidate() decision, and I think
the conclusion should be the same as for inplace update.

Thanks,
nm



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Martijn Wallet
Дата:
Сообщение: Re: processes stuck in shutdown following OOM/recovery
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)