Re: Something is rotten in the state of Denmark...

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Something is rotten in the state of Denmark...
Дата
Msg-id 14837.1427993642@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Something is rotten in the state of Denmark...  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Something is rotten in the state of Denmark...  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Apr 1, 2015 at 7:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I've not fully tracked it down, but I think that the blame falls on the
>> MVCC-snapshots-for-catalog-scans patch; it appears that it's trying to
>> read pg_am's pg_class entry with a snapshot that's too old, possibly
>> because it assumes that sinval signaling is alive which I think ain't so.

> Hmm, sorry for the bug.  It looks to me like sinval signaling gets
> started up at the beginning of InitPostgres().  Perhaps something like
> this?

Yeah, it does, so you'd think this would work.  I've now tracked down
exactly what's happening, and it's this: while we're reading pg_authid
during PerformAuthentication, we establish a catalog snapshot.  Later on,
we read pg_database to find out what DB we're connecting to.  If any
sinval messages for unshared system catalogs arrive at that point, they'll
effectively be ignored, *because our MyDatabaseId is still zero and so
doesn't match the incoming message*.  That's okay as far as the catcaches
are concerned, cause they're empty, and the relcache only knows about
shared catalogs so far.  But InvalidateCatalogSnapshot doesn't get called
by LocalExecuteInvalidationMessage, and so we think we can plow ahead with
the old snapshot.

It looks to me like an appropriate fix would be as attached; thoughts?
(I've tested this with a delay during relation lock acquisition and
concurrent whole-database VACUUM FULLs, and so far been unable to break
it.)

            regards, tom lane

diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 1e646a1..6aa6868 100644
*** a/src/backend/utils/init/postinit.c
--- b/src/backend/utils/init/postinit.c
*************** InitPostgres(const char *in_dbname, Oid
*** 853,858 ****
--- 853,866 ----
      MyProc->databaseId = MyDatabaseId;

      /*
+      * We may have already established a catalog snapshot while trying to read
+      * pg_authid; but until we have set up MyDatabaseId, we won't react to
+      * incoming sinval messages properly, so we won't realize it if the
+      * snapshot has been invalidated.  Best to assume it's no good anymore.
+      */
+     InvalidateCatalogSnapshot();
+
+     /*
       * Now, take a writer's lock on the database we are trying to connect to.
       * If there is a concurrently running DROP DATABASE on that database, this
       * will block us until it finishes (and has committed its update of

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: POLA violation with \c service=
Следующее
От: Tom Lane
Дата:
Сообщение: Re: vac truncation scan problems