Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Поиск
Список
Период
Сортировка
От Jeremy Finzel
Тема Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
Дата
Msg-id CAMa1XUgztNAk8qcRNQuyCQtYU_tOB3+g4nkUGRhCSR_mKCmJdA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: found xmin from before relfrozenxid on pg_catalog.pg_authid  (Andres Freund <andres@anarazel.de>)
Ответы Re: found xmin from before relfrozenxid on pg_catalog.pg_authid  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers


On Fri, May 25, 2018 at 3:37 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,

Moving discussion to -hackers.  Tom, I think you worked most with this
code, your input would be appreciated.

Original discussion is around:
http://archives.postgresql.org/message-id/20180524211311.tnswfnjwnii54htx%40alvherre.pgsql

On 2018-05-24 17:13:11 -0400, Alvaro Herrera wrote:
> On 2018-May-24, Andres Freund wrote:
> > Then there's also:
> > http://archives.postgresql.org/message-id/1527193504642.36340%40amazon.com
>
> ah, so deleting the relcache file makes the problem to go away?  That's
> definitely pretty strange.  I see no reason for the value in relcache to
> become out of step with the catalogued value in the same database ... I
> don't think we transmit in any way values of one database to another.

I can reproduce the issue. As far as I can tell we just don't ever
actually update nailed relcache entries in the normal course, leaving
the "physical address" aside.  VACUUM will, via
vac_update_relstats() -> heap_inplace_update() -> CacheInvalidateHeapTuple(),
send out an invalidation. But invalidation, in my case another session,
will essentially ignore most of that due to:

static void
RelationClearRelation(Relation relation, bool rebuild)
...
        /*
         * Never, never ever blow away a nailed-in system relation, because we'd
         * be unable to recover.  However, we must redo RelationInitPhysicalAddr
         * in case it is a mapped relation whose mapping changed.
         *
         * If it's a nailed-but-not-mapped index, then we need to re-read the
         * pg_class row to see if its relfilenode changed. We do that immediately
         * if we're inside a valid transaction and the relation is open (not
         * counting the nailed refcount).  Otherwise just mark the entry as
         * possibly invalid, and it'll be fixed when next opened.
         */
        if (relation->rd_isnailed)
        {
                RelationInitPhysicalAddr(relation);

                if (relation->rd_rel->relkind == RELKIND_INDEX ||
                        relation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)
                {
                        relation->rd_isvalid = false;   /* needs to be revalidated */
                        if (relation->rd_refcnt > 1 && IsTransactionState())
                                RelationReloadIndexInfo(relation);
                }
                return;
        }

Which basically means that once running we'll never update the relcache
data for nailed entries.  That's unproblematic for most relcache fields,
but not for things like RelationData->rd_rel->relfrozenxid / relminmxid.

This'll e.g. lead to lazy_vacuum_rel() wrongly not using aggressive
vacuums despite being required. And it'll lead, triggering this thread,
to wrong errors being raised during vacuum because relfrozenxid just is
some random value from the past.  I suspect this might also be
co-responsible for a bunch of planning issues for queries involving the
catalog, because the planner will use wrong relcache data until the next
time the init file is thrown away?

This looks like a very longstanding bug to me.  I'm not yet quite sure
what the best way to deal with this is.  I suspect we might get away
with just looking up a new version of the pg_class tuple and copying
rd_rel over?

Greetings,

Andres Freund

I have a question related to this - and specifically, preventing the error until we have a patch :).  We are encountering this error every few weeks on one very high transaction db, and have to restart to fix it.

If I read you correctly, the cache may never be invalidated for these catalogs even if I manually VACUUM them?  I was thinking if I routinely run VACUUM FREEZE on these tables in every database I might avoid the issue.  But given the cause of the issue, would that just make no difference and I will still hit the error eventually?

Thanks,
Jeremy

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: WAL prefetch
Следующее
От: Matheus de Oliveira
Дата:
Сообщение: Re: found xmin from before relfrozenxid on pg_catalog.pg_authid