Re: snapshot too old issues, first around wraparound and then more.

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: snapshot too old issues, first around wraparound and then more.
Дата
Msg-id 20200401183706.i5ob2pdvmmo76fcz@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: snapshot too old issues, first around wraparound and then more.  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: snapshot too old issues, first around wraparound and then more.
Список pgsql-hackers
Hi,

On 2020-04-01 13:27:56 -0400, Robert Haas wrote:
> Perhaps "irresponsible" is the wrong word, but it's certainly caused
> problems for multiple EnterpriseDB customers, and in my view, those
> problems weren't necessary. Either a WARNING or an ERROR would have
> shown up in the log, but an ERROR terminates VACUUM for that table and
> thus basically causes autovacuum to be completely broken. That is a
> really big problem. Perhaps you will want to argue, as Andres did,
> that the value of having ERROR rather than WARNING in the log
> justifies that outcome, but I sure don't agree.

If that had been a really viable option, I would have done so. At the
very least in the back branches, but quite possibly also in master. Or
if somebody had brought them up as an issue at the time.

What is heap_prepare_freeze_tuple/FreezeMultiXactId supposed to do after
issuing a WARNING in these cases. Without the ERROR, e.g.,
            if (!TransactionIdDidCommit(xid))
                ereport(ERROR,
                        (errcode(ERRCODE_DATA_CORRUPTED),
                         errmsg_internal("uncommitted xmin %u from before xid cutoff %u needs to be frozen",
                                         xid, cutoff_xid)));
would make a deleted tuple visible.


        if (TransactionIdPrecedes(xid, relfrozenxid))
            ereport(ERROR,
                    (errcode(ERRCODE_DATA_CORRUPTED),
                     errmsg_internal("found xmin %u from before relfrozenxid %u",
                                     xid, relfrozenxid)));
would go on replace xmin of a potentially uncommitted tuple with
relfrozenxid, making it appear visible.


        if (TransactionIdPrecedes(xid, relfrozenxid))
            ereport(ERROR,
                    (errcode(ERRCODE_DATA_CORRUPTED),
                     errmsg_internal("found xmax %u from before relfrozenxid %u",
                                     xid, relfrozenxid)));
would replace the xmax indicating a potentially deleted tuple with ?, either
making the tuple become, potentially wrongly, visible/invisible

or
    else if (MultiXactIdPrecedes(multi, relminmxid))
        ereport(ERROR,
                (errcode(ERRCODE_DATA_CORRUPTED),
                 errmsg_internal("found multixact %u from before relminmxid %u",
                                 multi, relminmxid)));
or ...


Just continuing is easier said than done. Especially with the background
of knowing that several users had hit the bug that allowed all of the
above to be hit, and that advancing relfrozenxid further would make it
worse.

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: snapshot too old issues, first around wraparound and then more.
Следующее
От: Julien Rouhaud
Дата:
Сообщение: Re: BUG #16109: Postgres planning time is high across version (Exposebuffer usage during planning in EXPLAIN)