Re: snapshot too old issues, first around wraparound and then more.

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: snapshot too old issues, first around wraparound and then more.
Дата
Msg-id CAH2-Wz=s9fJx=KkOX1UH54UnvPQe3j6OqJxQGxX+ZNz2B0YH8A@mail.gmail.com
обсуждение исходный текст
Ответ на snapshot too old issues, first around wraparound and then more.  (Andres Freund <andres@anarazel.de>)
Ответы Re: snapshot too old issues, first around wraparound and then more.
Список pgsql-hackers
On Tue, Mar 31, 2020 at 11:40 PM Andres Freund <andres@anarazel.de> wrote:
> The problem, as far as I can tell, is that
> oldSnapshotControl->head_timestamp appears to be intended to be the
> oldest value in the ring. But we update it unconditionally in the "need
> a new bucket, but it might not be the very next one" branch of
> MaintainOldSnapshotTimeMapping().
>
> While there's not really a clear-cut comment explaining whether
> head_timestamp() is intended to be the oldest or the newest timestamp,
> it seems to me that the rest of the code treats it as the "oldest"
> timestamp.

At first, I was almost certain that it's supposed to be the oldest
based only on the OldSnapshotControlData struct fields themselves. It
seemed pretty unambiguous:

    int         head_offset;    /* subscript of oldest tracked time */
    TimestampTz head_timestamp; /* time corresponding to head xid */

(Another thing that supports this interpretation is the fact that
there is a separate current_timestamp latest timestamp field in
OldSnapshotControlData.)

But then I took another look at the "We need a new bucket, but it
might not be the very next one" branch. It does indeed seem to
directly contradict the OldSnapshotControlData comments/documentation.
Note just the code itself, either. Even comments from this "new
bucket" branch disagree with the OldSnapshotControlData comments:

                if (oldSnapshotControl->count_used ==
OLD_SNAPSHOT_TIME_MAP_ENTRIES)
                {
                    /* Map full and new value replaces old head. */
                    int         old_head = oldSnapshotControl->head_offset;

                    if (old_head == (OLD_SNAPSHOT_TIME_MAP_ENTRIES - 1))
                        oldSnapshotControl->head_offset = 0;
                    else
                        oldSnapshotControl->head_offset = old_head + 1;
                    oldSnapshotControl->xid_by_minute[old_head] = xmin;
                }

Here, the comment says the map (circular buffer) is full, and that we
must replace the current head with a *new* value/timestamp (the one we
just got in GetSnapshotData()). It looks as if the design of the data
structure changed during the development of the original patch, but
this entire branch was totally overlooked.

In conclusion, I share Andres' concerns here. There are glaring
problems with how we manipulate the data structure that controls the
effective horizon for pruning. Maybe they can be fixed while leaving
the code that manages the OldSnapshotControl circular buffer in
something resembling its current form, but I doubt it. In my opinion,
there is no approach to fixing "snapshot too old" that won't have some
serious downside.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: backup manifests
Следующее
От: Mark Dilger
Дата:
Сообщение: Re: Should we add xid_current() or a int8->xid cast?