Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <

From	Andres Freund
Subject	Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Date
Msg-id	20160524215616.y4xvze5f5tfv64gn@alap3.anarazel.de
In reply to	Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <  (Kevin Grittner <kgrittn@gmail.com>)
Responses	Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <  (Kevin Grittner <kgrittn@gmail.com>)
List	pgsql-committers
On 2016-05-24 16:09:27 -0500, Kevin Grittner wrote:
> On Tue, May 24, 2016 at 3:54 PM, Andres Freund <andres@anarazel.de> wrote:
>
> > what about e.g. concurrent index builds? E.g. IndexBuildHeapRangeScan() doesn't
> > seem to contain any checks against outdated blocks
>
> Why would it?  We're talking about blocks where there were dead
> tuples, with the transaction which updated or deleted the tuples
> being complete, where those dead tuples were vacuumed away.  These
> should not appear in the index, should they?

(It appears I had switched around the concerns for concurrent and
!concurrent index builds in my head. But I think both might be negatively
affected.)

For concurrent index builds we'll actually use a normal MVCC snapshot, which
means heap_getnext() in IndexBuildHeapRangeScan() and
validate_index_heapscan() will error out when encountering removed
tuples. That makes it hard to ever perform a concurrent (re)index build in
which an individual phase takes longer than old_snapshot_threshold -
problematic from my POV, given that old_snapshot_threshold cannot be
changed at runtime.
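
For reference, the failure those phases would run into is the hard error
thrown once the snapshot has crossed the threshold. Roughly, paraphrasing
TestForOldSnapshot_impl() in bufmgr.c as of 9.6 from memory (the real code
has additional relation-based guards omitted here, so details may differ):

    void
    TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
    {
        /*
         * The caller has already seen a page whose LSN is newer than the
         * snapshot's LSN; if the snapshot is also older than
         * old_snapshot_threshold, the scan fails hard.
         */
        if (snapshot->whenTaken < GetOldSnapshotThresholdTimestamp())
            ereport(ERROR,
                    (errcode(ERRCODE_SNAPSHOT_TOO_OLD),
                     errmsg("snapshot too old")));
    }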


For normal (!concurrent) index builds the following appears to be problematic:
                case HEAPTUPLE_RECENTLY_DEAD:
                    /*
                     * If tuple is recently deleted then we must index it
                     * anyway to preserve MVCC semantics.  (Pre-existing
                     * transactions could try to use the index after we finish
                     * building it, and may need to see such tuples.)
                     *
                     * However, if it was HOT-updated then we must only index
                     * the live tuple at the end of the HOT-chain.  Since this
                     * breaks semantics for pre-existing snapshots, mark the
                     * index as unusable for them.
                     */
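
The code following that comment (quoting roughly from memory, so minor
details may differ) only flags HOT-updated tuples:

                    if (HeapTupleIsHotUpdated(heapTuple))
                    {
                        /* index only the live tuple at the end of the chain */
                        indexIt = false;
                        /* mark the index as unsafe for pre-existing snapshots */
                        indexInfo->ii_BrokenHotChain = true;
                    }
                    else
                        indexIt = true;
                    /* In any case, exclude the tuple from unique-checking */
                    tupleIsAlive = false;
                    break;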

AFAICS that detection is broken if we advance the xmin horizon more
aggressively than the snapshot: the LSNs of the newly created index pages
will be new, and we'll thus not necessarily error out when required. At the
very least we'd have to set indexInfo->ii_BrokenHotChain in that case.
As far as I can see, normal index builds allow concurrent HOT prunes and
the like, since those only require page-level exclusive locks.

So for !concurrent builds we might end up with a corrupt index.
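
A minimal defensive tweak along those lines, for the !concurrent case, might
be to mark the index up front whenever early pruning could have removed
tuples the build's snapshot should still have seen. Just a sketch;
early_pruning_possible() is a hypothetical stand-in for
"old_snapshot_threshold >= 0 and this relation is subject to early pruning",
not an existing function:

        /*
         * Hypothetical sketch only: if "snapshot too old" early pruning may
         * already have removed tuples that pre-existing snapshots could
         * still need, those snapshots can't get correct answers from this
         * index, so treat it like a broken HOT chain.
         */
        if (early_pruning_possible(heapRelation))
            indexInfo->ii_BrokenHotChain = true;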


I think many of the places relying on heap scans with !MVCC snapshots
are problematic in that way. Outdated reads will not be detected by
TestForOldSnapshot() (given the (snapshot)->satisfies ==
HeapTupleSatisfiesMVCC condition therein; see the sketch of that check
below), and instead rely on the snapshot actually working.  E.g. CLUSTER /
VACUUM FULL rely on HeapTupleSatisfiesVacuum() returning an accurate
HEAPTUPLE_RECENTLY_DEAD:
        switch (HeapTupleSatisfiesVacuum(tuple, OldestXmin, buf))
        {
...
            case HEAPTUPLE_RECENTLY_DEAD:
                tups_recently_dead += 1;
                /* fall through */
            case HEAPTUPLE_LIVE:
                /* Live or recently dead, must copy it */
                isdead = false;
                break;

If an old session at >= REPEATABLE READ isolation accesses a clustered table
(after the CLUSTER committed), it will now not see any errors, because
all the LSNs look new.
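
For reference, that check looks roughly like the following (paraphrasing the
TestForOldSnapshot() macro in bufmgr.h as of 9.6 from memory; details may
differ). With a !MVCC snapshot the ->satisfies test turns the whole thing
into a no-op:

    #define TestForOldSnapshot(snapshot, relation, page) \
        do { \
            if (old_snapshot_threshold >= 0 \
                && (snapshot) != NULL \
                && (snapshot)->satisfies == HeapTupleSatisfiesMVCC \
                && !XLogRecPtrIsInvalid((snapshot)->lsn) \
                && PageGetLSN(page) > (snapshot)->lsn) \
                TestForOldSnapshot_impl(snapshot, relation); \
        } while (0)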



> >>> Is there anything preventing this from becoming a problem?
> >>
> >> The fundamental approach is that the error can only appear on
> >> user-facing scans, not internal reads and positioning.
> >
> >> Unless there is some code path that uses a scan where the snapshot
> >> age is checked, this cannot be a problem.  I don't see any such
> >> path, but if you do, please let me know, and I'll fix things.
> >
> > That scares me. Not throwing an error, and not being broken are entirely
> > different things.
>
> Absolutely, but let's not start pointing at random chunks of code
> and asking why an error isn't thrown there without showing that you
> get bad results otherwise, or at least some plausible argument why
> you might.

This attitude is disturbing. You've evidently not looked at the snapshot
issues around ANALYZE before at all, even though "snapshot too old" at the
very least introduces a behavioural issue there (if not a bug, at least in
the case of expression-based stuff). I've asked you about principled
defenses, and your answer was "we don't error out".  Now I point to another
place, and you respond with a fairly strong dismissal.  Independent of
whether I'm right or wrong, that seems to be completely the wrong way round.


Andres

