Re: Instability in the postgres_fdw regression test

From: Etsuro Fujita
Subject: Re: Instability in the postgres_fdw regression test
Date:
Msg-id: 5AD020A5.10605@lab.ntt.co.jp
In reply to: Re: Instability in the postgres_fdw regression test (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
(2018/04/13 3:49), Robert Haas wrote:
> On Thu, Apr 12, 2018 at 12:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We've been seeing $subject since commit 1bc0100d2, with little clue
>> as to the cause.  Previous attempts to fix it by preventing autovacuum
>> from running on the relevant tables didn't seem to help.
>>
>> I have now managed to reproduce the issue reliably enough to study it.
>> (It turns out that on longfin's host, running the buildfarm script *under
>> cron* produces the failure a reasonable percentage of the time.  The
>> identical test case, launched from an interactive shell, never fails.
>> I speculate that the kernel scheduler treats cron jobs differently.)
>> It is in fact a timing issue, and what triggers it is there being an
>> active autovacuum task in another database.  The fact that the buildfarm
>> script uses USE_MODULE_DB in the contrib tests greatly increases the
>> surface area of the problem, since that creates a lot more databases that
>> autovacuum could be active in.  Once I realized that, I found that it can
>> trivially be reproduced by hand, in a "make installcheck" test, simply by
>> having an open transaction in another DB in the cluster.  For instance
>> "select pg_sleep(60);" and then run make installcheck in postgres_fdw.
>>
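A minimal by-hand version of that recipe (only a sketch; the 60-second sleep is just the example figure from above, and any other database in the same cluster will do):

    -- psql, connected to some other database in the cluster; the
    -- long-running statement holds a snapshot open, which keeps
    -- RecentGlobalXmin from advancing cluster-wide.
    SELECT pg_sleep(60);

    -- Meanwhile, from a shell: run "make installcheck" in contrib/postgres_fdw.
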
>> So now, drilling down to the specific problem: what we're seeing is that
>> the plans for some queries change because the "remote" server reports
>> a different rowcount estimate than usual for
>>
>> EXPLAIN SELECT "C 1", c2, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
>>
>> Normally the estimate is one, but in the failure cases it's 12 or so.
>> This estimate is coming out of ineq_histogram_selectivity, of course.
>> Furthermore, the highest value actually present in the histogram is
>> 1000, because that was the highest value of "C 1" when the test did
>> ANALYZE up at the top.  That means we'll invoke get_actual_variable_range
>> to try to identify the actual current maximum.  Most of the time, it
>> returns 2010, which is correct, and we end up estimating that only
>> about one row will exceed 2000.
>>
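For anyone poking at the "remote" side directly, both ingredients of the normal case are easy to look at (illustrative only, not part of the test itself):

    -- The histogram built by the test's earlier ANALYZE tops out at 1000:
    SELECT histogram_bounds
    FROM pg_stats
    WHERE schemaname = 'S 1' AND tablename = 'T 1' AND attname = 'C 1';

    -- Since 2000 is beyond that bound, the planner probes the index via
    -- get_actual_variable_range() for the live maximum (normally 2010),
    -- which is where the one-row estimate comes from:
    EXPLAIN SELECT * FROM "S 1"."T 1" WHERE "C 1" > 2000;
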
>> However, in the failure cases, what's getting extracted from the index
>> is 9999.  That's because we'd inserted and deleted that value further
>> up in the test, and if there's been an open transaction holding back
>> RecentGlobalXmin, then SnapshotNonVacuumable is going to consider that
>> entry still good.  This value is enough larger than 2000 to move the
>> selectivity estimate appreciably, and then kaboom.
>>
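The mechanism is also easy to show in isolation with a throwaway table (purely hypothetical, not the regression test's own schema):

    -- Build a table whose histogram tops out well below the later data.
    CREATE TABLE demo (k int PRIMARY KEY);
    INSERT INTO demo SELECT generate_series(1, 1000);
    ANALYZE demo;                         -- histogram's highest value: 1000
    INSERT INTO demo SELECT generate_series(1001, 2010);
    -- Insert and delete a far-away key, as the test does with 9999.
    INSERT INTO demo VALUES (9999);
    DELETE FROM demo WHERE k = 9999;

    -- With nothing holding back RecentGlobalXmin, get_actual_variable_range
    -- finds 2010 and the "k > 2000" estimate stays small.  With a transaction
    -- open in another database, the dead 9999 entry is still nonvacuumable,
    -- the probed maximum becomes 9999, and the estimate grows appreciably.
    EXPLAIN SELECT * FROM demo WHERE k > 2000;
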
>> This theory successfully explains the observed fact that some buildfarm
>> failures show differences in two query plans, while others show a
>> difference just in the first one.  In the latter cases, the external
>> transaction ended, so that RecentGlobalXmin could advance, in between.
>>
>> What I propose to do to fix the instability is to change the test
>> stanza that uses 9999 as a key-chosen-at-random to use something less
>> far away from the normal range of "C 1" values, so that whether it's
>> still visible to get_actual_variable_range has less effect on this
>> selectivity estimate.  That's a hack, for sure, but I don't see any
>> other fix that's much less of a hack.
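
In terms of the throwaway-table sketch above, the idea is simply this (the value 2050 is only illustrative, not necessarily what the patch will pick):

    -- A stray key near the normal range barely moves the "> 2000" estimate,
    -- whether or not its dead index entry is still visible to the probe.
    INSERT INTO demo VALUES (2050);       -- instead of 9999
    DELETE FROM demo WHERE k = 2050;
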
>
> Thanks for the detective work.

Thanks a lot!

Best regards,
Etsuro Fujita

