Re: In progress INSERT wrecks plans on table

From: Jeff Janes
Subject: Re: In progress INSERT wrecks plans on table
Date:
Msg-id: CAMkU=1xH7QwTv_H1=QFu7MFr=YGiwReHWZDmXOHyuJbB-Z3qhQ@mail.gmail.com
In reply to: Re: In progress INSERT wrecks plans on table  (Simon Riggs <simon@2ndQuadrant.com>)
Responses: Re: In progress INSERT wrecks plans on table  (Ants Aasma <ants@cybertec.at>)
List: pgsql-performance
On Mon, May 6, 2013 at 1:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 6 May 2013 02:51, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
>> On 05/05/13 00:49, Simon Riggs wrote:
>>>
>>> On 3 May 2013 13:41, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>
>>>> (3) to make the check on TransactionIdIsInProgress() into a heuristic,
>>>> since we don't *need* to check that, so if we keep checking the same
>>>> xid repeatedly we can reduce the number of checks or avoid xids that
>>>> seem to be long running. That's slightly more coding than my quick
>>>> hack here but seems worth it.
>>>>
>>>> I think we need both (1) and (3) but the attached patch does just (1).
>>>>
>>>> This is a similar optimisation to the one I introduced for
>>>> TransactionIdIsKnownCompleted(), except this applies to repeated
>>>> checking of as yet-incomplete xids, and to bulk concurrent
>>>> transactions.
>>>
>>>
>>> ISTM we can improve performance of TransactionIdIsInProgress() by
>>> caching the procno of our last xid.
>>>
>>> Mark, could you retest with both these patches? Thanks.
>>>
>>
>> Thanks Simon, will do and report back.
>
> OK, here's an easily reproducible test...
>
> Prep:
> DROP TABLE IF EXISTS plan;
> CREATE TABLE plan
> (
>   id INTEGER NOT NULL,
>   typ INTEGER NOT NULL,
>   dat TIMESTAMP,
>   val TEXT NOT NULL
> );
> insert into plan select generate_series(1,100000), 0,
> current_timestamp, 'some texts';
> CREATE UNIQUE INDEX plan_id ON plan(id);
> CREATE INDEX plan_dat ON plan(dat);
>
> testcase.pgb
> select count(*) from plan where dat is null and typ = 3;
>
> Session 1:
> pgbench -n -f testcase.pgb -t 100
>
> Session 2:
> BEGIN; insert into plan select 1000000 + generate_series(1, 100000),
> 3, NULL, 'b';
>
> Transaction rate in Session 1: (in tps)
> (a) before we run Session 2:
> Current: 5600tps
> Patched: 5600tps
>
> (b) after Session 2 has run, yet before transaction end
> Current: 56tps
> Patched: 65tps


When I run this test case in single-client mode, I don't see nearly
that much speedup; it just goes from 38.99 TPS to 40.12 TPS.  But
still, it is a speedup, and a very reproducible one (t-test p-val <
1e-40, n of 21 for both).

But I also tried it with 4 pgbench clients, and ran into a collapse
in performance, with TPS dropping to ~8.  It is too variable so far
to tell how reliable the speed-up from this patch is.  Apparently the
clients are all fighting over the spinlock on the ProcArrayLock.

This is a single quad core, "Intel(R) Xeon(R) CPU           X3210  @ 2.13GHz"

So I agree with (3) above, about not checking
TransactionIdIsInProgress repeatedly.  Or could we change the order of
operations so that TransactionIdIsInProgress is checked only after
XidInMVCCSnapshot?
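
For concreteness, here is a rough sketch of that reordering against
the xmin checks in HeapTupleSatisfiesMVCC() (tqual.c, 9.2-era
structure assumed); an illustration of the idea, not a finished
patch.  XidInMVCCSnapshot() looks only at our backend-local snapshot,
so it takes no ProcArrayLock.  And if it returns false for an xid
below the snapshot's xmax, that xid must have completed before the
snapshot was taken, so TransactionIdIsInProgress() can be skipped
entirely on this path (at the cost of sometimes delaying a hint-bit
update the current ordering would have made):

    /* Sketch only: reordered xmin visibility check; xmax handling elided. */
    TransactionId xmin = HeapTupleHeaderGetXmin(tuple);

    if (XidInMVCCSnapshot(xmin, snapshot))
        return false;           /* lock-free: invisible to our snapshot */
    else if (TransactionIdDidCommit(xmin))
        SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED, xmin);
    else
    {
        /*
         * Not in our snapshot and not committed: an xid absent from the
         * snapshot has already completed, so it must have aborted or
         * crashed.
         */
        SetHintBits(tuple, buffer, HEAP_XMIN_INVALID, InvalidTransactionId);
        return false;
    }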

Or perhaps the comment "XXX Can we test without the lock first?" could
be implemented and save the day here?
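
In that spirit, something like the following might work (a sketch
only; the function and variable names here are mine, not anything in
the tree).  It leans on the same monotonicity argument as
TransactionIdIsKnownCompleted(): once an xid has completed it can
never be in progress again, so a backend-local cache of the last
known-completed xid can be trusted without touching ProcArrayLock:

    /* Backend-local: last xid this backend proved to be completed. */
    static TransactionId cachedCompletedXid = InvalidTransactionId;

    bool
    TransactionIdIsInProgressCached(TransactionId xid)
    {
        /* Lock-free fast path: a completed xid stays completed forever. */
        if (TransactionIdEquals(xid, cachedCompletedXid))
            return false;

        /* Slow path: the existing check, which takes ProcArrayLock. */
        if (TransactionIdIsInProgress(xid))
            return true;

        cachedCompletedXid = xid;
        return false;
    }

That wouldn't help the hot case here, where the xid genuinely is
still in progress, but it shows that at least some of the answers can
be given without the lock.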

Looking at the code, there is something that bothers me about this part:

        pxid = cxid = InvalidTransactionId;
        return false;

If it is safe to return false at this point (as determined by the
stale values of pxid and cxid) then why would we clear out the stale
values so they can't be used in the future to also short circuit
things?  On the other hand, if the stale values need to be cleared so
they are not misleading to future invocations, why is it safe for this
invocation to have made a decision based on them?  Maybe with more
thought I will see why this is so.
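
To make the question concrete, the pattern reduces to something like
this (a toy reconstruction; pxid and cxid come from the snippet
above, everything else is assumed):

    static TransactionId pxid = InvalidTransactionId;  /* cached parent xid */
    static TransactionId cxid = InvalidTransactionId;  /* cached child xid */

    static bool
    xid_known_not_in_progress(TransactionId xid)
    {
        if (!TransactionIdEquals(xid, cxid))
            return false;   /* cache miss: caller must do the full check */

        /*
         * Cache hit: we are about to answer based on the cached pair, yet
         * we also wipe it.  If the stale values are good enough to decide
         * "not in progress" now, keeping them would let the next call for
         * the same xid short-circuit too; if they are not good enough to
         * keep, it is unclear why they are good enough to use.
         */
        pxid = cxid = InvalidTransactionId;
        return true;
    }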

....


>
> Which brings me back to Mark's original point, which is that we are
> x100 times slower in this case and it *is* because the choice of
> IndexScan is a bad one for this situation.
>
> After some thought on this, I do think we need to do something about
> it directly, rather than by tuning infrastructure (as I just
> attempted). The root cause here is that IndexScan plans are sensitive
> to mistakes in data distribution, much more so than other plan types.
>
> The two options, broadly, are to either
>
> 1. avoid IndexScans in the planner unless they have a *significantly*
> better cost. At the moment we use IndexScans if cost is lowest, even
> if that is only by a whisker.

This wouldn't work well in Mark's specific case, because the problem
is that it is using the wrong index, not that it is using an index at
all.  There are two candidate indexes, and one looks slightly better
but then gets ruined by the in-progress insert, while the other looks
slightly worse but would be resistant to the in-progress insert.
Switching from the bad index to the seq scan is not going to fix
things.  I don't think there is any heuristic solution here other than
to keep track of the invisible data distribution as well as the
visible data distribution.

The more I've thought about it, the more I see the charm of Mark's
original proposal.  Why not build the statistics assuming that the
in-progress insert will commit?  It is not a complete solution,
because autoanalyze will not get triggered until the transaction
completes.  But why not let the manual ANALYZE get the benefit of
seeing them?
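
Concretely, the in-progress case in acquire_sample_rows()
(analyze.c, 9.2-era structure assumed) already counts and samples
rows inserted by our own transaction; Mark's proposal amounts,
roughly, to relaxing the own-transaction condition so that other
backends' in-progress inserts are counted as if they will commit:

    case HEAPTUPLE_INSERT_IN_PROGRESS:
        /*
         * Today the row is counted and sampled only when the inserter is
         * our own transaction.  Under the proposal, the condition below
         * would be relaxed (or dropped) so concurrent in-progress inserts
         * are counted as live too, assuming they commit.
         */
        if (TransactionIdIsCurrentTransactionId(
                HeapTupleHeaderGetXmin(targtuple.t_data)))
        {
            sample_it = true;
            liverows += 1;
        }
        break;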

The statistics serve two masters.  One is to estimate how many rows
will actually be returned.  The other is to estimate how much work it
will take to return them (including the work of groveling through a
list of in-progress tuples).  Right now those are implicitly treated
as the same thing.  We could change that and keep two separate sets
of statistics, but I think we could improve things some without going
that far.

For the first master, estimating the rows actually returned, I think
counting in-progress rows is a wash.  The bulk transaction that was
in progress at the time of the last ANALYZE is about as likely to
have already committed by planning time (without the autoanalyze
having completed yet) as it is to still be in progress at planning
time; so counting its rows as if they had committed is sometimes
right and sometimes wrong, in roughly equal measure.

But for the second master, counting the in-progress rows seems like a
win all the time: either we actually see them and do the work, or we
refuse to see them but still have to do the work to figure that out.

Cheers,

Jeff

