Re: Making "COPY partitioned_table FROM" faster

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: Making "COPY partitioned_table FROM" faster
Дата
Msg-id CAKJS1f9ZZ-ZaLq4rbbJWYHKC9-VG0GOnNb4VdOZSG2=3fHJ5Sg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Making "COPY partitioned_table FROM" faster  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Ответы Re: Making "COPY partitioned_table FROM" faster  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Список pgsql-hackers
On 30 July 2018 at 20:33, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> Two more thoughts:
>
> - Add some tests.  The if (nBufferedTuples > 0) that flushes the tuples
> when the partition changes is not currently exercised.

That seems like a good idea. In fact, it uncovered a bug around
ConvertPartitionTupleSlot() freeing the previously stored tuple out
the slot which resulted in a crash. I didn't notice before because my
test had previously not required any tuple conversions.

While ensuring my test was working correctly I noticed that I had a
thinko in the logic that decided if another avgTuplesPerPartChange
calculation was due.  The problem was that the (cstate->cur_lineno -
RECHECK_MULTI_INSERT_THRESHOLD) is unsigned and when cur_lineno is
below RECHECK_MULTI_INSERT_THRESHOLD it results in an underflow which
makes the if test always true until cur_lineno gets up to 1000. I
considered just making all those counters signed, but chickened out
when it came to changing "processed". I thought it was quite strange
to have "processed" unsigned and the other counters that do similar
things signed.  Of course, signed would have enough range, but it
would mean changing the return type of CopyFrom() which I wasn't
willing to do.

In the end, I just protected the if test so that it only calculates
the average again if cur_lineno is at least
RECHECK_MULTI_INSERT_THRESHOLD. This slightly changes when
avgTuplesPerPartChange is first set, but it'll still be at least 1000
tuples before we start doing multi-inserts. Another option would be to
cast the expression as int64... I'm a bit undecided what's best here.

> - With proute becoming a function-level variable,
> cstate->partition_tuple_routing is obsolete and could be removed.  (No
> point in saving this in cstate if it's only used in one function anyway.)

Good idea. I've removed that.

I've attached a delta patch based on the v4 patch.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jesper Pedersen
Дата:
Сообщение: Re: partition tree inspection functions
Следующее
От: "Jonathan S. Katz"
Дата:
Сообщение: Re: Explain buffers wrong counter with parallel plans