Re: Reduce WAL logging of INSERT SELECT

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: Reduce WAL logging of INSERT SELECT
Дата
Msg-id 201108112049.p7BKnS727775@momjian.us
обсуждение исходный текст
Ответ на Re: Reduce WAL logging of INSERT SELECT  (Simon Riggs <simon@2ndQuadrant.com>)
Список pgsql-hackers
Simon Riggs wrote:
> On Sat, Aug 6, 2011 at 4:16 AM, Bruce Momjian <bruce@momjian.us> wrote:
>
> > Well, if the table is created in the same transaction (which is the only
> > case under consideration), no other sessions can write to the table so
> > you are just writing the entire table on commit, rather than to the WAL.
>
> Below a certain point, skipping WAL is slower and over an intermediate
> range there is no benefit. So small amounts of data on large servers
> goes slower.
>
> heap_fsync() requires a scan of shared buffers, which may not be cheap.
>
> There is a difficulty because you would need to calculate the cut-off
> is for a particular database, and then predict ahead of time whether
> the number of rows that will be handled by the statement is low enough
> to warrant using the optimisation. Both of which I call a hard
> problem.
>
> I think we should remove the COPY optimisation because of this and
> definitely not extend INSERT SELECT to perform it automatically.

I ran some tests and Simon was correct and I was wrong.  There is a
measurable overhead to the optimization of avoiding WAL traffic for
small tables.

I tested git head with COPY and created the table inside and outside the
COPY transaction, with the attached script.  It ran in 11 seconds
without the optimization, and 12 seconds with the CREATE TABLE inside
the COPY transaction.

With these results, I withdraw my idea of adding this optimization to
other commands.  I think COPY is usually used in bulk mode, but the
other commands are often used in smaller batches that would be worse
with this optimization.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
:

. traprm

echo 1 > $TMP/1

echo "DROP TABLE copytest;" > $TMP/0
for X in $(jot 1000)
do
    cat >> $TMP/0 <<END
-- move the next line into the transaction block to test the optimization overhead
CREATE TABLE copytest(x int);
BEGIN WORK;
COPY copytest FROM '$TMP/1';
COMMIT;
DROP TABLE copytest;
END
done
time sql test < $TMP/0 > /dev/null


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: WIP: Fast GiST index build
Следующее
От: "Greg Sabino Mullane"
Дата:
Сообщение: Re: index-only scans