Re: Parallel copy

From Greg Nancarrow
Subject Re: Parallel copy
Msg-id CAJcOf-cQ8r1P8gLE+PFR9BWMi2hPLZSXE=deK4iOx0eHsYLFUA@mail.gmail.com
In response to Re: Parallel copy  (vignesh C <vignesh21@gmail.com>)
Responses Re: Parallel copy
Re: Parallel copy
List pgsql-hackers
Hi Vignesh,

> Can you share with me the script you used to generate the data & the ddl of the table, so that it will help me check
> that scenario you faced the problem.
 

Unfortunately I can't directly share it (considered company IP),
though having said that it's only doing something that is relatively
simple and unremarkable, so I'd expect it to be much like what you are
currently doing. I can describe it in general.

The table being used contains 100 columns (as I pointed out earlier),
with the first column of "bigserial" type, and the others of different
types like "character varying(255)", "numeric", "date" and "time
without time zone". There are about 60 "character varying(255)"
columns overall, with the other types interspersed.
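
As a rough illustration of that shape (not the actual DDL, which can't be
shared), the sketch below prints a CREATE TABLE statement with 100 columns.
The table name "copy_test", the column names "c1".."c100" and the exact
ordering of the types are placeholders invented here; only the overall mix
(one bigserial, 60 character varying(255), the rest numeric, date and time)
follows the description above.

```shell
# Sketch only: prints a CREATE TABLE statement shaped like the description.
# Table name, column names and type ordering are made up for illustration.
# The body runs in a subshell "( ... )" so the variables stay local.
gen_table_ddl() (
    table=${1:-copy_test}
    printf 'CREATE TABLE %s (\n' "$table"
    printf '    c1 bigserial'
    i=2
    while [ "$i" -le 100 ]; do
        case $((i % 10)) in
            0|5) type='numeric' ;;
            1)   type='date' ;;
            6)   type='time without time zone' ;;
            *)   type='character varying(255)' ;;
        esac
        printf ',\n    c%d %s' "$i" "$type"
        i=$((i + 1))
    done
    printf '\n);\n'
)
```

The output could be piped into psql, e.g. "gen_table_ddl copy_test | psql -d testdb",
where "testdb" is a hypothetical database name.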

When testing with indexes, 4 b-tree indexes were used, each of which
included the first column plus its own distinct set of 9 other columns.
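
To make that concrete, here is a sketch that prints four such CREATE INDEX
statements. It assumes a table called "copy_test" with columns named
"c1".."c100" (placeholder names), and it assumes the nine extra columns of
each index don't overlap with those of the other indexes. Which columns
were actually indexed isn't stated, so the choice of c2..c37 is arbitrary.

```shell
# Sketch only: four b-tree indexes, each on c1 plus nine further columns.
# Column choice is arbitrary; the real index definitions are not known.
gen_index_ddl() (
    table=${1:-copy_test}
    idx=1
    col=2
    while [ "$idx" -le 4 ]; do
        cols='c1'
        n=0
        while [ "$n" -lt 9 ]; do
            cols="$cols, c$col"
            col=$((col + 1))
            n=$((n + 1))
        done
        printf 'CREATE INDEX %s_idx%d ON %s USING btree (%s);\n' \
            "$table" "$idx" "$table" "$cols"
        idx=$((idx + 1))
    done
)
```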

A CSV record (row) template file was created with test data
(corresponding to the table), and that was simply copied and appended
over and over, each time with a record-number prefix, in order to create
the test data file.
The following shell script basically does it (but very slowly). I was
using a small C program to do the same thing, a lot faster.
In my case, N=2550000 produced about a 5GB CSV file.

    N=2550000; file_out=data.csv
    for ((i = 1; i <= N; i++)); do echo -n "$i," >> "$file_out"; cat sample_record.csv >> "$file_out"; done
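
Much of the slowness of that loop comes from starting a "cat" process and
reopening the output file for every row. The C program isn't shown here, but
a variant along the following lines should already be noticeably quicker
while staying in shell. It is only a sketch: the function name
"generate_csv" is made up, it assumes the template file holds exactly one
record terminated by a single newline, and it overwrites the output file
instead of appending to it.

```shell
# Sketch only: reads the template once, then writes n prefixed copies using
# shell builtins, with the output file opened a single time.
generate_csv() (
    n=$1
    template=$2
    out=$3
    record=$(cat "$template")   # command substitution drops the trailing newline
    i=1
    while [ "$i" -le "$n" ]; do
        printf '%s,%s\n' "$i" "$record"
        i=$((i + 1))
    done > "$out"
)
```

For example: "generate_csv 2550000 sample_record.csv data.csv".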

One other thing I should mention is that between each test run, I
cleared the OS page cache, as described here:
https://linuxhint.com/clear_cache_linux/
That way, each COPY FROM is not taking advantage of any OS-cached data
from a previous COPY FROM.
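
On Linux the usual way to do this is through /proc/sys/vm/drop_caches, so
the step between runs presumably looks something like the sketch below. The
function name and its optional argument are invented here; the argument only
exists so that the function can be dry-run against an ordinary file, and in
real use it has to be run as root with no argument.

```shell
# Sketch only: flush dirty pages to disk, then ask the kernel to drop the
# page cache, dentries and inodes (value 3). Needs root for the real target.
clear_os_cache() (
    target=${1:-/proc/sys/vm/drop_caches}
    sync
    echo 3 > "$target"
)
```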

If your data is somehow significantly different and you want to (and
can) share your script, then I can try it in my environment.


Regards,
Greg


