[PATCH] Initial progress reporting for COPY command

Поиск
Список
Период
Сортировка
От Josef Šimánek
Тема [PATCH] Initial progress reporting for COPY command
Дата
Msg-id CAFp7QwqMGEi4OyyaLEK9DR0+E+oK3UtA4bEjDVCa4bNkwUY2PQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: [PATCH] Initial progress reporting for COPY command  (Michael Paquier <michael@paquier.xyz>)
Re: [PATCH] Initial progress reporting for COPY command  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Re: [PATCH] Initial progress reporting for COPY command  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Re: [PATCH] Initial progress reporting for COPY command  (Josef Šimánek <josef.simanek@gmail.com>)
Список pgsql-hackers
Hello, as proposed by Pavel Stěhule and discussed on local czech PostgreSQL maillist (https://groups.google.com/d/msgid/postgresql-cz/CAFj8pRCZ42CBCa1bPHr7htffSV%2BNAcgcHHG0dVqOog4bsu2LFw%40mail.gmail.com?utm_medium=email&utm_source=footer), I have prepared an initial patch for COPY command progress reporting.

Few examples first:

"COPY (SELECT * FROM test) TO '/tmp/ids';"

yr=# SELECT * from pg_stat_progress_copy;
   pid   | datid | datname | relid | direction | file | program | lines_processed | file_bytes_processed
---------+-------+---------+-------+-----------+------+---------+-----------------+----------------------
 3347126 | 16384 | yr      |     0 | TO        | t    | f       |         3529943 |             24906226
(1 row)
 
"COPY test FROM '/tmp/ids';

yr=# SELECT * from pg_stat_progress_copy;
   pid   | datid | datname | relid | direction | file | program | lines_processed | file_bytes_processed
---------+-------+---------+-------+-----------+------+---------+-----------------+----------------------
 3347126 | 16384 | yr      | 16385 | FROM      | t    | f       |       121591999 |            957218816
(1 row)

Columns are inspired by CREATE INDEX progress report system view.

pid - integer - PID of backend
datid - oid - OID of related database
datname - name - name of related database (this seems redundant, since oid should be enough, but it is the same in CREATE INDEX)
relid - oid - oid of table related to COPY command, when not known (for example when copying to file, it is 0)
direction - text - one of "FROM" or "TO" depends on command used
file - bool - is file is used?
program - bool - is program used?
lines_processed - bigint - amount of processed lines, works for both directions (FROM/TO)
file_bytes_processed - amount of bytes processed when file is used (otherwise 0), works for both direction (
FROM/TO) when file is used (file = t)

Patch is attached and can be found also at https://github.com/simi/postgres/pull/5.

Diff version: https://github.com/simi/postgres/pull/5.diff
Patch version: https://github.com/simi/postgres/pull/5.patch

I havefew initial notes and questions.

I'm using ftell to get current position in file to populate file_bytes_processed without error handling (ftell can return -1L and also populate errno on problems).

1. Is that a good way to get progress of file processing?
2. Is it safe in given context to not care about errors? If not, what to do on error?

Some columns are not populated on certain COPY commands. For example when a file is not used, file_bytes_processed is set to 0. Would it be better to use NULL instead when the column is not related to the current command? Same problem is for relid column.

I have not found any tests for progress reporting. Are there any? It would need two backends running (one running COPY, one checking output of report view). Is there any similar test I can inspire at? In theory, it should be possible to use dblink_send_query to run async COPY command in the background.

My initial (attached) patch also doesn't introduce documentation for this system view. I can add that later once this patch is finalized (if that happens).
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: Internal key management system
Следующее
От: Amit Langote
Дата:
Сообщение: Re: problem with RETURNING and update row movement