Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

From: Alexey Kondratov
Subject: Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Date:
Msg-id: 2F15DA8D-4FFF-4C2E-8110-F6FDB7DB9C09@gmail.com
In reply to: Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling  (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers

On 13 Jun 2017, at 01:44, Peter Geoghegan <pg@bowt.ie> wrote:
I am not going to start with "speculative insertion" right now, but it
would be very useful if you could give me a pointer on where to start.
Maybe I will at least try to evaluate the complexity of the problem.

Speculative insertion has the following special entry points to
heapam.c and execIndexing.c, currently only called within
nodeModifyTable.c

Offhand, it doesn't seem like it would be that hard to teach another
heap_insert() caller the same tricks.

I went through the nodeModifyTable.c code, and it does not seem too
difficult to do the same inside COPY.

My sense is that it's going to be hard to sell a committer on any
design that consumes subtransactions in a way that's not fairly
obvious to the user, and doesn't have a pretty easily understood worst
case.

Yes, and the worst case will probably be quite frequent, since heap_multi_insert cannot be used when BEFORE/INSTEAD OF triggers or partitioning exist (according to the current copy.c code). COPY will therefore often fall back to single heap_insert calls, and wrapping each one in a subtransaction would consume XIDs too greedily and seriously hurt performance. I like my previous idea less and less.
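
To put rough numbers on the XID concern, here is a back-of-the-envelope sketch. It is a hypothetical model, not PostgreSQL code: it assumes every subtransaction that inserts something is assigned its own XID, that no batch ever has to be retried, and it ignores the XID of the top-level transaction. The function names and the batch size are made up for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical model: XIDs consumed when every row is inserted inside its
 * own subtransaction. Each writing subtransaction takes one XID.
 */
uint64_t xids_per_row_subxact(uint64_t nrows)
{
    return nrows;
}

/*
 * Hypothetical model: XIDs consumed when rows are inserted in batches of
 * batch_size rows, with one subtransaction per batch. Assumes that no
 * batch fails and has to be replayed row by row.
 */
uint64_t xids_per_batch_subxact(uint64_t nrows, uint64_t batch_size)
{
    if (batch_size == 0)
        return 0;               /* invalid batch size, nothing is inserted */

    /* ceiling division: a partially filled last batch still costs one XID */
    return (nrows + batch_size - 1) / batch_size;
}
```

Under these assumptions, loading one million rows costs one million XIDs in the row-by-row case, against one thousand with batches of 1000 rows, which is why the single heap_insert fallback is the painful one.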

I haven't thought about this very carefully, but I guess you could do
something like passing a flag to ExecConstraints() that indicates
"don't throw an error; instead, just return false so I know not to
proceed"

Currently ExecConstraints always throws an error, and I do not think it would be wise for me to modify its behaviour.
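
For reference, the "return false instead of throwing" idea could look roughly like the toy below. This is only a self-contained sketch of the pattern, not the real ExecConstraints() signature: ToyRow, toy_check_constraints() and toy_copy() are hypothetical names, and the only constraint modelled is NOT NULL on a single column.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical single-column row, for illustration only. */
typedef struct ToyRow
{
    int  value;
    bool is_null;
} ToyRow;

/*
 * Toy NOT NULL check. Returns true if the row passes. On a violation it
 * returns false when noerror is set; otherwise it reports the error and
 * terminates, standing in for a thrown error.
 */
bool toy_check_constraints(const ToyRow *row, bool noerror)
{
    if (!row->is_null)
        return true;
    if (noerror)
        return false;

    fprintf(stderr, "null value violates not-null constraint\n");
    exit(EXIT_FAILURE);
}

/*
 * Toy loader: counts rows that pass the check as inserted and rows that
 * fail it as skipped, without ever raising an error.
 */
size_t toy_copy(const ToyRow *rows, size_t nrows, size_t *skipped)
{
    size_t inserted = 0;

    *skipped = 0;
    for (size_t i = 0; i < nrows; i++)
    {
        if (toy_check_constraints(&rows[i], true))
            inserted++;
        else
            (*skipped)++;
    }
    return inserted;
}
```

Existing callers would keep passing noerror = false and see no change in behaviour; only the COPY path would opt in to the soft failure.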

I have updated my patch (rebased onto the latest master commit 94da2a6a9a05776953524424a3d8079e54bc5d94). Please find the patch file attached, or see the always up-to-date version on GitHub: https://github.com/ololobus/postgres/pull/1/files

Currently, it catches all major errors in the input data:

1) Rows with missing/extra columns cause WARNINGs and are skipped

2) I found that input type format errors are thrown from InputFunctionCall, so I wrapped it with PG_TRY/PG_CATCH. I am not 100%
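
To illustrate the second point outside the backend, here is a minimal standalone sketch of trapping an error raised by an input function and skipping the offending value. It is not the patch code: it uses plain setjmp/longjmp as a stand-in for the PG_TRY machinery, and toy_int_input() and load_ints() are hypothetical names invented for this example.

```c
#include <errno.h>
#include <limits.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the error context that PG_TRY would set up. */
static jmp_buf parse_error_env;

/*
 * Toy input function: converts text to int and "throws" via longjmp on
 * malformed or out-of-range input.
 */
static int toy_int_input(const char *text)
{
    char *end;
    long  val;

    errno = 0;
    val = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE ||
        val < INT_MIN || val > INT_MAX)
        longjmp(parse_error_env, 1);

    return (int) val;
}

/*
 * Converts each field in turn. A field whose conversion fails is skipped
 * and loading continues with the next one. Returns the number of values
 * stored in out, which must have room for nfields entries.
 */
size_t load_ints(const char **fields, size_t nfields, int *out)
{
    /* volatile: both are modified between setjmp and a possible longjmp */
    volatile size_t loaded = 0;
    volatile size_t i;

    for (i = 0; i < nfields; i++)
    {
        if (setjmp(parse_error_env) == 0)
        {
            out[loaded] = toy_int_input(fields[i]);
            loaded++;
        }
        /* else: the input function failed, skip this field */
    }
    return loaded;
}
```

The toy only skips a value; real backend code would also have to clean up whatever state the failed call left behind, which this sketch does not attempt to model.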



Alexey


