Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

From: Peter Geoghegan
Subject: Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Message-ID: CAH2-Wzniezd1NrLyN+338T7O3Y=ukQuULSp6sN-rE2-6-EpHFw@mail.gmail.com
In reply to: Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling  (Alexey Kondratov <kondratov.aleksey@gmail.com>)
Responses: Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling  (Alexey Kondratov <kondratov.aleksey@gmail.com>)
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling  (Alexey Kondratov <kondratov.aleksey@gmail.com>)
List: pgsql-hackers

On Mon, Jun 12, 2017 at 3:52 AM, Alexey Kondratov
<kondratov.aleksey@gmail.com> wrote:
> I am not going to start with "speculative insertion" right now, but it would
> be very useful, if you give me a point, where to start. Maybe I will at least
> try to evaluate the complexity of the problem.

Speculative insertion has the following special entry points to
heapam.c and execIndexing.c, currently only called within
nodeModifyTable.c:

* SpeculativeInsertionLockAcquire()

* HeapTupleHeaderSetSpeculativeToken()

* heap_insert() called with HEAP_INSERT_SPECULATIVE argument

* ExecInsertIndexTuples() with specInsert = true

* heap_finish_speculative()

* heap_abort_speculative()

Offhand, it doesn't seem like it would be that hard to teach another
heap_insert() caller the same tricks.
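
To make the ordering of those steps concrete, here is a toy model in plain C. It is only a sketch: ToyTable and every toy_* function are hypothetical stand-ins invented for illustration, not PostgreSQL APIs, and the comments merely map each step to the entry point in the list above that it imitates. The "heap" is an array of integer keys and the "unique index" is a linear scan.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: nothing below is a PostgreSQL type or function. */
#define TOY_MAX_ROWS 16

typedef struct ToyTable
{
    int  keys[TOY_MAX_ROWS];        /* one integer key per "heap tuple" */
    bool speculative[TOY_MAX_ROWS]; /* true while an insert is unconfirmed */
    bool dead[TOY_MAX_ROWS];        /* true once an insert was backed out */
    int  nrows;
} ToyTable;

/* Stand-in for the unique check done while inserting index tuples. */
bool toy_index_conflict(const ToyTable *t, int key, int self)
{
    for (int i = 0; i < t->nrows; i++)
        if (i != self && !t->dead[i] && t->keys[i] == key)
            return true;
    return false;
}

/* Returns true if the row was kept, false if it was backed out. */
bool toy_speculative_insert(ToyTable *t, int key)
{
    assert(t->nrows < TOY_MAX_ROWS);

    /* Imitates acquiring the speculative lock, stamping the token on the
     * tuple, and heap_insert() with HEAP_INSERT_SPECULATIVE. */
    int slot = t->nrows++;
    t->keys[slot] = key;
    t->speculative[slot] = true;
    t->dead[slot] = false;

    /* Imitates ExecInsertIndexTuples() reporting a conflict. */
    if (toy_index_conflict(t, key, slot))
    {
        /* Imitates heap_abort_speculative(): back out, no error raised. */
        t->dead[slot] = true;
        t->speculative[slot] = false;
        return false;
    }

    /* Imitates heap_finish_speculative(): confirm the tuple. */
    t->speculative[slot] = false;
    return true;
}

/* Counts rows that were inserted and not backed out. */
int toy_live_rows(const ToyTable *t)
{
    int live = 0;
    for (int i = 0; i < t->nrows; i++)
        if (!t->dead[i])
            live++;
    return live;
}
```

Inserting keys 1, 2 and then 1 again into a zero-initialized ToyTable keeps the first two rows and backs out the third, without anything resembling an error or a subtransaction.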

>> My advice right now is: see if you can figure out a way of doing what
>> you want without subtransactions at all, possibly by cutting some
>> scope. For example, maybe it would be satisfactory to have the
>> implementation just ignore constraint violations, but still raise
>> errors for invalid input for types.
>
>
> Initially I was thinking only about malformed rows, e.g. less or extra
> columns. Honestly, I did not know that there are so many levels and ways
> where error can occur.

My sense is that it's going to be hard to sell a committer on any
design that consumes subtransactions in a way that's not fairly
obvious to the user, and doesn't have a pretty easily understood worst
case. But, that's just my opinion, and it's possible that someone else
will disagree. Try to get a second opinion.

Limiting the feature to just skip rows on the basis of a formally
defined constraint failing (not including type input failure, or a
trigger throwing an error, and probably not including foreign key
failures because they're really triggers) might be a good approach.
MySQL's INSERT IGNORE is a bit like that, I think. (It doesn't *just*
ignore duplicate violations, unlike our ON CONFLICT DO NOTHING
feature.)

I haven't thought about this very carefully, but I guess you could do
something like passing a flag to ExecConstraints() that indicates
"don't throw an error; instead, just return false so I know not to
proceed". Plus maybe one or two other cases, like using speculative
insertion to back out of a unique violation without consuming a subxact.
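
The flag idea can be sketched with a toy model in plain C. Everything here is hypothetical: ToyRow, toy_check_constraints() and toy_copy_rows() are invented for illustration, the single hard-coded rule (qty >= 0) stands in for a table's CHECK constraints, and exit() stands in for raising an error. It is not the real ExecConstraints() signature.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy model only: nothing below is a PostgreSQL type or function. */
typedef struct ToyRow
{
    int id;
    int qty;
} ToyRow;

/*
 * Hypothetical constraint check for the rule "qty >= 0".
 * With no_error = false a violation is fatal (stand-in for raising an error).
 * With no_error = true a violation just returns false so the caller can skip.
 */
bool toy_check_constraints(const ToyRow *row, bool no_error)
{
    if (row->qty < 0)
    {
        if (!no_error)
        {
            fprintf(stderr, "row %d violates check constraint\n", row->id);
            exit(EXIT_FAILURE);
        }
        return false;
    }
    return true;
}

/*
 * Copies rows that pass the check into out[] and counts the rest in *skipped.
 * Returns the number of rows loaded.
 */
int toy_copy_rows(const ToyRow *rows, int n, ToyRow *out, int *skipped)
{
    int loaded = 0;

    *skipped = 0;
    for (int i = 0; i < n; i++)
    {
        if (toy_check_constraints(&rows[i], true))
            out[loaded++] = rows[i];
        else
            (*skipped)++;
    }
    return loaded;
}
```

The point of the design is that the caller decides what a violation means: the ordinary path keeps the hard error, while a COPY-like loader passes the flag and simply moves on to the next row.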

-- 
Peter Geoghegan


