Re: Split copy.c

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Split copy.c
Дата
Msg-id 21d9ed60-1269-c1f2-7f89-6bd722d535e2@iki.fi
обсуждение исходный текст
Ответ на Re: Split copy.c  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Split copy.c
Список pgsql-hackers
On 03/11/2020 04:15, David Rowley wrote:
> On Tue, 3 Nov 2020 at 07:35, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2020-11-02 19:43:38 +0200, Heikki Linnakangas wrote:
>>> On 02/11/2020 19:23, Andres Freund wrote:
>>>> On 2020-11-02 11:03:29 +0200, Heikki Linnakangas wrote:
>>>>> There isn't much common code between COPY FROM and COPY TO, so I propose
>>>>> that we split copy.c into two: copyfrom.c and copyto.c. See attached. I thin
>>>>> that's much nicer.
>>>>
>>>> Not quite convinced that's the right split - or perhaps there's just
>>>> more potential. My feeling is that splitting out all the DML related
>>>> code would make the code considerably easier to read.
>>>
>>> What do you mean by DML related code?
>>
>> Basically all the insertion related code (e.g CopyMultiInsert*, lots of
>> code in CopyFrom()) and perhaps also the type input invocations.

Hmm. COPY FROM consists of two parts:

1. Parse the input text/CSV/binary format into Datums.

2. Store the Datums in the table.

They're somewhat split already. If you want to only do the parsing part, 
you can use the BeginCopyFrom() and NextCopyFrom() functions to get 
Datums, without storing them to a table. file_fdw uses that.

Yeah, that might indeed be another good split point. Attached is quick 
prototype of that. I tried to avoid doing other changes as part of this 
split, but some further refactoring might be good. Like extracting the 
state for the input parsing from CopyFromStateData into a separate struct.

With these patches:

$ wc -l src/backend/commands/copy*.c
    782 src/backend/commands/copy.c
   1641 src/backend/commands/copyfrom.c
   1646 src/backend/commands/copyfromparse.c
   1363 src/backend/commands/copyto.c
   5432 total

> I quite like the fact that those are static and inline-able.  I very
> much imagine there'd be a performance hit if we moved them out to
> another .c file and made them extern.  Some of those functions can be
> quite hot when copying into a partitioned table.

Agreed.

- Heikki

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jinbao Chen
Дата:
Сообщение: Re: Add table AM 'tid_visible'
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)