Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY

From Tom Lane
Subject Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY
Msg-id 23999.1352921872@sss.pgh.pa.us
In reply to Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
> On 11/14/2012 02:05 PM, Peter Eisentraut wrote:
>> Why don't you filter the data before it gets to stdin?  Some program is
>> feeding the data to "stdin" on the client side.  Why doesn't that do the
>> filtering?  I don't see a large advantage in having the data be sent
>> unfiltered to the server and having the server do the filtering.

> Centralization of processing would be one obvious reason.

If I understand correctly, what you're imagining is that the client
sources data to a COPY FROM STDIN type of command, then the backend
pipes that out to stdin of some filtering program, which it then reads
the stdout of to get the data it processes and stores.

We could in principle make that work, but there are some pretty serious
implementation problems: popen doesn't do this so we'd have to cons up
our own fork and pipe setup code, and we would have to write a bunch of
asynchronous processing logic to account for the possibility that the
filter program doesn't return data in similar-size chunks to what it
reads.  (IOW, it will never be clear when to try to read data from the
filter and when to try to write data to it.)
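
To make the problem above concrete, here is a minimal sketch of the fork-and-pipe setup that popen() does not provide. The helper name run_filter and its signature are hypothetical, invented for illustration; this is not PostgreSQL code. It deliberately uses the naive "write everything, then read everything" sequence, which is only safe for inputs small enough to fit in the kernel pipe buffers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run "cmd" through /bin/sh with BOTH its stdin and
 * its stdout connected to the caller.  Sends "in" to the filter, then
 * collects up to outsz-1 bytes of its output into "out" (NUL-terminated).
 * Returns the number of bytes collected, or -1 on setup failure.
 */
ssize_t
run_filter(const char *cmd, const char *in, char *out, size_t outsz)
{
    int     to_child[2];    /* parent writes [1], child reads [0] */
    int     from_child[2];  /* child writes [1], parent reads [0] */
    pid_t   pid;
    size_t  len = strlen(in);
    ssize_t total = 0;
    ssize_t n;

    assert(outsz > 0);

    if (pipe(to_child) < 0)
        return -1;
    if (pipe(from_child) < 0)
    {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0)
    {
        close(to_child[0]);
        close(to_child[1]);
        close(from_child[0]);
        close(from_child[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: wire the pipe ends to stdin/stdout, then exec the filter. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]);
        close(to_child[1]);
        close(from_child[0]);
        close(from_child[1]);
        execl("/bin/sh", "sh", "-c", cmd, (char *) NULL);
        _exit(127);             /* exec failed */
    }

    /* Parent: close the ends that belong to the child. */
    close(to_child[0]);
    close(from_child[1]);

    /*
     * Naive sequencing: write all input, then read all output.  If the
     * filter fills its stdout pipe before consuming all of its stdin,
     * both processes block forever.  Avoiding that requires interleaved,
     * non-blocking reads and writes, i.e. the asynchronous logic
     * described above.
     */
    while (len > 0)
    {
        n = write(to_child[1], in, len);
        if (n < 0)
            break;
        in += n;
        len -= (size_t) n;
    }
    close(to_child[1]);         /* signals EOF to the filter */

    while ((size_t) total < outsz - 1 &&
           (n = read(from_child[0], out + total,
                     outsz - 1 - (size_t) total)) > 0)
        total += n;
    out[total] = '\0';

    close(from_child[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

For example, run_filter("tr a-z A-Z", "copy\n", buf, sizeof buf) would be expected to leave "COPY\n" in buf. Even this stripped-down version is several dozen lines before any of the asynchronous handling is added.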

I think it's way too complicated for the amount of functionality you'd
get.  As Peter says, there's no strong reason not to do such processing
on the client side.  In fact there are pretty strong reasons to prefer
to do it there, like not needing database superuser privilege to invoke
the filter program.

What I'm imagining is a very very simple addition to COPY that just
allows it to execute popen() instead of fopen() to read or write the
data source/sink.  What you suggest would require hundreds of lines and
create many opportunities for new bugs.
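
For comparison, a rough sketch of the popen()-instead-of-fopen() idea in the read direction. The function names and the is_program flag are assumptions made up for this illustration, not anything from the patch under discussion.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: open a data source for reading.  When is_program
 * is true, "target" is a shell command whose stdout we read; otherwise
 * it is a file name.  Either way the caller gets a plain FILE *.
 */
FILE *
open_copy_source(const char *target, bool is_program)
{
    assert(target != NULL);
    return is_program ? popen(target, "r") : fopen(target, "r");
}

/*
 * Hypothetical helper: close a source opened above.  A stream from
 * popen() must be closed with pclose(), which also waits for the
 * command and returns its wait status.
 */
int
close_copy_source(FILE *fp, bool is_program)
{
    return is_program ? pclose(fp) : fclose(fp);
}
```

The data flows one way only, so the code that consumes the FILE * needs no changes and no asynchronous logic is involved.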
        regards, tom lane


