Re: Make COPY extendable in order to support Parquet and other formats

From Aleksander Alekseev
Subject Re: Make COPY extendable in order to support Parquet and other formats
Date
Msg-id CAJ7c6TNFD84KK62xrGP-PDwPM7OESM8=TTv8TjsZpbOuNMnwGA@mail.gmail.com
In response to Re: Make COPY extendable in order to support Parquet and other formats  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List pgsql-hackers
Hi Ashutosh,

> IIUC, you want extensibility in FORMAT argument to COPY command
> https://www.postgresql.org/docs/current/sql-copy.html. Where the
> format is pluggable. That seems useful.
> Another option is to dump the data in csv format but use external
> utility to convert csv to parquet or whatever other format is. I
> understand that that's not going to be as efficient as dumping
> directly in the desired format.

Exactly. However, to clarify, I suspect this may be a bit more
involved than simply extending the FORMAT arguments.

This change per se will not be extremely useful. Currently nothing
prevents an extension author from iterating over a table using
heap_open(), heap_getnext(), and similar APIs and dumping its content
in any format. The user will have to write "dump_table(foo, filename)"
instead of "COPY ..." but that's not a big deal.

The problem is that every new extension has to re-invent things like
figuring out the schema, validating the data, etc. If we could
do this in the core, so that an extension author has to implement only
a minimal list of format-dependent callbacks, that would be really
great. To make the interface practical, though, one will have
to implement a practical extension as well, for instance, a Parquet
one.
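
To make the split between core and extension a bit more concrete, here
is a minimal standalone sketch in plain C. It does not use any
PostgreSQL headers, and every name in it (CopyFormatRoutine, DemoTable,
copy_table_to, demo_csv_routine) is hypothetical, made up for
illustration only. The "core" part validates its input and iterates
over the rows once; the "extension" part only knows how to serialize
one row, here as simple CSV.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-format callbacks; none of these names exist in PostgreSQL. */
typedef struct CopyFormatRoutine
{
    void (*start)(FILE *out, int ncols, const char *const *colnames);
    void (*write_row)(FILE *out, int ncols, const char *const *values);
    void (*end)(FILE *out);
} CopyFormatRoutine;

/* Stand-in for a table whose schema the "core" has already resolved. */
typedef struct DemoTable
{
    int ncols;
    int nrows;
    const char *const *colnames;
    const char *const *cells;   /* row-major, nrows * ncols entries; NULL means SQL NULL */
} DemoTable;

/*
 * "Core" side: generic checks and row iteration, written once for all
 * formats. Returns the number of rows written, or -1 on invalid input.
 */
static int
copy_table_to(const DemoTable *tab, const CopyFormatRoutine *fmt, FILE *out)
{
    if (tab == NULL || fmt == NULL || out == NULL || tab->ncols <= 0)
        return -1;
    if (fmt->write_row == NULL)
        return -1;

    if (fmt->start)
        fmt->start(out, tab->ncols, tab->colnames);
    for (int r = 0; r < tab->nrows; r++)
        fmt->write_row(out, tab->ncols, tab->cells + (size_t) r * tab->ncols);
    if (fmt->end)
        fmt->end(out);
    return tab->nrows;
}

/* "Extension" side: write one CSV field, quoting it only when needed. */
static void
csv_write_field(FILE *out, const char *v)
{
    if (v == NULL)
        return;                 /* SQL NULL is rendered as an empty field */
    if (strpbrk(v, ",\"\n") == NULL)
    {
        fputs(v, out);
        return;
    }
    fputc('"', out);
    for (; *v; v++)
    {
        if (*v == '"')
            fputc('"', out);    /* double any embedded quote */
        fputc(*v, out);
    }
    fputc('"', out);
}

/* Writes one comma-separated line; used for both the header and data rows. */
static void
csv_write_fields(FILE *out, int ncols, const char *const *values)
{
    assert(out != NULL);
    for (int i = 0; i < ncols; i++)
    {
        if (i > 0)
            fputc(',', out);
        csv_write_field(out, values[i]);
    }
    fputc('\n', out);
}

/* The only thing a format author would have to provide in this sketch. */
static const CopyFormatRoutine demo_csv_routine = {
    .start = csv_write_fields,      /* header line with column names */
    .write_row = csv_write_fields,
    .end = NULL,                    /* CSV needs no trailer */
};
```

Calling copy_table_to(&table, &demo_csv_routine, stdout) would dump a
DemoTable as CSV. A Parquet writer would presumably need a richer
interface (typed values, a footer written from the end callback), which
is why a real extension is needed to validate the design.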

This being said, if it turns out that for some reason this is not
realistic to deliver, ending up with simply extending this part of the
syntax a bit should be fine too.

-- 
Best regards,
Aleksander Alekseev


