Re: Make COPY extendable in order to support Parquet and other formats

From Aleksander Alekseev
Subject Re: Make COPY extendable in order to support Parquet and other formats
Date
Msg-id CAJ7c6TPcsFScSneXHJShZAfatcYS-VqX+TtVU8TAmHwnVoTioQ@mail.gmail.com
In reply to Re: Make COPY extendable in order to support Parquet and other formats  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: Make COPY extendable in order to support Parquet and other formats  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List pgsql-hackers
Hi Ashutosh,

> An extension just for COPY to/from parquet looks limited in
> functionality. Shouldn't this be viewed as an FDW or Table AM support
> for parquet or other formats? Of course the latter is much larger in
> scope compared to the first one. But there may already be efforts
> underway
> https://www.postgresql.org/about/news/parquet-s3-fdw-01-was-newly-released-2179/

Many thanks for sharing your thoughts on this!

We are using parquet_fdw [2], but it is a read-only FDW.

What users typically need is to dump their data as fast as possible in
a given format, and then either upload it to the cloud as historical
data or transfer it to another system (Spark, etc). The data can be
accessed later if needed, but only for reading.

Note that when accessing the historical data with parquet_fdw you
basically have zero ingestion time.

Another possible use case is transferring data to PostgreSQL from
another source. Here the requirements are similar - the data should be
dumped as fast as possible from the source, transferred over the
network and imported as fast as possible.

In other words, I'm personally unaware of use cases where somebody
needs a complete read/write FDW or TableAM implementation for formats
like Parquet, ORC, etc. Also, to my knowledge, these formats are not
particularly optimized for read/write access.

[2]: https://github.com/adjust/parquet_fdw

-- 
Best regards,
Aleksander Alekseev


