Re: New Copy Formats - avro/orc/parquet

From Tomas Vondra
Subject Re: New Copy Formats - avro/orc/parquet
Date
Msg-id 3274b427-fe3a-89fc-2302-52082bf817aa@2ndquadrant.com
In reply to Re: New Copy Formats - avro/orc/parquet  (Nicolas Paris <niparisco@gmail.com>)
Responses Re: New Copy Formats - avro/orc/parquet  (Andres Freund <andres@anarazel.de>)
List pgsql-general

On 02/10/2018 04:30 PM, Nicolas Paris wrote:
>>> I'd find it useful to be able to import/export from postgres to those modern data
>>> formats:
>>> - avro (c writer=https://avro.apache.org/docs/1.8.2/api/c/index.html)
>>> - parquet (c++ writer=https://github.com/apache/parquet-cpp)
>>> - orc (all writers=https://github.com/apache/orc)
>>>
>>> Something like :
>>> COPY table TO STDOUT ORC;
>>>
>>> Would be lovely.
>>>
>>> This would greatly enhance how postgres integrates in big-data ecosystem.
>>>
>>> Any thought ?
>>
>> https://www.postgresql.org/docs/10/static/sql-copy.html
>>
>> "PROGRAM
>>
>>     A command to execute. In COPY FROM, the input is read from standard
>> output of the command, and in COPY TO, the output is written to the standard
>> input of the command.
>>
>>     Note that the command is invoked by the shell, so if you need to pass
>> any arguments to shell command that come from an untrusted source, you must
>> be careful to strip or escape any special characters that might have a
>> special meaning for the shell. For security reasons, it is best to use a
>> fixed command string, or at least avoid passing any user input in it.
>> "
>>
> 
> PROGRAM would involve overhead of transforming data from CSV or
> BINARY to AVRO for example.
> 
> Here, I am talking about native format exports/imports for
> performance considerations.
> 

That is true, but the question is how significant the overhead is. If
it's 50% then reducing it would make perfect sense. If it's 1% then no
one is going to be bothered by it.

Without these numbers it's hard to make any judgments.
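
One rough way to get such numbers is to time the export step and the
conversion step separately and see what share of the total the conversion
takes. The Python sketch below is only an illustration under stated
assumptions: the helper names (export_csv, convert_csv, measure,
overhead_fraction) are hypothetical, the data is generated in memory rather
than read from a table, and JSON lines stands in for Avro/ORC/Parquet. A real
measurement would run COPY ... TO PROGRAM against an actual converter.

```python
import csv
import io
import json
import time


def overhead_fraction(export_seconds: float, convert_seconds: float) -> float:
    """Share of the total run time spent on the conversion step."""
    total = export_seconds + convert_seconds
    if total <= 0:
        raise ValueError("total time must be positive")
    return convert_seconds / total


def export_csv(rows):
    """Stand-in for the server side of COPY ... TO PROGRAM: serialize rows as CSV."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def convert_csv(csv_text):
    """Stand-in for the external converter: parse the CSV and re-encode it.

    JSON lines is only a placeholder target (an assumption of this sketch);
    a real test would write Avro, ORC or Parquet with the matching library.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return "\n".join(json.dumps(row) for row in reader)


def measure(rows):
    """Time both steps and return (export_seconds, convert_seconds)."""
    t0 = time.perf_counter()
    csv_text = export_csv(rows)
    t1 = time.perf_counter()
    convert_csv(csv_text)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1


if __name__ == "__main__":
    # Synthetic rows; the shape and size are arbitrary choices for the sketch.
    rows = [(i, f"name-{i}", i * 0.5) for i in range(100_000)]
    export_s, convert_s = measure(rows)
    print(f"export {export_s:.3f}s, convert {convert_s:.3f}s, "
          f"overhead {overhead_fraction(export_s, convert_s):.0%}")
```

With equal export and conversion times this reports 50%, the case where a
native format would be worth the effort. If conversion takes 1 second out
of 100, it reports 1%.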

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

