Re: New Copy Formats - avro/orc/parquet

From Nicolas Paris
Subject Re: New Copy Formats - avro/orc/parquet
Date
Msg-id 20180211204126.i3sbze3dor2llxty@gmail.com
In reply to Re: New Copy Formats - avro/orc/parquet  (Andres Freund <andres@anarazel.de>)
Replies Re: New Copy Formats - avro/orc/parquet  (Andres Freund <andres@anarazel.de>)
List pgsql-general
On 11 Feb 2018 at 21:03, Andres Freund wrote:
> 
> 
> On February 11, 2018 12:00:12 PM PST, Nicolas Paris <niparisco@gmail.com> wrote:
> >> > That is true, but the question is how significant the overhead is. If
> >> > it's 50% then reducing it would make perfect sense. If it's 1% then no
> >> > one is going to be bothered by it.
> >> 
> >> I think it's pretty clear that it's going to be way, way more than
> >> 1%. 
> >
> >Good news, but I'm not sure I understand why.
> 
> I think you might have misunderstood my reply? I'm saying that going through PROGRAM will have significant overhead.
> I can't quite make sense of the rest of your reply otherwise?
 

True, I misunderstood. Then I agree the computation overhead should be
non-negligible.

I also have the storage and network transfer overhead in mind:
all those new formats are compressed, which is not true of the current
postgres BINARY format, nor obviously of the text-based format. In my
experience, the binary format is 10 to 30% larger than the text one. In
contrast, an ORC file can be up to 10 times smaller than a text-based
format.
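
To illustrate the size argument, here is a minimal Python sketch, not a
measurement. It encodes the same rows by hand in the COPY text layout and
in the COPY BINARY layout, then compresses the text with zlib. The
assumptions: the table is hypothetical (three int4 columns of small
values), the helper names text_copy and binary_copy are made up for this
example, and zlib is only a stand-in for a compressed format (ORC itself
is columnar and is not produced here). Small integers exaggerate the
binary overhead, so the ratios will differ from real data.

```python
import struct
import zlib


def text_copy(rows):
    """Encode rows as COPY text format: tab-separated, newline-terminated."""
    lines = ("\t".join(str(v) for v in row) + "\n" for row in rows)
    return "".join(lines).encode("ascii")


def binary_copy(rows):
    """Encode rows of int4 values following the COPY BINARY file layout."""
    out = bytearray(b"PGCOPY\n\xff\r\n\x00")  # 11-byte signature
    out += struct.pack("!ii", 0, 0)           # flags, header extension length
    for row in rows:
        out += struct.pack("!h", len(row))    # 16-bit field count per tuple
        for v in row:
            out += struct.pack("!ii", 4, v)   # 32-bit length, then int4 value
    out += struct.pack("!h", -1)              # file trailer
    return bytes(out)


# Hypothetical data: three small integer columns.
rows = [(i, i % 100, i % 7) for i in range(10000)]

text = text_copy(rows)
binary = binary_copy(rows)
compressed = zlib.compress(text, 6)  # stand-in for a compressed format

print("text bytes:      ", len(text))
print("binary bytes:    ", len(binary))
print("compressed bytes:", len(compressed))
```

Each binary field carries a 4-byte length word where text spends a single
separator byte, which is one reason the binary output can end up larger
than the text output.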

