Re: VARIANT / ANYTYPE datatype

From Merlin Moncure
Subject Re: VARIANT / ANYTYPE datatype
Msg-id BANLkTikDx9TAbQMbLsFh_AX+-7k-7KSzkQ@mail.gmail.com
In reply to Re: VARIANT / ANYTYPE datatype  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, May 4, 2011 at 8:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> As a followup idea there exists the desire to store records as records
>> and not text representation of same (given differing record types, of
>> course), for which it'd be more worthwhile.
>
> Maybe.  The conventional wisdom is that text representation of data is
> more compact than PG's internal representation by a significant factor
> --- our FAQ says up to 5x, in fact.  I know that that's including row
> overhead and indexes and so on, but I still don't find it to be a given
> that you're going to win on space with this sort of trick.

I've done a lot of testing of the text vs. binary wire formats...not
exactly the same set of issues, but pretty close, since the binary
format has to send all the oids, lengths, etc.   Conventional wisdom is
correct, although overstated for this topic.  Even in truly
pathological cases for text, for example multiple levels of redundant
escaping in complex structures, the text format will almost always be
smaller.  For 'typical' data it can be significantly smaller.  The two
exceptions most people will run into are bytea, obviously, and the
timestamp family of types, where binary style manipulation is a huge
win in terms of both space and performance.
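
To make the size comparison concrete, here is a rough sketch in Python
(not PostgreSQL code). It assumes that binary int4 is a 4-byte big-endian
integer, that binary timestamps are 8-byte microsecond counts since
2000-01-01 (integer datetimes), and that bytea text output uses the hex
format. The function names are made up for illustration, and the 4-byte
length word that precedes every value in either format is left out
because it is the same on both sides.

```python
import struct
from datetime import datetime

# Assumption: integer datetimes, counted in microseconds from this epoch.
PG_EPOCH = datetime(2000, 1, 1)


def int4_sizes(value):
    """Return (text_bytes, binary_bytes) for one int4 value."""
    text = str(value).encode("ascii")
    binary = struct.pack("!i", value)  # 4-byte big-endian signed integer
    return len(text), len(binary)


def timestamp_sizes(ts):
    """Return (text_bytes, binary_bytes) for one timestamp value."""
    text = ts.strftime("%Y-%m-%d %H:%M:%S.%f").encode("ascii")
    delta = ts - PG_EPOCH
    micros = (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds
    binary = struct.pack("!q", micros)  # 8-byte big-endian signed integer
    return len(text), len(binary)


def bytea_sizes(payload):
    """Return (text_bytes, binary_bytes) for one bytea value (hex text form)."""
    text = b"\\x" + payload.hex().encode("ascii")  # two hex digits per byte
    return len(text), len(payload)


if __name__ == "__main__":
    print("int4 42:", int4_sizes(42))
    print("timestamp:", timestamp_sizes(datetime(2011, 5, 4, 20, 3, 0, 123456)))
    print("bytea, 16 bytes:", bytea_sizes(bytes(16)))
```

Small integers come out smaller as text, while the timestamp and bytea
values come out smaller in binary, which matches the two exceptions
named above.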

For complex data (say 3+ levels of composites stacked in arrays),
binary formats are much *faster*, albeit larger, as long as you are
not bandwidth constrained, and presumably they would be for variants
as well. Perhaps even more so, because some of the manipulations
needed to convert tuple storage to binary wire formats don't have to
happen.  That said, while there are use cases for sending highly
structured data over the wire, I can't think of any for direct storage
in a table in variant type scenarios, at least not yet :-).
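
The "larger" part for nested data can be sketched the same way. The
Python below is an approximation of the binary layouts as I understand
them: a record is a 4-byte column count followed by (type oid, length,
data) per column, and an array is a header of dimension count, null
flag and element oid, then (length, lower bound) per dimension, then
(length, data) per element. The composite type oid 16385 and the helper
names are hypothetical.

```python
import struct

INT4_OID = 23          # pg_type oid of int4
COMPOSITE_OID = 16385  # hypothetical oid of a user-defined (x int4, y int4) type


def record_binary(fields):
    """Encode a record from a list of (type_oid, payload_bytes) columns."""
    out = struct.pack("!i", len(fields))  # column count
    for oid, payload in fields:
        out += struct.pack("!Ii", oid, len(payload)) + payload
    return out


def array_binary(elem_oid, elements):
    """Encode a one-dimensional array with no NULL elements."""
    out = struct.pack("!iiI", 1, 0, elem_oid)    # ndim, has-null flag, element oid
    out += struct.pack("!ii", len(elements), 1)  # dimension length, lower bound
    for payload in elements:
        out += struct.pack("!i", len(payload)) + payload
    return out


def point_binary(x, y):
    """Encode one (x, y) composite of two int4 columns."""
    return record_binary([
        (INT4_OID, struct.pack("!i", x)),
        (INT4_OID, struct.pack("!i", y)),
    ])


def compare(points):
    """Return (text_bytes, binary_bytes) for an array of (x, y) composites."""
    text = "{" + ",".join('"(%d,%d)"' % p for p in points) + "}"
    binary = array_binary(COMPOSITE_OID, [point_binary(x, y) for x, y in points])
    return len(text.encode("ascii")), len(binary)


if __name__ == "__main__":
    print(compare([(1, 2), (3, 4), (5, 6)]))
```

Most of the binary bytes here are oids and length words rather than
data, which is the overhead being traded for cheaper parsing.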

merlin

