Re: A space-efficient, user-friendly way to store categorical data

From: Andrew Dunstan
Subject: Re: A space-efficient, user-friendly way to store categorical data
Date:
Msg-id: CAA8=A7-df9JSaVqHy2bRJBfNP=NjqdfmKHMPbPcM6Cs_3x7RoQ@mail.gmail.com
In reply to: Re: A space-efficient, user-friendly way to store categorical data  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: A space-efficient, user-friendly way to store categorical data
Re: A space-efficient, user-friendly way to store categorical data
List: pgsql-hackers
On Mon, Feb 12, 2018 at 9:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Kane <andrew@chartkick.com> writes:
>> A better option could be a new "dynamic enum" type, which would have
>> similar storage requirements as an enum, but instead of labels being
>> declared ahead of time, they would be added as data is inserted.
>
> You realize, of course, that it's possible to add labels to an enum type
> today.  (Removing them is another story.)
>
> You haven't explained exactly what you have in mind that is going to be
> able to duplicate the advantages of the current enum implementation
> without its disadvantages, so it's hard to evaluate this proposal.
>


This sounds rather like the idea I have been tossing around in my head
for a while, and in sporadic discussions with a few people, for a
dictionary object. The idea is to have an append-only list of labels
which would not obey transactional semantics, and would thus help us
avoid the pitfalls of enums - there wouldn't be any rollback of an
addition.  The use case would be for a jsonb representation which
would replace object keys with the oid value of the corresponding
dictionary entry, rather like enums do now. We could have a per-table
dictionary which in most typical json use cases would be very small,
and we know from some experimental data that the space savings from
such a change would often be substantial.

This would have to be modifiable dynamically rather than requiring
explicit additions to the dictionary, to be of practical use for the
jsonb case, I believe.
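
To make the idea concrete, here is a minimal sketch in Python, not
PostgreSQL code. It assumes a dictionary that hands out small integer
ids (standing in for oids) the first time it sees a key, and never
removes or renumbers an entry. The names KeyDictionary, intern, encode
and decode are hypothetical and exist only for this illustration.

```python
class KeyDictionary:
    """Append-only mapping from labels to small integer ids.

    Hypothetical stand-in for the per-table dictionary: entries are
    only ever added, never removed or renumbered.
    """

    def __init__(self):
        self._ids = {}      # label -> id
        self._labels = []   # id -> label

    def intern(self, label):
        """Return the id for label, adding it on first sight."""
        ident = self._ids.get(label)
        if ident is None:
            ident = len(self._labels)
            self._labels.append(label)
            self._ids[label] = ident
        return ident

    def label(self, ident):
        """Return the label previously assigned to ident."""
        return self._labels[ident]

    def __len__(self):
        return len(self._labels)


def encode(value, dictionary):
    """Replace every object key in a JSON-like value with its id."""
    if isinstance(value, dict):
        return {dictionary.intern(k): encode(v, dictionary)
                for k, v in value.items()}
    if isinstance(value, list):
        return [encode(v, dictionary) for v in value]
    return value


def decode(value, dictionary):
    """Inverse of encode: map ids back to the original keys."""
    if isinstance(value, dict):
        return {dictionary.label(k): decode(v, dictionary)
                for k, v in value.items()}
    if isinstance(value, list):
        return [decode(v, dictionary) for v in value]
    return value


if __name__ == "__main__":
    d = KeyDictionary()
    doc = {"name": "a", "tags": [{"name": "b"}]}
    packed = encode(doc, d)
    print(packed, len(d))
    print(decode(packed, d) == doc)
```

Keys are added as a side effect of encoding, which is the "modifiable
dynamically" behaviour described above; repeated keys such as "name"
share one entry, which is where the space saving would come from. The
sketch says nothing about concurrency or the non-transactional
semantics, which are the hard parts.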

I hadn't thought about this as a sort of super enum that was usable
directly by users, but it makes sense.

I have no idea how hard or even possible it would be to implement.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

