Re: A space-efficient, user-friendly way to store categorical data

Поиск
Список
Период
Сортировка
От Mark Dilger
Тема Re: A space-efficient, user-friendly way to store categorical data
Дата
Msg-id A838A06F-5481-49BF-A4F1-23A17F7779DB@gmail.com
обсуждение исходный текст
Ответ на Re: A space-efficient, user-friendly way to store categorical data  (Andrew Kane <andrew@chartkick.com>)
Список pgsql-hackers
> On Feb 12, 2018, at 5:08 PM, Andrew Kane <andrew@chartkick.com> wrote:
>
> Thanks everyone for the feedback. The current enum implementation requires you to create a new type and add labels
outsidea transaction prior to an insert. 
>
> -- on table creation
> CREATE TYPE city AS ENUM ();
> CREATE TABLE "users" ("city" city);
>
> -- on insert
> ALTER TYPE city ADD VALUE IF NOT EXISTS 'Chicago';
> BEGIN;
> INSERT INTO "users" ("city") VALUES ('Chicago');
> COMMIT;
>
> What would be ideal:
>
> -- on table creation
> CREATE TABLE "users" ("city" dynamic_enum);
>
> -- on insert
> BEGIN;
> INSERT INTO "users" ("city") VALUES ('Chicago');
> COMMIT;
>
> Since enums have a fixed number of labels, this type of feature may be better off as a property you could add to text
columns(as Thomas mentions). This would avoid issues with hitting the max number of labels. 


In your proposed feature, what happens if I create two tables:

CREATE TABLE myusers (city dynamic_enum);
CREATE TABLE yourusers (city dynamic_enum);

Do you imagine that myusers and yourusers are referring to the
same enum or to two different enums?  Are the enums stored in
a new table within pg_catalog, or are they stored in something akin
to a toast table?  If you insert billions of rows into a table, but only
have 30 distinct values, can you quickly query for all 30 distinct enum
values, or would you have to walk billions of rows to find them all?

mark




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Typo in origin.c
Следующее
От: Mark Dilger
Дата:
Сообщение: Re: A space-efficient, user-friendly way to store categorical data