Re: Storing large documents - one table or partition by doc?

Поиск

Список

Период

Сортировка

От	Dev Nop
Тема	Re: Storing large documents - one table or partition by doc?
Дата	26 сентября 2016 г. 11:27:39
Msg-id	CACjtUOR-YX1NW9iyE=3K0rsAXvwgxHAB9iSjR5WCr3CupfSbxA@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Storing large documents - one table or partition by doc? (Dev Nop <devnop0@gmail.com>)
Список	pgsql-performance

Дерево обсуждения

If GUIDs *stored in a binary format* were too large, then you won't be terribly happy with the 24 byte per-row overhead in Postgres.

Heh. In this case the ids have a life outside the database in various text formats.

What I would look into at this point is using int ranges and arrays to greatly reduce your overhead:
CREATE TABLE ...(
document_version_id int NOT NULL REFERENCES document_version
, document_line_range int4range NOT NULL
, document_lines text[] NOT NULL
, EXCLUDE USING gist( document_version_id =, document_line_range && )
);

Thanks! Some new things for me to learn about there. Had to read "Range Types: Your Life Will Never Be The Same" - lol. https://wiki.postgresql.org/images/7/73/Range-types-pgopen-2012.pdf

To check I understand what you are proposing: the current version and history is stored in the same table. Each line is referred to by a sequential line number and then lines are stored in sequential chunks with range + array. The gist index is preventing any insert with the same version & line range. This sounds very compact for a static doc but doesn't it mean lines must be renumbered on inserts/moves?

On Mon, Sep 26, 2016 at 9:26 AM, Dev Nop <devnop0@gmail.com> wrote:

If GUIDs *stored in a binary format* were too large, then you won't be terribly happy with the 24 byte per-row overhead in Postgres.

Heh. In this case the ids have a life outside the database in various text formats.

What I would look into at this point is using int ranges and arrays to greatly reduce your overhead:
CREATE TABLE ...(
document_version_id int NOT NULL REFERENCES document_version
, document_line_range int4range NOT NULL
, document_lines text[] NOT NULL
, EXCLUDE USING gist( document_version_id =, document_line_range && )
);

Thanks! Some new things for me to learn about there. Had to read "Range Types: Your Life Will Never Be The Same" - lol. https://wiki.postgresql.org/images/7/73/Range-types-pgopen-2012.pdf

To check I understand what you are proposing: the current version and history is stored in the same table. Each line is referred to by a sequential line number and then lines are stored in sequential chunks with range + array. The gist index is preventing any insert with the same version & line range. This sounds very compact for a static doc but doesn't it mean lines must be renumbered on inserts/moves?

В списке pgsql-performance по дате отправления:

Предыдущее

От: julyanto SUTANDANG
Дата: 26 сентября 2016 г., 09:28:17
Сообщение: Re: Millions of tables

Следующее

От: Álvaro Hernández Tortosa
Дата: 26 сентября 2016 г., 11:29:03
Сообщение: Re: Millions of tables

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Storing large documents - one table or partition by doc?

Предыдущее

Следующее