Re: Storing large documents - one table or partition by doc?

Поиск
Список
Период
Сортировка
От Jim Nasby
Тема Re: Storing large documents - one table or partition by doc?
Дата
Msg-id 649c3d87-0df1-a8e2-49da-bc200bc16e86@BlueTreble.com
обсуждение исходный текст
Ответ на Re: Storing large documents - one table or partition by doc?  (Dev Nop <devnop0@gmail.com>)
Список pgsql-performance
On 9/24/16 6:33 AM, Dev Nop wrote:
> This means that the applications are sensitive to the size of ids. A
> previous incarnation used GUIDs which was a brutal overhead for large
> documents.

If GUIDs *stored in a binary format* were too large, then you won't be
terribly happy with the 24 byte per-row overhead in Postgres.

What I would look into at this point is using int ranges and arrays to
greatly reduce your overhead:

CREATE TABLE ...(
   document_version_id int NOT NULL REFERENCES document_version
   , document_line_range int4range NOT NULL
   , document_lines text[] NOT NULL
   , EXCLUDE USING gist( document_version_id =, document_line_range && )
);

That allows you to store the lines of a document as an array of values, ie:

INSERT INTO ... VALUES(
   1
   , '[11-15]'
   , '[11:15]={line11,line12,line13,line14,line15}'
);

Note that I'm using explicit array bounds syntax to make the array
bounds match the line numbers. I'm not sure that's a great idea, but it
is possible.


> My nightmares are of a future filled with hours of down-time caused by
> struggling to restore a gargantuan table from a backup due to a problem
> with just one tiny document or schema changes that require disconnecting
> all clients for hours when instead I could ignore best practice, create
> 10k tables and process them iteratively and live in a utopia where I
> never have 100% downtime only per document unavailability.

At some size you'd certainly want partitioning. The good news is that
you can mostly hide partitioning from the application and other database
logic, so there's not a lot of incentive to set it up immediately. You
can always do that after the fact.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


В списке pgsql-performance по дате отправления:

Предыдущее
От: Dev Nop
Дата:
Сообщение: Re: Storing large documents - one table or partition by doc?
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: Storing large documents - one table or partition by doc?