Re: partitioning an existing table

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: partitioning an existing table
Дата
Msg-id 20171230053821.GE4172@telsasoft.com
обсуждение исходный текст
Ответ на partitioning an existing table  (Robert Blayzor <rblayzor.bulk@inoc.net>)
Ответы Re: partitioning an existing table
Список pgsql-performance
On Fri, Dec 29, 2017 at 11:37:56PM -0500, Robert Blayzor wrote:
> The docs claim that the master table “should” be empty. It it possible to just create child tables off an existing
mastertable with data, then just inserting data into the new child tables.
 
> 
> THe plan would be to keep data in the master table and purge it over time until it’s eventually empty, then drop the
indexesas well.
 
> 
> Fully understanding that data needs to be placed in the right child tables. Data outside of those child ranges would
remainas “old data” in the master table.
 
> 
> Just trying to grab if that’s an acceptable migration of live data from a single large table and move into
partitioning.Think of it as a very large table of cyclic data that ages out. New data in child tables while removing
datafrom the master table over time.
 

For PG10 "partitions" (as in relkind='p') the parent is defined as empty
(actually has no underlying storage).

For inheritance (available in and before PG10), the parent may be nonempty,
which works fine, although someone else might find it unintuitive.  (Does the
doc actually say "should" somewhere ?)

You almost certainly want child tables to have constraints, to allow
constraint_exclusion (which is the only reason one child table is more "right"
than any other, besides the associated pruning/retention schedule).

Since you'll be running DELETE rather than DROP on the parent, you might
consider DELETE ONLY..  but it won't matter if your children's constraints are
usable with DELETE's WHERE condition.

Also, note that autoanalyze doesn't know to analyze the PARENT's statistics
when its children are INSERTED/DROPPED/etc.  So I'd suggest to consider ANALYZE
each parent following DROP of its children (or maybe on some more frequent
schedule to handle inserted rows, too).  Perhaps that should be included as a
CAVEAT?
https://www.postgresql.org/docs/10/static/ddl-inherit.html#DDL-INHERIT-CAVEATS

Just curious: are your constraints/indices on starting time or ending time?

BTW depending on your requirements, it may be possible to make pg_dump much
more efficient.  For our data, it's reasonable to assume that a table is
"final" if its constraints exclude data older than a few days ago, and it can
be permanently dumped and excluded from future, daily backups, which makes the
backups smaller and faster, and probably causes less cache churn, etc.  But I
imagine you might have different requirements, so that may be infeasible, or
you'd maybe have to track insertions, either via pg_stat_user_tables, or at the
application layer, and redump the relevant table.

Justin


В списке pgsql-performance по дате отправления:

Предыдущее
От: Robert Blayzor
Дата:
Сообщение: partitioning an existing table
Следующее
От: Justin Pryzby
Дата:
Сообщение: analyze stats: child vs parent