Re: best practice for moving millions of rows to child table when setting up partitioning?

Поиск
Список
Период
Сортировка
От Mark Stosberg
Тема Re: best practice for moving millions of rows to child table when setting up partitioning?
Дата
Msg-id ip9tkc$jek$1@dough.gmane.org
обсуждение исходный текст
Ответ на best practice for moving millions of rows to child table when setting up partitioning?  (Mark Stosberg <mark@summersault.com>)
Ответы Re: Re: best practice for moving millions of rows to child table when setting up partitioning?  ("ktm@rice.edu" <ktm@rice.edu>)
Re: Re: best practice for moving millions of rows to child table when setting up partitioning?  (Scott Marlowe <scott.marlowe@gmail.com>)
Список pgsql-admin
On 04/27/2011 10:48 AM, Mark Stosberg wrote:
>
> Hello,
>
> I'm working on moving a table with over 30 million to rows to be
> partitioned. The table seeing several inserts per second. It's
> essentially an activity log that only sees insert activity and is
> lightly used for reporting, such that queries against it can safely be
> disabled during a transition.
>
> I'm looking for recommendations for a way to do this that will be least
> disruptive to the flow of inserts statements that will continue to
> stream in.
>
> Here's the plan which seems best to me at the moment. Is it is
> reasonable?

I revised my plan based on feedback and mentioned resources here, and
also due to my own concerns about the resource and performance impact of
using the INSERT .. SELECT pattern on millions of rows at a time.

Here's my revised plan for the transition, which avoids using large
INSERT..SELECT statements, and only requires a minimal amount of the
transition to work to happen once the application has started to insert
data into the child tables. Seem reasonable?

New Plan
---------

1. Create the child tables targeted to contain data from the parent as
   standalone tables (including archive tables and the current month).

2. Take a full database backup from this morning and extract the COPY
   statement for the parent table. Manually split it up by date to create
   multiple copy statements, one for each partition.

3. Run the COPY statements to load the data into each child table.  So
   far, everything has happened outside of application access.

4. Find the max ID that has been inserted in the current child table.

5. INSERT .. SELECT the missing rows from the last backup from the parent
   table to the current child table to be. Again, note the max ID.

6. Now, during a maintenance window:
   - alter the child tables to inherit the parent
   - Set up the trigger which starts redirecting inserts
     from the parent table to the child table.
   - INSERT .. SELECT the file the final few missing rows from
     the parent to the current child
   - TRUNCATE then CLUSTER the parent table.

В списке pgsql-admin по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Starting psql without a database?
Следующее
От: ramon cruel
Дата:
Сообщение: Please, i want exit here