Re: [GENERAL] efficiently migrating 'old' data from one table to another

Поиск
Список
Период
Сортировка
От Merlin Moncure
Тема Re: [GENERAL] efficiently migrating 'old' data from one table to another
Дата
Msg-id CAHyXU0y5nY6Zp-mYANNbyOa0L=9HMVBbrhJaLq5HHcNeTenAEA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [GENERAL] efficiently migrating 'old' data from one table to another  ("" <kbrannen@pwhome.com>)
Список pgsql-general
On Fri, Jan 13, 2017 at 12:03 PM,  <kbrannen@pwhome.com> wrote:
> On Jan 12, 2017, Jonathan Vanasco <postgres@2xlp.com> wrote:
>>On Jan 12, 2017, at 5:52 PM, Merlin Moncure wrote:
>>
>>> On Thu, Jan 12, 2017 at 2:19 PM, btober@computer.org
>>> <btober@broadstripe.net> wrote:
>>>>
>>>> Review manual section 7.8.2. Data-Modifying Statements in WITH
>>>>
>>>>
>>>> https://www.postgresql.org/docs/9.6/static/queries-with.html
>>>
>>> this.
>>>
>>> with data as (delete from foo where ... returning * ) insert into
>>> foo_backup select * from data;
>>
>>Thanks, btober and merlin.  that's exactly what i want.
>
> To help you a little more, I just did this for a set of tables within the
> last week. :) The heart of the program is this sql:
>
>     my $Chunk_size = 10000;
>     my $Interval = 24;
>     my $sql = "
>         WITH
>             keys AS (
>                 SELECT $pk_column
>                 FROM $table
>                 WHERE $time_column < NOW() - '$Interval MONTHS'::INTERVAL
>                 ORDER BY $pk_column
>                 LIMIT $Chunk_size ),
>             data AS (
>                 DELETE FROM $table
>                 WHERE $pk_column <= (SELECT MAX($pk_column) FROM keys)
>                 RETURNING * )
>         INSERT INTO archive_$table SELECT * FROM data;";
>
> That's from Perl, but I suspect you can guess as to what each var should be for
> your application. You can set $Chunk_size to whatever you want. There is
> obviously a loop around that which executes until we get 0 rows, then we move
> on to the next table.
>
> The point of the chunks was to limit the impact on the production tables
> as we move data out of them. If you don't have that concern and want to do all
> rows at once then remove the LIMIT and ORDER BY.

FYI, although it's likely ok in this instance, directly inserting
table names without precaution is considered dubious and should be
avoided as practice.  SQL injection is a risk, and your code will fail
in the presence of unusual bug legal table names containing spaces.

For posterity handling this kind of action inside the database (via
plpgsql/EXECUTE) in order to leverage some internal routines,
especially quote_ident(), is generally a good idea.

merlin


В списке pgsql-general по дате отправления:

Предыдущее
От: "Peter J. Holzer"
Дата:
Сообщение: Re: [GENERAL] Are new connection/security features in order, givenconnection pooling?
Следующее
От: "Christopher J. Bottaro"
Дата:
Сообщение: [GENERAL] Streaming replication protocol