Re: Add --include-table-data-where option to pg_dump, to export onlya subset of table data

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: Add --include-table-data-where option to pg_dump, to export onlya subset of table data
Дата
Msg-id 20180521160517.GB27724@tamriel.snowman.net
обсуждение исходный текст
Ответ на Add --include-table-data-where option to pg_dump, to export only asubset of table data  (Carter Thaxton <carter.thaxton@gmail.com>)
Ответы Re: Add --include-table-data-where option to pg_dump, to export onlya subset of table data  (Thomas Munro <thomas.munro@enterprisedb.com>)
Список pgsql-hackers
Greetings,

* Carter Thaxton (carter.thaxton@gmail.com) wrote:
> Many times I've wanted to export a subset of a database, using some sort of
> row filter condition on some of the large tables.  E.g. copying a
> production database to a staging environment, but with some time series
> data only from the past month.
>
> We have the existing options:
>   --include-table=table    (and its -t synonym)
>   --exclude-table=table
>   --exclude-table-data=table
>
> I propose a new option:
>   --include-table-data-where=table:filter_clause
>
> One would use this option as follows:
>
>   pg_dump --include-table-data-where=largetable:"created_at >=
> '2018-05-01'" database_name
>
> The filter_clause is used as the contents of a WHERE clause when querying
> the data to generate the COPY statement produced by pg_dump.

I've wanted something similar to this in the past as well, and, as
you've seen, we have some support for this kind of thing in pg_dump
already and what you're doing is exposing that.

> I've prepared a proposed patch for this, which is attached.  The code
> changes are rather straightforward.  I did have to add the ability to carry
> around an extra pointer-sized object to the simple_list implementation, in
> order to allow the filter clause to be associated to the matching oids of
> the table pattern.  It seemed the best way to augment the existing
> simple_list implementation, but change as little as possible elsewhere in
> the codebase.  (Note that SimpleOidList is actually only used by pg_dump).
>
> Feel free to review and propose any amendments.

I've only taken a quick look but I don't see any regression tests, for
starters, and it's not clear if this can be passed multiple times for
one pg_dump run (I'd certainly hope that it could be...).

Also, if you haven't already, this should be registered on the
commitfest app, so we don't lose track of it.

Thanks!

Stephen

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Allowing printf("%m") only where it actually works
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Postgres 11 release notes