Обсуждение: pg_dump --where option

Поиск
Список
Период
Сортировка

pg_dump --where option

От
Surafel Temesgen
Дата:

Internally pg_dump have capability to filter the table data to dump by same filter clause but it have no interface to use it and the patch here [1] adds interface to it but it have at-least two issue, one is error message in case of incorrect where clause specification is somehow hard to debug and strange to pg_dump .Other issue is it applies the same filter clause to multiple tables if pattern matching return multiple tables and it seems undesired behavior to me because mostly we don’t want to applied the same where clause specification to multiple table. The attached patch contain a fix for both issue

[1].https://www.postgresql.org/message-id/flat/CAGiT_HNav5B=OfCdfyFoqTa+oe5W1vG=PXkTETCxXg4kcUTktA@mail.gmail.com


regards

Surafel

Вложения

Re: pg_dump --where option

От
Cary Huang
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:       tested, failed
Spec compliant:           tested, failed
Documentation:            tested, failed

Hi

I had a look at the patch and it cleanly applies to postgres master branch. I tried to do a quick test on the new
"whereclause" functionality and for the most part it does the job as described and I'm sure some people will find this
featureuseful to their database dump needs. However I tried the feature with a case where I have a subquery in the
whereclause, but it seems to be failing to dump the data. I ran the pg_dump like:
 

  $ pg_dump -d cary --where="test1:a3 = ( select max(aa1) from test2 )" > testdump2
  $ pg_dump: error: processing of table "public.test1" failed

both test1 and test2 exist in the database and the same subquery works under psql.
 
I also notice that the regression tests for pg_dump is failing due to the patch, I think it is worth looking into the
failuremessages and also add some test cases on the new "where" clause to ensure that it can cover as many use cases as
possible.

thank you
Best regards

Cary Huang
-------------
HighGo Software Inc. (Canada)
cary.huang@highgo.ca
www.highgo.ca

Re: pg_dump --where option

От
Daniel Gustafsson
Дата:
> On 10 Jul 2020, at 02:03, Cary Huang <cary.huang@highgo.ca> wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, failed
> Implements feature:       tested, failed
> Spec compliant:           tested, failed
> Documentation:            tested, failed
>
> Hi
>
> I had a look at the patch and it cleanly applies to postgres master branch. I tried to do a quick test on the new
"whereclause" functionality and for the most part it does the job as described and I'm sure some people will find this
featureuseful to their database dump needs. However I tried the feature with a case where I have a subquery in the
whereclause, but it seems to be failing to dump the data. I ran the pg_dump like: 
>
>  $ pg_dump -d cary --where="test1:a3 = ( select max(aa1) from test2 )" > testdump2
>  $ pg_dump: error: processing of table "public.test1" failed
>
> both test1 and test2 exist in the database and the same subquery works under psql.
>
> I also notice that the regression tests for pg_dump is failing due to the patch, I think it is worth looking into the
failuremessages and also add some test cases on the new "where" clause to ensure that it can cover as many use cases as
possible.

As this is being reviewed, but time is running out in this CF, I'm moving this
to the next CF.  The entry will be moved to Waiting for Author based on the
above review.

cheers ./daniel


Re: pg_dump --where option

От
Surafel Temesgen
Дата:


On Fri, Jul 31, 2020 at 1:38 AM Daniel Gustafsson <daniel@yesql.se> wrote:

>  $ pg_dump -d cary --where="test1:a3 = ( select max(aa1) from test2 )" > testdump2
>  $ pg_dump: error: processing of table "public.test1" failed
>
> both test1 and test2 exist in the database and the same subquery works under psql.
>

This is because pg_dump uses schema-qualified object name I add documentation about to use schema-qualified name when using sub query


 
> I also notice that the regression tests for pg_dump is failing due to the patch, I think it is worth looking into the failure messages and also add some test cases on the new "where" clause to ensure that it can cover as many use cases as possible.


I fix regression test  failure on the attached patch.

I don’t add tests because single-quotes and double-quotes are meta-characters for PROVE too.

regards

Surafel

Вложения

Re: pg_dump --where option

От
Daniel Gustafsson
Дата:
> On 14 Sep 2020, at 12:04, Surafel Temesgen <surafel3000@gmail.com> wrote:
> On Fri, Jul 31, 2020 at 1:38 AM Daniel Gustafsson <daniel@yesql.se <mailto:daniel@yesql.se>> wrote:
>
> >  $ pg_dump -d cary --where="test1:a3 = ( select max(aa1) from test2 )" > testdump2
> >  $ pg_dump: error: processing of table "public.test1" failed
> >
> > both test1 and test2 exist in the database and the same subquery works under psql.
> This is because pg_dump uses schema-qualified object name I add documentation about to use schema-qualified name when
usingsub query 

Documenting something is well and good, but isn't allowing arbitrary SQL
copy-pasted into the query (which isn't checked for schema qualification)
opening up for some of the ill-effects of CVE-2018-1058?

> I don’t add tests because single-quotes and double-quotes are meta-characters for PROVE too.

I'm not sure I follow. Surely tests can be added for this functionality?


How should one invoke this on a multibyte char table name which require
quoting, like --table='"x"' (where x would be an mb char).  Reading the
original thread and trying the syntax from there, it's also not clear how table
names with colons should be handled.  I know they're not common, but if they're
not supported then the tradeoff should be documented.

A nearby thread [0] is adding functionality to read from an input file due to
the command line being too short.  Consumers of this might not run into the
issues mentioned there, but it doesn't seem far fetched that someone who does
also adds a small WHERE clause too.  Maybe these patches should join forces?

cheers ./daniel

[0] CAFj8pRB10wvW0CC9Xq=1XDs=zCQxer3cbLcNZa+qiX4cUH-G_A@mail.gmail.com


Re: pg_dump --where option

От
Michael Paquier
Дата:
On Mon, Sep 14, 2020 at 05:00:19PM +0200, Daniel Gustafsson wrote:
> I'm not sure I follow. Surely tests can be added for this functionality?

We should have tests for that.  I can see that this has not been
answered in two weeks, so this has been marked as returned with
feedback in the CF app.
--
Michael

Вложения