Обсуждение: ON CONFLICT DO NOTHING on pg_dump
Hello,
Sometimes I have to maintain two similar database and I have to update one from the other and notice having the option to add ON CONFLICT DO NOTHING clause to INSERT command in the dump data will allows pg_restore to be done with free of ignore error.
The attache patch add --on-conflect-do-nothing option to pg_dump in order to do the above.
regards
Surafel 
Вложения
>From: Surafel Temesgen [mailto:surafel3000@gmail.com] 
>Subject: ON CONFLICT DO NOTHING on pg_dump
>Sometimes I have to maintain two similar database and I have to update one from the other and notice having the option
toadd ON CONFLICT DO NOTHING clause to >INSERT command in the dump data will allows pg_restore to be done with free of
ignoreerror.
 
Hi,
I feel like that on-conflict-do-nothing support is useful especially coupled with --data-only option.
Only the difference of data can be restored.
>The attache patch add --on-conflect-do-nothing option to pg_dump in order to do the above. 
The followings are some comments.
+      <term><option>--on-conflect-do-nothing</option></term>
Here's a typo: conflect -> conflict. This typo also applies to pg_dump.c
        printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
+       printf(_("  --on-conflect-do-nothing     dump data as INSERT commands with on conflect do nothing\n"));
        printf(_("  --no-comments                do not dump comments\n"));
The output of help should be in alphabetical order according to the convention. So changing the order seems logical. 
Please apply my review to the documentation as well.
By the way, 4d6a854 breaks the patch on this point.
+        This option is not valid unless <option>--inserts</option> is also specified.
+       </para>
       
+       if (dopt.do_nothing && !dopt.dump_inserts)
+               exit_horribly(NULL, "option --on-conflect-do-nothing requires option --inserts\n");
How about mentioning --column-inserts? --on-conflict-do-nothing with --column-inserts should work.
Do you have any plan to support on-conlict-do-update? Supporting this seems to me complicated and take much time so I
don'tmind not implementing this.
 
What do you think about adding some test cases? 
command_fails_like() at 001_basic.pl checks command fail pattern with invalid comnibation of option.
And 002_pg_dump.pl checks the feature iteself.
Regards,
Takeshi Ideriha
			
		On Tue, Jun 12, 2018 at 09:05:23AM +0000, Ideriha, Takeshi wrote: > >From: Surafel Temesgen [mailto:surafel3000@gmail.com] > >Subject: ON CONFLICT DO NOTHING on pg_dump > > >Sometimes I have to maintain two similar database and I have to update one from the other and notice having the optionto add ON CONFLICT DO NOTHING clause to >INSERT command in the dump data will allows pg_restore to be done with freeof ignore error. > > Hi, > I feel like that on-conflict-do-nothing support is useful especially coupled with --data-only option. > Only the difference of data can be restored. But that's additive-only. Only missing rows are restored this way, and differences are not addressed. If you want restore to restore data properly and concurrently (as opposed to renaming a new database into place or whatever) then you'd want a) MERGE, b) dump to generate MERGE statements. A concurrent data restore operation would be rather neat. Nico --
On Tue, Jun 12, 2018 at 12:05 PM, Ideriha, Takeshi <ideriha.takeshi@jp.fujitsu.com> wrote:
thank you for the review 
Hi,
I feel like that on-conflict-do-nothing support is useful especially coupled with --data-only option.
Only the difference of data can be restored.
>The attache patch add --on-conflect-do-nothing option to pg_dump in order to do the above.
The followings are some comments.
+ <term><option>--on-conflect-do-nothing</option></term> 
Here's a typo: conflect -> conflict. This typo also applies to pg_dump.c
printf(_(" --inserts dump data as INSERT commands, rather than COPY\n"));
+ printf(_(" --on-conflect-do-nothing dump data as INSERT commands with on conflect do nothing\n"));
printf(_(" --no-comments do not dump comments\n"));
The output of help should be in alphabetical order according to the convention. So changing the order seems logical.
Please apply my review to the documentation as well.
By the way, 4d6a854 breaks the patch on this point.
+ This option is not valid unless <option>--inserts</option> is also specified.
+ </para>
+ if (dopt.do_nothing && !dopt.dump_inserts)
+ exit_horribly(NULL, "option --on-conflect-do-nothing requires option --inserts\n");
How about mentioning --column-inserts? --on-conflict-do-nothing with --column-inserts should work.
fixed  
Do you have any plan to support on-conlict-do-update? Supporting this seems to me complicated and take much time so I don't mind not implementing this.
i agree its complicated and i don't have a plan to implementing it.   
What do you think about adding some test cases?
command_fails_like() at 001_basic.pl checks command fail pattern with invalid comnibation of option.
And 002_pg_dump.pl checks the feature iteself.
thank you for pointing me that i add basic test and it seems to me the rest of the test is covered by column_inserts test 
Regards,
Takeshi Ideriha
Вложения
>-----Original Message----- >From: Nico Williams [mailto:nico@cryptonector.com] >On Tue, Jun 12, 2018 at 09:05:23AM +0000, Ideriha, Takeshi wrote: >> >From: Surafel Temesgen [mailto:surafel3000@gmail.com] >> >Subject: ON CONFLICT DO NOTHING on pg_dump >> >> >Sometimes I have to maintain two similar database and I have to update one from >the other and notice having the option to add ON CONFLICT DO NOTHING clause to >>INSERT command in the dump data will allows pg_restore to be done with free of >ignore error. >> >> Hi, >> I feel like that on-conflict-do-nothing support is useful especially coupled with >--data-only option. >> Only the difference of data can be restored. > >But that's additive-only. Only missing rows are restored this way, and differences are >not addressed. > >If you want restore to restore data properly and concurrently (as opposed to renaming >a new database into place or whatever) then you'd want a) MERGE, b) dump to >generate MERGE statements. A concurrent data restore operation would be rather >neat. I agree with you though supporting MERGE or ON-CONFLICT-DO-UPDATE seems hard work. Only ON-CONCLICT-DO-NOTHING use case may be narrow. -- Takeshi
Hi, >-----Original Message----- >From: Surafel Temesgen [mailto:surafel3000@gmail.com] >thank you for the review > > Do you have any plan to support on-conlict-do-update? Supporting this seems >to me complicated and take much time so I don't mind not implementing this. > > >i agree its complicated and i don't have a plan to implementing it. Sure. >thank you for pointing me that i add basic test and it seems to me the rest of the test >is covered by column_inserts test Thank you for updating. Just one comment for the code. + if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts)) I think you can remove dopt.column_inserts here because at this point dopt.dump_inserts has already turned true if --column-inserts is specified. But I'm just inclined to think that use case of this feature is vague. Could you specify more concrete use case that is practical for many users? (It would lead to more attention to this patch.) For example, is it useful to backup just before making a big change to DB like delete tables? === Takeshi
On Thu, Jun 14, 2018 at 4:09 PM, Surafel Temesgen <surafel3000@gmail.com> wrote: > > > thank you for pointing me that i add basic test and it seems to me the rest > of the test is covered by column_inserts test @@ -172,6 +172,7 @@ typedef struct _dumpOptions char *outputSuperuser; int sequence_data; /* dump sequence data even in schema-only mode */ + int do_nothing; } DumpOptions; The new structure member appears out of place, can you move up along with other "command-line long options" ? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Sat, Jun 16, 2018 at 11:36 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
@@ -172,6 +172,7 @@ typedef struct _dumpOptions
char *outputSuperuser;
int sequence_data; /* dump sequence data even in schema-only mode */
+ int do_nothing;
} DumpOptions;
The new structure member appears out of place, can you move up along
with other "command-line long options" ?
Done 
regards Surafel
Вложения
On Fri, Jun 15, 2018 at 02:20:21AM +0000, Ideriha, Takeshi wrote: > >From: Nico Williams [mailto:nico@cryptonector.com] > >On Tue, Jun 12, 2018 at 09:05:23AM +0000, Ideriha, Takeshi wrote: > >> Only the difference of data can be restored. > > > >But that's additive-only. Only missing rows are restored this way, and differences are > >not addressed. > > > >If you want restore to restore data properly and concurrently (as opposed to renaming > >a new database into place or whatever) then you'd want a) MERGE, b) dump to > >generate MERGE statements. A concurrent data restore operation would be rather > >neat. > > I agree with you though supporting MERGE or ON-CONFLICT-DO-UPDATE seems hard work. > Only ON-CONCLICT-DO-NOTHING use case may be narrow. Is it narrow, or is it just easy enough to add quickly? And by the way, you don't need MERGE. You can just generate INSERT/ UPDATE/DELETE statements -- MERGE is mainly an optimization on that, and could wait until PG has a MERGE. Nico --
>> I agree with you though supporting MERGE or ON-CONFLICT-DO-UPDATE seems >hard work. >> Only ON-CONCLICT-DO-NOTHING use case may be narrow. > >Is it narrow, or is it just easy enough to add quickly? Sorry for late replay. I read your comment and rethought about it. What I meant by "narrow" is that the number of people who use this new feature seems limited to me. And I had a wrong impression that small improvements are always rejected by hackers. But that's only the case for small improvements with huge source code modification. In short, cost/benefit ratio is low case. This patch have some benefits with source code change is small. So I just wait for other people reviews. >And by the way, you don't need MERGE. You can just generate INSERT/ >UPDATE/DELETE statements -- MERGE is mainly an optimization on that, and could >wait until PG has a MERGE. Oh, thank you for clarifying this. Now I understand it. Best regards, Takeshi Ideriha
Hi, > The new structure member appears out of place, can you move up along > with other "command-line long options" ? > > > >Done > I did regression tests (make check-world) and checked manually pg_dump --on-conflict-do-nothing works properly. Also it seems to me the code has no problem. This feature has advantage to some users with small code change. So I marked it as 'Ready for committer'. I'd like to wait and see committers opinions. Regards, Takeshi Ideriha
On Wed, Jul 11, 2018 at 2:20 PM, Ideriha, Takeshi
<ideriha.takeshi@jp.fujitsu.com> wrote:
> I did regression tests (make check-world) and
> checked manually pg_dump --on-conflict-do-nothing works properly.
> Also it seems to me the code has no problem.
> This feature has advantage to some users with small code change.
>
> So I marked it as 'Ready for committer'.
> I'd like to wait and see committers opinions.
Yeah, it's not an earth-shattering feature, but it might occasionally
be useful and isn't complicated.
+        Add ON CONFLICT DO NOTHING clause in the INSERT commands.
I think this would be better as: Add <literal>ON CONFLICT DO
NOTHING</literal> to <command>INSERT</command> commands.
+    printf(_("  --on-conflict-do-nothing     dump data as INSERT
commands with ON CONFLICT DO NOTHING \n"));
That's slightly misleading... let's just use the same wording again,
eg "add ON CONFLICT DO NOTHING to INSERT commands".
                {"no-unlogged-table-data", no_argument,
&no_unlogged_table_data, 1},
+               {"on-conflict-do-nothing", no_argument,
&on_conflict_do_nothing, 1},
I was tempted to say that this should be in alphabetical order, but
then I noticed that these are only *almost* in alphabetical order, not
quite.  Oh well.
Here's the version I'd like to commit, if there are no objections.
-- 
Thomas Munro
http://www.enterprisedb.com
			
		Вложения
Hi, thanks for the revision.
>
>+        Add ON CONFLICT DO NOTHING clause in the INSERT commands.
>
>I think this would be better as: Add <literal>ON CONFLICT DO NOTHING</literal> to
><command>INSERT</command> commands.
Agreed.
>+    printf(_("  --on-conflict-do-nothing     dump data as INSERT
>commands with ON CONFLICT DO NOTHING \n"));
>
>That's slightly misleading... let's just use the same wording again, eg "add ON
>CONFLICT DO NOTHING to INSERT commands".
Agreed. But you forgot fixing it at pg_dump.c.
So could you please fix this and commit it?
Regards,
Takeshi Ideriha
			
		On Fri, Jul 13, 2018 at 12:33 PM, Ideriha, Takeshi
<ideriha.takeshi@jp.fujitsu.com> wrote:
>>+        Add ON CONFLICT DO NOTHING clause in the INSERT commands.
>>
>>I think this would be better as: Add <literal>ON CONFLICT DO NOTHING</literal> to
>><command>INSERT</command> commands.
>
> Agreed.
>
>>+    printf(_("  --on-conflict-do-nothing     dump data as INSERT
>>commands with ON CONFLICT DO NOTHING \n"));
>>
>>That's slightly misleading... let's just use the same wording again, eg "add ON
>>CONFLICT DO NOTHING to INSERT commands".
>
> Agreed. But you forgot fixing it at pg_dump.c.
> So could you please fix this and commit it?
Right, thanks.
I noticed one more thing: pg_dumpall.c doesn't really need to prohibit
--on-conflict-do-nothing without --insert.  Its existing validation
rejects illegal combinations of the settings that are *not* passed on
to pg_dump.  It seems OK to just pass those on and let pg_dump
complain.  For example, if you say "pg_dumpall --data-only
--schema-only", it's pg_dump that complains, not pg_dumpall.  I think
we should do the same thing here.
Pushed, with those changes.
Thanks for the patch and the reviews!
-- 
Thomas Munro
http://www.enterprisedb.com
			
		>I noticed one more thing: pg_dumpall.c doesn't really need to prohibit >--on-conflict-do-nothing without --insert. Its existing validation rejects illegal >combinations of the settings that are *not* passed on to pg_dump. It seems OK to >just pass those on and let pg_dump complain. For example, if you say "pg_dumpall >--data-only --schema-only", it's pg_dump that complains, not pg_dumpall. I think we >should do the same thing here. Thank you for the clarification. I didn't give thought to pg_dumpall internally running pg_dump. >Pushed, with those changes. Thanks!