Discussion: pg_dump why no indicator of completion


pg_dump why no indicator of completion

From
richard coleman
Date:
Hi all, 

I've been working with pg_dump and one thing that's always struck me as strange is that there never seems to be an indication in the log that the pg_dump process completed successfully.

For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar
etc.

Other than trying to restore the multi terabyte database somewhere else is there any way to know that it actually finished successfully?

Why doesn't pg_dump add a line in the output like:
pg_backup finished
when it's completed successfully? 

This seems like a terrible oversight.

Am I missing something obvious?

Thanks, 
rik.

Re: pg_dump why no indicator of completion

From
Gavan Schneider
Date:
On 1 May 2023, at 10:18, richard coleman wrote:

> Why doesn't pg_dump add a line in the output like:
>
> pg_backup finished
>
And, consider an option to include the MD5 and/or SHA256 checksum(s) for what should be on the file system.

> when it's completed successfully?
>
Or, more generally, the appropriate status

From a basic perspective we can expect the *nix shell to get a non-zero exit status from pg_dump in the event of failure, but this is not an explicit promise in the diagnostics section of my copy of the man entry (maybe too obvious :).
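For what it's worth, a rough sketch of both ideas together (the paths, dump name, and job count below are placeholders, not anything from this thread):

    pg_dump --format=directory --jobs=8 --file=/backups/prod_dir prod
    status=$?
    if [ $status -ne 0 ]; then
        echo "pg_dump failed with exit status $status" >&2
        exit $status
    fi
    # record checksums of everything the dump wrote, for later verification
    find /backups/prod_dir -type f -exec sha256sum {} + > /backups/prod_dir.sha256
    # verify later with: sha256sum -c /backups/prod_dir.sha256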

Regards
Gavan



Re: pg_dump why no indicator of completion

From
Ron
Date:
On 4/30/23 19:18, richard coleman wrote:
Hi all, 

I've been working with pg_dump and one thing that's always struck me as strange is that there never seems to be an indication in the log that the pg_dump process completed successfully.

For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar
etc.

Other than trying to restore the multi terabyte database somewhere else is there any way to know that it actually finished successfully?

Why doesn't pg_dump add a line in the output like:
pg_backup finished
when it's completed successfully? 

This seems like a terrible oversight.

Am I missing something obvious?

"They" expect you to do the Unix Thing and check $?. Thus, that's what I do; any non-zero value generates an email with a scary subject line, the specifics of which are based on the exact code value.  I'm not near my work computer; otherwise I'd share them with you.

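A rough sketch of that kind of wrapper (the script name, paths, and mail command are assumptions for illustration, not what Ron actually runs):

    #!/bin/bash
    # nightly_dump.sh (hypothetical): run pg_dump, mail the log on failure
    pg_dump --format=directory --jobs=4 --file=/backups/mydb_dir mydb > /backups/mydb.log 2>&1
    rc=$?
    if [ $rc -ne 0 ]; then
        # assumes a mail(1)-style client is installed and configured
        mail -s "BACKUP FAILED: pg_dump exited with code $rc" dba@example.com < /backups/mydb.log
    fi
    exit $rc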
--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
Holger Jakobs
Date:
At least if you use a plain-text dump, you can check with tail dumpFile for the message.
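A plain-text dump ends with a comment reading "-- PostgreSQL database dump complete", so a quick check could look like this (the file name is just an example):

    tail -n 5 mydb.sql | grep -q 'PostgreSQL database dump complete' \
        && echo "dump looks complete" \
        || echo "dump appears truncated"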


Am 1. Mai 2023 02:18:06 MESZ schrieb richard coleman <rcoleman.ascentgl@gmail.com>:
Hi all, 

I've been working with pg_dump and one thing that's always struck me as strange is that there never seems to be an indication in the log that the pg_dump process completed successfully.

For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar
etc.

Other than trying to restore the multi terabyte database somewhere else is there any way to know that it actually finished successfully?

Why doesn't pg_dump add a line in the output like:
pg_backup finished
when it's completed successfully? 

This seems like a terrible oversight.

Am I missing something obvious?

Thanks, 
rik.

Re: pg_dump why no indicator of completion

From
Frank Lanitz
Date:
Hello,

On 01.05.23 02:18, richard coleman wrote:
> Hi all,
> 
> I've been working with pg_dump and one thing that's always struck me as 
> strange is that there never seems to be an indication in the log that 
> the pg_dump process completed successfully.

I don't think it's needed.

How are you running pg_dump? There is a return code 0 if successful or 
!= 0 in case of an error.

Cheers,
Frank


Re: pg_dump why no indicator of completion

From
Frank Lanitz
Date:
On 01.05.23 05:28, Ron wrote:
> "They" expect you to do the Unix Thing and check $?. Thus, that's what I 
> do; any non-zero value generates an email with a scary subject line, the 
> specifics of which are based on the exact code value.  I'm not near my 
> work computer; otherwise I'd share them with you.

Or put it into a systemd unit and check for failed ones inside your monitoring ;)
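As a rough sketch (the unit, database, and path names are made up for illustration), a oneshot service would look roughly like:

    # /etc/systemd/system/pg_dump-mydb.service
    [Unit]
    Description=Logical backup of mydb with pg_dump

    [Service]
    Type=oneshot
    User=postgres
    ExecStart=/usr/bin/pg_dump --format=directory --jobs=4 --file=/backups/mydb_dir mydb

triggered by a matching .timer (or by hand), with monitoring simply watching for failed units:

    systemctl list-units --state=failed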

.f


Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Ron, 
So what you are saying is that I have to write a shell script around pg_dump to catch the exit code of pg_dump since the developers of pg_dump didn't have pg_dump simply write an exit message?

Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run, and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump and its eventual termination. It could have completed successfully, crashed, been killed, etc.

Or am I mistaken?

thanks, 
rik. 

On Sun, Apr 30, 2023 at 11:28 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 4/30/23 19:18, richard coleman wrote:
Hi all, 

I've been working with pg_dump and one thing that's always struck me as strange is that there never seems to be an indication in the log that the pg_dump process completed successfully.

For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar
etc.

Other than trying to restore the multi terabyte database somewhere else is there any way to know that it actually finished successfully?

Why doesn't pg_dump add a line in the output like:
pg_backup finished
when it's completed successfully? 

This seems like a terrible oversight.

Am I missing something obvious?

"They" expect you to do the Unix Thing and check $?. Thus, that's what I do; any non-zero value generates an email with a scary subject line, the specifics of which are based on the exact code value.  I'm not near my work computer; otherwise I'd share them with you.

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
Scott Ribe
Date:
> On May 1, 2023, at 6:56 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:
>
> Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run, and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump and its eventual termination. It could have completed successfully, crashed, been killed, etc.

That sounds like a really bad way to do things...


Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Scott, 

Which, running more than one command from the shell prompt in a single logon session, or not emitting an exit message?

rik.

On Mon, May 1, 2023 at 8:59 AM Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> On May 1, 2023, at 6:56 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:
>
> Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run, and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump and its eventual termination. It could have completed successfully, crashed, been killed, etc.

That sounds like a really bad way to do things...

Re: pg_dump why no indicator of completion

From
Scott Ribe
Date:
> On May 1, 2023, at 7:01 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:
>
> Which, running more than one command from the shell prompt in a single logon session, or not emitting an exit message?

Running multiple long-lived processes in a shell prompt, whose exit values you care about, running in the background, along with "a myriad of other commands", presumably short-lived, in the interim, then looking through "the flotsam and jetsam" for a success message. Sounds like a total mess. Put your important long-lived processes in their own sessions. Or, if you must interleave them, wrap the call to pg_dump in a simple conditional.


Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Scott, 

Generally I run pg_dump nohup in the background with all outputs captured to a file.  Finding the exit message is simply a matter of navigating to the bottom of the output file and checking.
Excepting for pg_dump, I don't generally have to worry about exit values as most command line programs are either short-lived and obvious, like ls, or emit their own text exit status.
In my experience, checking exit status codes is what you do when writing programs to determine the status of a subroutine or function that you are calling.  Standalone programs that are expected to take any appreciable amount of time to run generally provide a status output, precisely for this reason.

I hope that the devs reconsider and add a status output message.  If checking the exit status is the only way to determine whether the pg_dump process completed successfully, then I am going to be forced to create a wrapper around the pg_dump executable to capture and emit that status.  If I need to do that, so will every other user who needs to know the exit status and runs pg_dump in a typical Linux workflow.  It seems like a terrible waste to have untold numbers of users constantly reinventing the wheel.

rik.

On Mon, May 1, 2023 at 9:07 AM Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> On May 1, 2023, at 7:01 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:
>
> Which, running more than one command from the shell prompt in a single logon session, or not emitting an exit message?

Running multiple long-lived processes in a shell prompt, whose exit values you care about, running in the background, along with "a myriad of other commands", presumably short-lived, in the interim, then looking through "the flotsam and jetsam" for a success message. Sounds like a total mess. Put your important long-lived processes in their own sessions. Or, if you must interleave them, wrap the call to pg_dump in a simple conditional.

Re: pg_dump why no indicator of completion

From
Ron
Date:
On 5/1/23 08:07, Scott Ribe wrote:
>> On May 1, 2023, at 7:01 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:
>>
>> Which, running more than one command from the shell prompt in a single logon session, or not emitting an exit message?
> Running multiple long-lived processes in a shell prompt, whose exit values you care about, running in the background,

Exactly.  Cron is your friend!!!

> along with "a myriad of other commands", presumably short-lived, in the interim, then looking through "the flotsam and jetsam" for a success message. Sounds like a total mess. Put your important long-lived processes in their own sessions. Or, if you must interleave them, wrap the call to pg_dump in a simple conditional.
 

-- 
Born in Arizona, moved to Babylonia.



Re: pg_dump why no indicator of completion

From
Ron
Date:


On 5/1/23 07:56, richard coleman wrote:
Ron, 
So what you are saying is that I have to write a shell script around pg_dump to catch the exit code of pg_dump since the developers of pg_dump didn't have pg_dump simply write an exit message?

Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run,

Running a days-long pg_dump at any time except special occasions isn't very wise.  Especially from a command prompt.

Binary backup/restore programs are much faster (if for no other reason than they let you do incremental and differential backups).

and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump and its eventual termination. It could have completed successfully, crashed, been killed, etc.

Or am I mistaken?

thanks, 
rik. 

On Sun, Apr 30, 2023 at 11:28 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 4/30/23 19:18, richard coleman wrote:
Hi all, 

I've been working with pg_dump and one thing that's always struck me as strange is that there never seems to be an indication in the log that the pg_dump process completed successfully.

For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar
etc.

Other than trying to restore the multi terabyte database somewhere else is there any way to know that it actually finished successfully?

Why doesn't pg_dump add a line in the output like:
pg_backup finished
when it's completed successfully? 

This seems like a terrible oversight.

Am I missing something obvious?

"They" expect you to do the Unix Thing and check $?. Thus, that's what I do; any non-zero value generates an email with a scary subject line, the specifics of which are based on the exact code value.  I'm not near my work computer; otherwise I'd share them with you.

--
Born in Arizona, moved to Babylonia.

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Ron, 

Are you writing that pg_dump is unfit for purpose and that I should be using a commercial backup solution instead?

rik.

On Mon, May 1, 2023 at 9:43 AM Ron <ronljohnsonjr@gmail.com> wrote:


On 5/1/23 07:56, richard coleman wrote:
Ron, 
So what you are saying is that I have to write a shell script around pg_dump to catch the exit code of pg_dump since the developers of pg_dump didn't have pg_dump simply write an exit message?

Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run,

Running a days-long pg_dump at any time except special occasions isn't very wise.  Especially from a command prompt.

Binary backup/restore programs are much faster (if for no other reason than they let you do incremental and differential backups).

and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump and its eventual termination. It could have completed successfully, crashed, been killed, etc.

Or am I mistaken?

thanks, 
rik. 

On Sun, Apr 30, 2023 at 11:28 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 4/30/23 19:18, richard coleman wrote:
Hi all, 

I've been working with pg_dump and one thing that's always struck me as strange is that there never seems to be an indication in the log that the pg_dump process completed successfully.

For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar
etc.

Other than trying to restore the multi terabyte database somewhere else is there any way to know that it actually finished successfully?

Why doesn't pg_dump add a line in the output like:
pg_backup finished
when it's completed successfully? 

This seems like a terrible oversight.

Am I missing something obvious?

"They" expect you to do the Unix Thing and check $?. Thus, that's what I do; any non-zero value generates an email with a scary subject line, the specifics of which are based on the exact code value.  I'm not near my work computer; otherwise I'd share them with you.

--
Born in Arizona, moved to Babylonia.

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Frank, 

As I've mentioned in another response I typically run pg_dump as follows:

nohup pg_dump {various settings} > something.log 2>&1 &

rik.

On Mon, May 1, 2023 at 6:12 AM Frank Lanitz <frank@frank.uvena.de> wrote:
Hello,

On 01.05.23 02:18, richard coleman wrote:
> Hi all,
>
> I've been working with pg_dump and one thing that's always struck me as
> strange is that there never seems to be an indication in the log that
> the pg_dump process completed successfully.

I don't think it's needed.

How are you running pg_dump? There is a return code 0 if successful or
!= 0 in case of an error.

Cheers,
Frank

Re: pg_dump why no indicator of completion

From
Ron
Date:
That's like complaining that a Toyota Camry isn't fit for hauling 10 cubic yards of sand.


On 5/1/23 08:55, richard coleman wrote:
Ron, 

Are you writing that pg_dump is unfit for purpose and that I should be using a commercial backup solution instead?

rik.

On Mon, May 1, 2023 at 9:43 AM Ron <ronljohnsonjr@gmail.com> wrote:


On 5/1/23 07:56, richard coleman wrote:
Ron, 
So what you are saying is that I have to write a shell script around pg_dump to catch the exit code of pg_dump since the developers of pg_dump didn't have pg_dump simply write an exit message?

Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run,

Running a days-long pg_dump at any time except special occasions isn't very wise.  Especially from a command prompt.

Binary backup/restore programs are much faster (if for no other reason than they let you do incremental and differential backups).

and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump and its eventual termination. It could have completed successfully, crashed, been killed, etc.

Or am I mistaken?

thanks, 
rik. 

On Sun, Apr 30, 2023 at 11:28 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 4/30/23 19:18, richard coleman wrote:
Hi all, 

I've been working with pg_dump and one thing that's always struck me as strange is that there never seems to be an indication in the log that the pg_dump process completed successfully.

For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar
etc.

Other than trying to restore the multi terabyte database somewhere else is there any way to know that it actually finished successfully?

Why doesn't pg_dump add a line in the output like:
pg_backup finished
when it's completed successfully? 

This seems like a terrible oversight.

Am I missing something obvious?

"They" expect you to do the Unix Thing and check $?. Thus, that's what I do; any non-zero value generates an email with a scary subject line, the specifics of which are based on the exact code value.  I'm not near my work computer; otherwise I'd share them with you.

--
Born in Arizona, moved to Babylonia.

--
Born in Arizona, moved to Babylonia.

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
Rui DeSousa
Date:

On May 1, 2023, at 9:55 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

Are you writing that pg_dump is unfit for purpose and that I should be using a commercial backup solution instead?


pg_dump is a logical backup.  If you need a fast logical backup; then it’s the right tool.  To get a fast logical backup use the directory format, multiple threads (jobs), and turn off compression.  You shouldn’t have to wait days to get a logical backup; if so, maybe your system is too small and/or disks are too slow.

i.e.  pg_dump --format=d --file=prod --compress=0 --jobs=16 prod
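For what it's worth, the matching parallel restore would look something like this (the target database name is only a placeholder and has to exist first):

    pg_restore --format=d --jobs=16 --dbname=prod_restore prod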

However, for database backups a binary solution is best as it is faster and allows for point in time recovery if archiving is enabled.

Relying on logical backups as a backup solution seems like a bad idea.

-rui

Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Ron, 

That seems to be a very odd stance.  PostgreSQL and its various utilities are and have been used in very large commercial operations for decades.
pg_dump successfully backs up databases from 1MB to 10TB and beyond.  The only issue I am having is that, for some inexplicable reason, the devs haven't included a status message upon completion.

If you don't think that pg_dump is fit for purpose, then what do you propose using in its stead?

thanks, 
rik.

On Mon, May 1, 2023 at 10:07 AM Ron <ronljohnsonjr@gmail.com> wrote:
That's like complaining that a Toyota Camry isn't fit for hauling 10 cubic yards of sand.


On 5/1/23 08:55, richard coleman wrote:
Ron, 

Are you writing that pg_dump is unfit for purpose and that I should be using a commercial backup solution instead?

rik.

On Mon, May 1, 2023 at 9:43 AM Ron <ronljohnsonjr@gmail.com> wrote:


On 5/1/23 07:56, richard coleman wrote:
Ron, 
So what you are saying is that I have to write a shell script around pg_dump to catch the exit code of pg_dump since the developers of pg_dump didn't have pg_dump simply write an exit message?

Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run,

Running a days-long pg_dump at any time except special occasions isn't very wise.  Especially from a command prompt.

Binary backup/restore programs are much faster (if for no other reason than they let you do incremental and differential backups).

and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump and its eventual termination. It could have completed successfully, crashed, been killed, etc.

Or am I mistaken?

thanks, 
rik. 

On Sun, Apr 30, 2023 at 11:28 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 4/30/23 19:18, richard coleman wrote:
Hi all, 

I've been working with pg_dump and one thing that's always struck me as strange is that there never seems to be an indication in the log that the pg_dump process completed successfully.

For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar
etc.

Other than trying to restore the multi terabyte database somewhere else is there any way to know that it actually finished successfully?

Why doesn't pg_dump add a line in the output like:
pg_backup finished
when it's completed successfully? 

This seems like a terrible oversight.

Am I missing something obvious?

"They" expect you to do the Unix Thing and check $?. Thus, that's what I do; any non-zero value generates an email with a scary subject line, the specifics of which are based on the exact code value.  I'm not near my work computer; otherwise I'd share them with you.

--
Born in Arizona, moved to Babylonia.

--
Born in Arizona, moved to Babylonia.

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
Rui DeSousa
Date:


On May 1, 2023, at 10:00 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

As I've mentioned in another response I typically run pg_dump as follows:

nohup pg_dump {various settings} > something.log 2>&1 &

Why redirect standard error to standard out when using pg_dump? I assume standard out is not carrying the backup stream here.

If you want a success message in the log, write the success message to the log file when pg_dump exits successfully.

i.e.:   nohup pg_dump --verbose prod > prod.sql 2> prod.log && echo "Backup Completed Successfully!" >> prod.log &


Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Rui, 

I'll be the first to admit that in this case some of the servers that were procured for PostgreSQL are way too under-specced for the databases they are now hosting.  I am guessing that it's a result of the databases outgrowing their servers.

As I've asked Ron, if pg_dump isn't fit for purpose, then what do you believe is?

Thanks, 
rik.

On Mon, May 1, 2023 at 10:13 AM Rui DeSousa <rui@crazybean.net> wrote:

On May 1, 2023, at 9:55 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

Are you writing that pg_dump is unfit for purpose and that I should be using a commercial backup solution instead?


pg_dump is a logical backup.  If you need a fast logical backup; then it’s the right tool.  To get a fast logical backup use the directory format, multiple threads (jobs), and turn off compression.  You shouldn’t have to wait days to get a logical backup; if so, maybe your system is too small and/or disks are too slow.

i.e.  pg_dump --format=d --file=prod --compress=0 --jobs=16 prod

However, for database backups a binary solution is best as it is faster and allows for point in time recovery if archiving is enabled.

Relying on logical backups as a backup solution seems like a bad idea.

-rui

Re: pg_dump why no indicator of completion

From
Ron
Date:
I backup some databases every night using pg_dump.  From a bash script initiated by cron, the way God intended them to be run.  And I check the return code, like every good production shell script should.

I back up other databases using pgBackRest (also from a bash script initiated by cron, the way God intended them to be run).  And I check the return code, like every good production shell script should (even though it prints a message saying "Finished").

Because who in their right mind manually checks log files every morning, when a shell script will gladly email you if something went wrong?  Make the computer work for you!!!
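As a hedged illustration rather than Ron's actual setup: with a wrapper like the hypothetical nightly_dump.sh sketched earlier in the thread, the crontab entry is just a single line, e.g.

    # run the backup wrapper every night at 02:00
    0 2 * * * /usr/local/bin/nightly_dump.sh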

On 5/1/23 09:31, richard coleman wrote:
Ron, 

That seems to be a very odd stance.  PostgreSQL and its various utilities are and have been used in very large commercial operations for decades.
pg_dump successfully backs up databases from 1MB to 10TB and beyond.  The only issue I am having is that, for some inexplicable reason, the devs haven't included a status message upon completion.

If you don't think that pg_dump is fit for purpose, then what do you propose using in its stead?

thanks, 
rik.

On Mon, May 1, 2023 at 10:07 AM Ron <ronljohnsonjr@gmail.com> wrote:
That's like complaining that a Toyota Camry isn't fit for hauling 10 cubic yards of sand.


On 5/1/23 08:55, richard coleman wrote:
Ron, 

Are you writing that pg_dump is unfit for purpose and that I should be using a commercial backup solution instead?

rik.

On Mon, May 1, 2023 at 9:43 AM Ron <ronljohnsonjr@gmail.com> wrote:


On 5/1/23 07:56, richard coleman wrote:
Ron, 
So what you are saying is that I have to write a shell script around pg_dump to catch the exit code of pg_dump since the developers of pg_dump didn't have pg_dump simply write an exit message?

Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run,

Running a days-long pg_dump at any time except special occasions isn't very wise.  Especially from a command prompt.

Binary backup/restore programs are much faster (if for no other reason than they let you do incremental and differential backups).

and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump and its eventual termination. It could have completed successfully, crashed, been killed, etc.

Or am I mistaken?

thanks, 
rik. 

On Sun, Apr 30, 2023 at 11:28 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 4/30/23 19:18, richard coleman wrote:
Hi all, 

I've been working with pg_dump and one thing that's always struck me as strange is that there never seems to be an indication in the log that the pg_dump process completed successfully.

For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar
etc.

Other than trying to restore the multi terabyte database somewhere else is there any way to know that it actually finished successfully?

Why doesn't pg_dump add a line in the output like:
pg_backup finished
when it's completed successfully? 

This seems like a terrible oversight.

Am I missing something obvious?

"They" expect you to do the Unix Thing and check $?. Thus, that's what I do; any non-zero value generates an email with a scary subject line, the specifics of which are based on the exact code value.  I'm not near my work computer; otherwise I'd share them with you.

--
Born in Arizona, moved to Babylonia.

--
Born in Arizona, moved to Babylonia.

--
Born in Arizona, moved to Babylonia.

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
Scott Ribe
Date:
> On May 1, 2023, at 8:34 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:
>
> As I've asked Ron, if pg_dump isn't fit for purpose, then what do you believe is?

He didn't say pg_dump wasn't fit for purpose. He said use a binary format and multiple processes--which are pg_dump
options.


Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Scott, 

I do use multiple processes when I use the directory output option, which is the only one that supports it.
I will admit that I haven't tried writing custom binary outputs yet, typically using either the builtin custom format or directory format.

rik.

On Mon, May 1, 2023 at 10:48 AM Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> On May 1, 2023, at 8:34 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:
>
> As I've asked Ron, if pg_dump isn't fit for purpose, then what do you believe is?

He didn't say pg_dump wasn't fit for purpose. He said use a binary format and multiple processes--which are pg_dump options.

Re: pg_dump why no indicator of completion

From
"David G. Johnston"
Date:
On Sun, Apr 30, 2023 at 5:18 PM richard coleman <rcoleman.ascentgl@gmail.com> wrote:
For example, I've been running a bunch of backups with pg_dump in directory mode with multiple jobs.  When I come back several hours later there are no pg_dump processes running.  The log is filled with:
dumping contents of foo
finished item 123456 of TABLE DATA bar


I agree that any log file that contains this level of operational detail can (IMO, should) also include an indication of the final outcome of the process.

The arguments about "the Unix way" seem to have forgotten the point about being silent except in cases of exception, and this level of detail already violates that premise.

David J.

Re: pg_dump why no indicator of completion

From
Rui DeSousa
Date:


On May 1, 2023, at 10:34 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

As I've asked Ron, if pg_dump isn't fit for purpose, then what do you believe is?

If you need a logical backup then pg_dump is the right tool; as noted before but with better options to improve performance.

For backups; I would recommend physical backups which there are many solutions for.  I have personally used snapshots and believe that is the best solution.  It is the fastest and best solution to meet recovery time objective (RTO).  For example, I’m able to make a backup of a multi-terabyte database in under a second and multiple backups are cheap.  This allows me to do a backup of the system every 6 hours. The most expensive part of a database restore is going to be applying the WAL files when doing a point in time recovery (PITR).  This means that at most, I would only have to apply around 6 hours of WAL files; which can still take a long time given a high volume system.

In my opinion snapshots are the best solution but it will depend on your infrastructure and requires knowledge of the infrastructure and setting it up correctly.

If I could not use snapshots; I would look into either pgBackRest or pgBarman.   
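Purely to illustrate the snapshot idea, assuming the whole data directory lives on a single ZFS dataset and WAL archiving is already configured (the dataset name is made up):

    # take an atomic, crash-consistent snapshot of the data directory
    zfs snapshot tank/pgdata@backup-$(date +%Y%m%d-%H%M)
    # list existing snapshots
    zfs list -t snapshot tank/pgdata

Restoring such a snapshot behaves like crash recovery; the archived WAL is what lets you roll forward to a later point in time.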

Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Rui, 

Thanks for those suggestions, and your earlier ones.

It would be helpful though if the dev included a completion status message in pg_dump, even if it was only included when you specified the -v flag.

rik.

On Mon, May 1, 2023 at 11:29 AM Rui DeSousa <rui@crazybean.net> wrote:


On May 1, 2023, at 10:34 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

As I've asked Ron, if pg_dump isn't fit for purpose, then what do you believe is?

If you need a logical backup then pg_dump is the right tool; as noted before but with better options to improve performance.

For backups; I would recommend physical backups which there are many solutions for.  I have personally used snapshots and believe that is the best solution.  It is the fastest and best solution to meet recovery time objective (RTO).  For example, I’m able to make a backup of a multi-terabyte database in under a second and multiple backups are cheap.  This allows me to do a backup of the system every 6 hours. The most expensive part of a database restore is going to be applying the WAL files when doing a point in time recovery (PITR).  This means that at most, I would only have to apply around 6 hours of WAL files; which can still take a long time given a high volume system.

In my opinion snapshots are the best solution but it will depend on your infrastructure and requires knowledge of the infrastructure and setting it up correctly.

If I could not use snapshots; I would look into either pgBackRest or pgBarman.   

Re: pg_dump why no indicator of completion

From
Scott Ribe
Date:
> On May 1, 2023, at 9:25 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
>
> The arguments about "the Unix way" seem to have forgotten the point about being silent except in cases of exception, and this level of detail already violates that premise.

It would be a reasonable convenience. I think some of the pushback is from the framing that not having it is a defect.




Re: pg_dump why no indicator of completion

From
M Sarwar
Date:
If somebody is seriously looking into these suggestions, I'd like to raise another point for the development of the pg_dump log.
While pg_dump is listing the backed-up table names, the same row in the log could also include the number of rows backed up for each table.

Thank you,
Sarwar
Working on the project where I do not know the application 🙂



From: richard coleman <rcoleman.ascentgl@gmail.com>
Sent: Monday, May 1, 2023 11:34 AM
To: Rui DeSousa <rui@crazybean.net>
Cc: Ron <ronljohnsonjr@gmail.com>; pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: pg_dump why no indicator of completion
 
Rui, 

Thanks for those suggestions, and your earlier ones.

It would be helpful though if the dev included a completion status message in pg_dump, even if it was only included when you specified the -v flag.

rik.

On Mon, May 1, 2023 at 11:29 AM Rui DeSousa <rui@crazybean.net> wrote:


On May 1, 2023, at 10:34 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

As I've asked Ron, if pg_dump isn't fit for purpose, then what do you believe is?

If you need a logical backup then pg_dump is the right tool; as noted before but with better options to improve performance.

For backups; I would recommend physical backups which there are many solutions for.  I have personally used snapshots and believe that is the best solution.  It is the fastest and best solution to meet recovery time objective (RTO).  For example, I’m able to make a backup of a multi-terabyte database in under a second and multiple backups are cheap.  This allows me to do a backup of the system every 6 hours. The most expensive part of a database restore is going to be applying the WAL files when doing a point in time recovery (PITR).  This means that at most, I would only have to apply around 6 hours of WAL files; which can still take a long time given a high volume system.

In my opinion snapshots are the best solution but it will depend on your infrastructure and requires knowledge of the infrastructure and setting it up correctly.

If I could not use snapshots; I would look into either pgBackRest or pgBarman.   

Re: pg_dump why no indicator of completion

From
Ron
Date:
On 5/1/23 10:50, Scott Ribe wrote:
On May 1, 2023, at 9:25 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:

The arguments about "the Unix way" seem to have forgotten the point about being silent except in cases of exception, and this level of detail already violates that premise.
It would be a reasonable convenience. I think some of the pushback is from the framing that not having it is a defect.

Exactly.  It's a convenience, the lack of which is easily worked around.

What would really be useful are timestamps (both start and completion of dump) and an easy association of ".dat numbers" with table names.
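For the second part there is already a partial workaround (sketch, with a placeholder dump path): the table of contents of a directory-format dump maps dump IDs, i.e. the numbers used in the NNNN.dat file names, to table names.

    pg_restore --list /backups/prod_dir | grep 'TABLE DATA'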

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
Tim
Date:
This has been a very entertaining thread to read through. Thank you.

> I will admit that I haven't tried writing custom binary outputs yet, typically using either the builtin custom format or directory format.

If you really are working with multi-TB databases (kinda hard to believe in the first place, since doing a backup that takes many hours with pg_dump will cause your WAL folder to fill up), then it's time to learn some more mature tools.

pg_dump made sense maybe a decade ago. As recommended above, nowadays people use *pgbackrest* and *pgbarman*.

> Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run,
> and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost
> in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump
> and its eventual termination. It could have completed successfully, crashed, been killed, etc.

Your workflow sounds completely mystifying.

> then I am going to be forced to create a wrapper to capture and emit the status around the pg_dump executable.
> If I need to do that, so will every other user who needs to know the exit status and runs pg_dump in a typical linux
> workflow. 

I'm sorry, but if putting your multi-terabyte backup process into a three-line script that checks the output and maybe emails you, or just appends to the logfile, seems like a crazy idea, you're going to be completely floored once you learn a few extra basic scripting patterns in the world of Linux.

You're basically saying you're entirely constrained to triggering processes from the terminal by hand (please use a cronjob) and, due to that insurmountable limitation, you also won't learn how an && works.

> Seems like a terrible waste having untold numbers of users having to constantly reinvent the wheel.

You might be overstating things a bit.



On Mon, May 1, 2023 at 12:15 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 5/1/23 10:50, Scott Ribe wrote:
On May 1, 2023, at 9:25 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:

The arguments about "the Unix way" seem to have forgotten the point about being silent except in cases of exception, and this level of detail already violates that premise.
It would be a reasonable convenience. I think some of the pushback is from the framing that not having it is a defect.

Exactly.  It's a convenience, the lack of which is easily worked around.

What would really be useful are timestamps (both start and completion of dump) and an easy association of ".dat numbers" with table names.

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Tim, 

I'm glad that I was able to bring you a modicum of joy.

Yes, I have databases that range from tens of megabytes to tens of terabytes.  Most are on servers that are far too underpowered for the databases they are running.  As these databases are both in constant use, with users that have, how do I put it kindly, interesting workflows, it really isn't possible to cron a backup job.  I do use cron for a number of other things though.  Backups have to be on a manual basis for the time being when I can schedule enough time away from the users to complete it.  Managed a 2.2T database this afternoon; it only took about 10 hours with pg_dump in the directory format running 7 jobs.  Amazingly, my WAL folder has yet to fill up, I guess I must be doing it wrong. ¯\_(ツ)_/¯

Since pg_dumpall is limited to SQL, I typically use a custom bash shell script where I use pg_dumpall to back up the globals to a .tar.gz file, and then loop through the cluster calling pg_dump on each database in parallel.  Depending on the situation, I either use one that writes each to a separate compressed custom format (.backup), or a separate directory.  Each backup gets its own log file that records the status with a double verbose switch.  After reading the responses in this thread I've adjusted it to capture the exit status (using the same && that I use in my collection of commands to apt-get upgrade, update, autoremove, and test server updates) and append a "pg_dump complete." message to the end of the log file if pg_dump completes successfully.
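A stripped-down sketch of that kind of script, with made-up paths and none of the real error handling, looks something like:

    #!/bin/bash
    # cluster_dump.sh (hypothetical): globals via pg_dumpall, one parallel pg_dump per database
    outdir=/backups/$(date +%Y%m%d)
    mkdir -p "$outdir"

    pg_dumpall --globals-only | gzip > "$outdir/globals.sql.gz"

    for db in $(psql -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate"); do
        ( pg_dump --format=directory --jobs=2 --file="$outdir/$db" --verbose "$db" \
              > "$outdir/$db.log" 2>&1 \
          && echo "pg_dump complete." >> "$outdir/$db.log" ) &
    done
    wait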

Looking over standard pg_restore log files is what prompted me to ask the question in the first place.  Whenever I run pg_restore it completes with messages along the lines of:
pg_restore: finished main parallel loop
pg_restore: entering restore_toc_entries_postfork
pg_restore: warning: errors ignored on restore: 6
I found it rather odd that pg_dump, being the mirror of pg_restore, didn't do something similar.

We would all like to be able to do things in a "better way" with the best equipment, workflows, etc., but at the end of the day we are all limited by the reality we find ourselves in.

Take care and keep smiling, 
rik.


On Mon, May 1, 2023 at 9:30 PM Tim <timfosho@gmail.com> wrote:
This has been a very entertaining thread to read through. Thank you.

> I will admit that I haven't tried writing custom binary outputs yet, typically using either the builtin custom format or directory format.

If you really are working with multi-TB databases (kinda hard to believe in the first place, since doing a backup that takes many hours with pg_dump will cause your WAL folder to fill up), then it's time to learn some more mature tools.

pg_dump made sense maybe a decade ago. As recommended above, nowadays people use *pgbackrest* and *pgbarman*.

> Since multiple pg_dump commands are often run at the same command prompt, and they can take hours, if not days to run,
> and there are a myriad of other commands that will be run in the interim, any exit code generated by pg_dump will be lost
> in the flotsam and jetsam of the multitude of exit codes created by every other command run between the calling of pg_dump
> and its eventual termination. It could have completed successfully, crashed, been killed, etc.

Your workflow sounds completely mystifying.

> then I am going to be forced to create a wrapper to capture and emit the status around the pg_dump executable.
> If I need to do that, so will every other user who needs to know the exit status and runs pg_dump in a typical linux
> workflow. 

I'm sorry, but if putting your multi-terabyte backup process into a three-line script that checks the output and maybe emails you, or just appends to the logfile, seems like a crazy idea, you're going to be completely floored once you learn a few extra basic scripting patterns in the world of Linux.

You're basically saying you're entirely constrained to triggering processes from the terminal by hand (please use a cronjob) and, due to that insurmountable limitation, you also won't learn how an && works.

> Seems like a terrible waste having untold numbers of users having to constantly reinvent the wheel.

You might be overstating things a bit.



On Mon, May 1, 2023 at 12:15 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 5/1/23 10:50, Scott Ribe wrote:
On May 1, 2023, at 9:25 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:

The arguments about "the Unix way" seem to have forgotten the point about being silent except in cases of exception, and this level of detail already violates that premise.
It would be a reasonable convenience. I think some of the pushback is from the framing that not having it is a defect.

Exactly.  It's a convenience, the lack of which is easily worked around.

What would really be useful are timestamps (both start and completion of dump) and an easy association of ".dat numbers" with table names.

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
Erik Wienhold
Date:
> On 02/05/2023 04:42 CEST richard coleman <rcoleman.ascentgl@gmail.com> wrote:
>
> I'm glad that I was able to bring you a modicum of joy.

I find this entire thread entertaining.

> Backups have to be on a manual basis for the time being when I can schedule
> enough time away from the users to complete it. Managed a 2.2T database this
> afternoon; it only took about 10 hours with pg_dump in the directory format
> running 7 jobs.

Differential/incremental backups with Barman or pgBackRest can cut down on this
time.
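For instance, with pgBackRest (the stanza name is just a placeholder) the cycle is roughly:

    pgbackrest --stanza=main --type=full backup   # occasional full base backup
    pgbackrest --stanza=main --type=diff backup   # changes since the last full
    pgbackrest --stanza=main --type=incr backup   # changes since the last backup of any kind
    pgbackrest --stanza=main info                 # summary of existing backups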

> Amazingly, my WAL folder has yet to fill up, I guess I must be doing it wrong.

Of course there's not much WAL activity if you schedule your backup for a time
window with low client activity.

--
Erik



Re: pg_dump why no indicator of completion

From
Ian Lawrence Barwick
Date:
2023年5月1日(月) 22:26 richard coleman <rcoleman.ascentgl@gmail.com>:
>
> Scott,
>
> Generally I run pg_dump nohup in the background with all outputs captured to a file.  Finding the exit message is simply a matter of navigating to the bottom of the output file and checking.
> Excepting for pg_dump, I don't generally have to worry about exit values as most command line programs are either short-lived and obvious, like ls, or emit their own text exit status.

If you do need to run pg_dump manually, have you considered running each
invocation in a terminal multiplexer such as "screen" or "tmux"? The
thought of long-running processes hanging around in NOHUP purgatory
fills me with a sense of impending doom.
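For example (session, dump, and log names are arbitrary):

    tmux new-session -d -s pgdump \
        'pg_dump --format=directory --jobs=8 --file=/backups/prod_dir prod 2> /backups/prod.log; echo "exit status: $?" >> /backups/prod.log'
    # reattach later with: tmux attach -t pgdump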


Regards

Ian Barwick



Re: pg_dump why no indicator of completion

From
Tim
Date:
 > After reading the responses in this thread I've adjusted it to capture the exit status

Just so I'm not a complete troll, here's your example which also appends a success/fail message :)

nohup bash -c "pg_dump ${connection_details} > something.log 2>&1 && echo 'pg_dump succeeded' >> something.log || echo 'pg_dump failed' >> something.log" &


Re: pg_dump why no indicator of completion

From
Ron
Date:
On 5/1/23 22:21, Tim wrote:
 > After reading the responses in this thread I've adjusted it to capture the exit status

Just so I'm not a complete troll, here's your example which also appends a success/fail message :)

nohup bash -c "pg_dump ${connection_details} > something.log 2>&1 && echo 'pg_dump succeeded' >> something.log || echo 'pg_dump failed' >> something.log" &

Out of curiosity, why this:
    > something.log 2>&1
instead of that?
    &> something.log

Yes, "&>" is a bashism, but so what?

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
Ron
Date:
On 5/1/23 21:42, richard coleman wrote:
As these databases are both in constant use, with users that have, how do I put it kindly, interesting workflows, it really isn't possible to cron a backup job.

You can't edit crontab to run a script a few minutes from "now"?  (My "postgres" crontabs are replete with commented out ad hoc scripts which get uncommented and rescheduled as needed.)

--
Born in Arizona, moved to Babylonia.

Re: pg_dump why no indicator of completion

From
richard coleman
Date:
Tim, 

Thanks, but as I had previously mentioned, I had already taken care of that little detail.

rik.

On Mon, May 1, 2023 at 11:22 PM Tim <timfosho@gmail.com> wrote:
 > After reading the responses in this thread I've adjusted it to capture the exit status

Just so I'm not a complete troll, here's your example which also appends a success/fail message :)

nohup bash -c "pg_dump ${connection_details} > something.log 2>&1 && echo 'pg_dump succeeded' >> something.log || echo 'pg_dump failed' >> something.log" &