Thread: pg_basebackup delays closing of stdout

pg_basebackup delays closing of stdout

From
Jeff Janes
Date:
Ever since pg_basebackup was created, it had a comment like this:

     * End of chunk. If requested, and this is the base tablespace
     * write configuration file into the tarfile. When done, close the
     * file (but not stdout).

But, why make the exception for output going to stdout?  If we are done with it, why not close it?

After a massive maintenance operation, I want to re-seed a streaming standby, which I start to do by: 

pg_basebackup -D - -Ft -P -X none  | pxz > base.tar.xz

But the archiver is way behind, so when it finishes the basebackup part, I get:

NOTICE:  pg_stop_backup cleanup done, waiting for required WAL segments to be archived
WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (60 seconds elapsed)
...

The base backup file is not finalized, because pg_basebackup has not closed its stdout while waiting for the WAL segment to be archived. The file is incomplete due to data stuck in buffers, so I can't copy it to where I want and bring up a new streaming replica (which bypasses the WAL archive, so would otherwise work). Also, if pg_basebackup gets interrupted somehow while it is waiting for WAL archiving, the backup will be invalid, as it won't flush the last bit of data.  Of course if it gets interrupted, I would have to test the backup to make sure it is valid.  But testing it and finding that it is valid is better than testing it and finding that it is not.

I think it makes sense for pg_basebackup to wait for the WAL to be archived, but there is no reason for it to hold the base.tar.xz file hostage while it does so.

If I simply remove the test for strcmp(basedir, "-"), as in the attached, I get the behavior I desire, and nothing bad seems to happen.  Meaning, "make check -C src/bin/pg_basebackup/" still passes (but only tested on Linux).
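
For readers without the attachment, the change described above can be pictured roughly as follows. This is only a sketch, not the PostgreSQL source and not the attached patch: the helper name finish_tar_output and the close_stdout_too flag are hypothetical, introduced here to put the two behaviors side by side. Only the strcmp(basedir, "-") test comes from the message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): finish writing a tar stream.
 *
 * basedir == "-" means the tar stream is going to stdout.
 * close_stdout_too == 0 models the behavior described in the message:
 *   the stream is left open when it is stdout.
 * close_stdout_too == 1 models the proposal: the strcmp() test is
 *   dropped and the stream is always closed.
 *
 * Returns 1 if the stream was closed, 0 if it was left open, -1 on error.
 */
int
finish_tar_output(FILE *tarfile, const char *basedir, int close_stdout_too)
{
    assert(tarfile != NULL && basedir != NULL);

    if (!close_stdout_too && strcmp(basedir, "-") == 0)
        return 0;               /* "close the file (but not stdout)" */

    /* fclose() flushes any buffered data before closing the stream */
    if (fclose(tarfile) != 0)
        return -1;
    return 1;
}
```

With close_stdout_too set to 0 and basedir set to "-", the stream stays open and anything still buffered stays buffered, which matches the symptom reported above.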

Is there a reason not to do this? 

Cheers,

Jeff
Attachments

Re: pg_basebackup delays closing of stdout

From
Andres Freund
Date:
Hi,

On 2019-07-23 22:16:26 -0400, Jeff Janes wrote:
> Ever since pg_basebackup was created, it had a comment like this:
> 
>      * End of chunk. If requested, and this is the base tablespace
>      * write configuration file into the tarfile. When done, close the
>      * file (but not stdout).
> 
> But, why make the exception for output going to stdout?  If we are done
> with it, why not close it?

I think closing stdout is a bad idea that can cause issues in a lot of
situations. E.g. because a later open() will then use that fd (the
lowest unused fd always gets used), and then the next time somebody
wants to write something to stdout, there's normal output interspersed
with some random file.  You'd at the least have to reopen /dev/null into
its place or such.
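
The descriptor-reuse hazard described above can be reproduced in a few lines. The following is a minimal sketch, assuming a POSIX system; the function name fd_after_closing_stdout is made up for illustration and none of this is pg_basebackup code. It closes fd 1, opens an unrelated file, reports which descriptor the kernel handed back, and then restores the real stdout.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Illustrative only: close stdout, open `path`, and return the file
 * descriptor that open() produced. POSIX hands out the lowest unused
 * descriptor, so when fd 0 is in use the result is 1, i.e. the slot
 * stdout just vacated. The original stdout is restored before returning.
 */
int
fd_after_closing_stdout(const char *path)
{
    int         saved;
    int         fd;

    assert(path != NULL);

    saved = dup(STDOUT_FILENO);     /* keep a handle on the real stdout */
    if (saved < 0)
        return -1;

    close(STDOUT_FILENO);
    fd = open(path, O_RDONLY);      /* lands on the lowest free fd */

    /* Undo the experiment: drop the new fd and put stdout back on fd 1. */
    if (fd >= 0 && fd != STDOUT_FILENO)
        close(fd);
    dup2(saved, STDOUT_FILENO);     /* also replaces fd 1 if open() took it */
    close(saved);

    return fd;
}
```

Calling fd_after_closing_stdout("/dev/null") in a process with the usual three standard descriptors open returns 1, so any later write aimed at stdout would have gone into that unrelated file.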

It also seems likely to be a trap for some future feature additions that
want to write another file to stdout or such - in contrast to the normal
files it can't be reopened.


> After a massive maintenance operation, I want to re-seed a streaming
> standby, which I start to do by:
> 
> pg_basebackup -D - -Ft -P -X none  | pxz > base.tar.xz
> 
> But the archiver is way behind, so when it finishes the basebackup part, I
> get:
> 
> NOTICE:  pg_stop_backup cleanup done, waiting for required WAL segments to
> be archived
> WARNING:  pg_stop_backup still waiting for all required WAL segments to be
> archived (60 seconds elapsed)
> ...
> 
> The base backup file is not finalized, because pg_basebackup has not closed
> its stdout while waiting for the WAL segment to be archived. The file is
> incomplete due to data stuck in buffers, so I can't copy it to where I want
> and bring up a new streaming replica (which bypasses the WAL archive, so
> would otherwise work).

That seems more like an argument for sticking a fflush() there, than
closing stdout.


Greetings,

Andres Freund