Re: [BUGS] pg_receivexlog creating holes in WAL (HEAD only)

Поиск
Список
Период
Сортировка
От Magnus Hagander
Тема Re: [BUGS] pg_receivexlog creating holes in WAL (HEAD only)
Дата
Msg-id CABUevExh2p6j=uzfMRUvFdm59dDKNcMmM=3H1rknau8Uwa4dig@mail.gmail.com
обсуждение исходный текст
Ответ на [BUGS] pg_receivexlog creating holes in WAL (HEAD only)  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: [BUGS] pg_receivexlog creating holes in WAL (HEAD only)  (Michael Paquier <michael.paquier@gmail.com>)
Список pgsql-bugs


On Tue, Dec 27, 2016 at 4:10 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
Hi all,

On HEAD, I am seeing the following thing with pg_receivexlog.
1) Run it, server is still on segment 1/0/1:
$ pg_receivexlog -D ~/data/logs/ --verbose
pg_receivexlog: starting log streaming at 0/1000000 (timeline 1)
$ psql -X -c 'select pg_current_xlog_location()'
 pg_current_xlog_location
--------------------------
 0/1510840
(1 row)
2) Only a partial segment is being written:
$ ls ~/data/logs/
000000010000000000000001.partial
3) Kill the PG instance, pg_receivexlog rightfully complains:
pg_receivexlog: could not receive data from WAL stream: server closed
the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
pg_receivexlog: disconnected; waiting 5 seconds to try again
2016-12-27 11:35:02 JST [42342]: [1-1] db=,user=,app=,client= DEBUG:
logger shutting down
pg_receivexlog: could not connect to server: could not connect to
server: Connection refused
    Is the server running locally and accepting
    connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
4) Looking at the logs directory, the .partial segment went away:
$ ls logs/
000000010000000000000001
5) Restart the PG instance, pg_receivexlog complains, again:
pg_receivexlog: starting log streaming at 0/2000000 (timeline 1)
pg_receivexlog: unexpected termination of replication stream: ERROR:
requested starting point 0/2000000 is ahead of the WAL flush position
of this server 0/15108B0
pg_receivexlog: disconnected; waiting 5 seconds to try again

Oops. Once the server switches to segment 1/0/2, it will then work
again, but any content from the moment where the server has been
killed to the moment it has reconnected has been lost. By that I mean
the end of segment 1/0/1.

The issue here is at step 4). When a connection failure happens,
pg_receivexlog renames the .partial segment to be a completed segment.
So at its next loop it considers the segment as sane, but it should be
considered as corrupted, and it should not use it. On back-branches,
9.6 included, the .partial file is kept after a connection failure and
streaming begins from the beginning of the partial segment, the last
.partial file being overwritten. In short, this is an oversight from
56c7d8d, this bit to be precise:
@@ -777,10 +679,10 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
    }

 error:
-   if (walfile != -1 && close(walfile) != 0)
+   if (walfile != NULL && stream->walmethod->close(walfile, CLOSE_NORMAL) != 0)
        fprintf(stderr, _("%s: could not close file \"%s\": %s\n"),
-               progname, current_walfile_name, strerror(errno));
-   walfile = -1;
+         progname, current_walfile_name, stream->walmethod->getlasterror());
+   walfile = NULL;

The problem is that when a temporary suffix is defined, CLOSE_NORMAL
will rename the existing segment (this is not close_walfile()!).
Attached is a patch to fix the problem, which is to just use
CLOSE_NO_RENAME instead of CLOSE_NORMAL.

Hi!

Yeah, agreed, this seems like a pure bug in the walmethod patch. Probably caused by copy/paste-replacing that interface way too many times during my development.

Good spot, thanks! Applied. 


--

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [BUGS] [BUG] pg9.4.10 Logical decoding did not get the correct oldtuplelen
Следующее
От: email@piotr-stefaniak.me
Дата:
Сообщение: [BUGS] BUG #14479: Casting non-constant values to INTERVAL DAY TO MINUTEdoesn't truncate the seconds field