Help with streaming replication protocol

From: Christopher Bottaro
Subject: Help with streaming replication protocol
Msg-id: DM6PR22MB205937A29DEF55A4D7205B8AE8B20@DM6PR22MB2059.namprd22.prod.outlook.com
List: pgsql-general

Hello,

So at a high level, I understand that Postgres will send messages (XLogData) and the receiver needs to ack these messages so Postgres knows it's OK to delete data from disk. I don't understand some details of the protocol, though.

Working off the documentation here:

(Side note, it's super annoying that the documentation doesn't name these fields, so I'm going to name them here to make things easier to talk about.)

An XLogData message has two interesting fields:
```
message_wal_start) "The starting point of the WAL data in this message."
server_wal_end) "The current end of WAL on the server."
```
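
For reference, a minimal sketch of where these two fields sit on the wire, assuming the XLogData layout given in the protocol documentation (a `w` tag byte, three big-endian 64-bit integers, then the WAL data) and assuming `payload` is the body of a single CopyData message. The names `parse_xlogdata`, `XLogData`, `server_clock_us`, and `format_lsn` are made up for illustration; the two WAL field names are the ones introduced above.

```python
import struct
from typing import NamedTuple


class XLogData(NamedTuple):
    message_wal_start: int  # "The starting point of the WAL data in this message."
    server_wal_end: int     # "The current end of WAL on the server."
    server_clock_us: int    # server clock, microseconds since 2000-01-01 (name is an assumption)
    wal_data: bytes         # everything after the fixed-size header


# '!' = network byte order, no padding: 1 + 8 + 8 + 8 = 25 header bytes.
_XLOGDATA_HEADER = struct.Struct("!cQQq")


def parse_xlogdata(payload: bytes) -> XLogData:
    """Split one XLogData payload into its header fields and WAL data."""
    tag, wal_start, wal_end, clock = _XLOGDATA_HEADER.unpack_from(payload)
    if tag != b"w":
        raise ValueError(f"not an XLogData message: tag={tag!r}")
    return XLogData(wal_start, wal_end, clock, payload[_XLOGDATA_HEADER.size:])


def format_lsn(lsn: int) -> str:
    """Render a 64-bit WAL position in the usual 'high/low' hex form."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"
```

Printing both positions with `format_lsn` makes it easier to compare them against the `restart_lsn` column of the replication slot.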

Which one do I care about? It seems like message_wal_start, and I don't even get the point of server_wal_end.

When sending a "Standby status update" (which is presumably the "ack" message), I don't understand why there are *three* fields regarding what part of the WAL I've seen:
```
wal_written) "The location of the last WAL byte + 1 received and written to disk in the standby."
wal_flushed) "The location of the last WAL byte + 1 flushed to disk in the standby."
wal_applied) "The location of the last WAL byte + 1 applied in the standby."
```
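
The byte layout of this message can be sketched as follows, assuming the documented format: an `r` tag byte, the three positions, the client clock in microseconds since 2000-01-01, and a one-byte "reply requested" flag. This only shows how the fields are packed; which WAL position belongs in each field is exactly the open question. `build_standby_status_update` and its parameter names are hypothetical.

```python
import struct
import time
from typing import Optional

# Seconds between the Unix epoch (1970-01-01) and 2000-01-01 00:00:00 UTC.
PG_EPOCH_OFFSET_S = 946_684_800


def build_standby_status_update(
    wal_written: int,
    wal_flushed: int,
    wal_applied: int,
    reply_requested: bool = False,
    now_us: Optional[int] = None,
) -> bytes:
    """Pack a Standby status update payload (to be sent inside CopyData)."""
    if now_us is None:
        # Client clock expressed as microseconds since 2000-01-01.
        now_us = int((time.time() - PG_EPOCH_OFFSET_S) * 1_000_000)
    return struct.pack(
        "!cQQQqB",                 # network byte order, no padding: 34 bytes
        b"r",
        wal_written,
        wal_flushed,
        wal_applied,
        now_us,
        1 if reply_requested else 0,
    )
```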

It seems like I should be setting all three of them to the last message_wal_start that I've seen. If I look at sendFeedback() in pg_recvlogical.c, it doesn't even set wal_applied. Also, it doesn't do the +1 addition that the documentation says to do.

From some experimentation, I found that if I set all three fields to the last seen message_wal_start, then the replication slot's restart_lsn field will not advance past the last XLogData message that I've seen, so if I restart my program, I will get that last XLogData message again (a duplicate).

To further confuse things, the Postgres server will periodically send a Keepalive message which has:
```
server_wal_end) "The current end of WAL on the server."
```
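
A minimal sketch of reading the keepalive, assuming the documented layout: a `k` tag byte, the server's WAL end, the server clock, and a one-byte flag where 1 means the server wants a reply as soon as possible. `parse_keepalive`, `Keepalive`, `server_clock_us`, and `reply_requested` are names invented for this example.

```python
import struct
from typing import NamedTuple


class Keepalive(NamedTuple):
    server_wal_end: int     # "The current end of WAL on the server."
    server_clock_us: int    # microseconds since 2000-01-01 (name is an assumption)
    reply_requested: bool   # True when the flag byte is 1


# Network byte order, no padding: 1 + 8 + 8 + 1 = 18 bytes.
_KEEPALIVE = struct.Struct("!cQqB")


def parse_keepalive(payload: bytes) -> Keepalive:
    """Decode one primary keepalive payload taken from a CopyData message."""
    tag, wal_end, clock, reply = _KEEPALIVE.unpack_from(payload)
    if tag != b"k":
        raise ValueError(f"not a keepalive message: tag={tag!r}")
    return Keepalive(wal_end, clock, reply == 1)
```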

And it seems I need to send this back via a "Standby status update" message; otherwise, the replication slot's restart_lsn doesn't advance.

So I guess it boils down to three questions:
1) What should I care about in the XLogData messages? Which wal position?
2) What should I be sending in the status update messages?
3) Should I be doing anything with the server_wal_end in the keepalive messages?

Thank you for the help. If I get to the point of understanding this well enough, I wouldn't mind adding it to the PostgreSQL wiki.
