Re:Re:Re: [BUG] standby node can not provide service even it replays all log files

From: Thunder
Subject: Re:Re:Re: [BUG] standby node can not provide service even it replays all log files
Msg-id 78a38648.8cd4.16e12a5e053.Coremail.thunder1@126.com
In reply to: Re:Re: [BUG] standby node can not provide service even it replays all log files (Thunder <thunder1@126.com>)
List: pgsql-hackers
Hi,
In our usage scenario, the standby node can be OOM-killed, and we then have to create a new standby node.
If the master node has a long-running uncommitted transaction, the new standby node cannot provide service.
So for us this is a critical issue.

I would appreciate any suggestions on this issue.
Could anyone help review the attached patch?
Thanks.





At 2019-10-22 20:42:21, "Thunder" <thunder1@126.com> wrote:
Updated the patch.
1. The STANDBY_SNAPSHOT_PENDING state is set when we replay the first XLOG_RUNNING_XACTS record and the subtransaction IDs have overflowed.
2. When we log XLOG_RUNNING_XACTS on the master node, can we assume that all transaction IDs < oldestRunningXid are finished?
3. If we can assume this, then when we replay XLOG_RUNNING_XACTS and change standbyState to STANDBY_SNAPSHOT_PENDING, can we record oldestRunningXid in a shared variable, such as procArray->oldest_running_xid?
4. On the standby node, when GetSnapshotData is called and procArray->oldest_running_xid is valid, can we set xmin to procArray->oldest_running_xid?

I would appreciate any suggestions on this issue.



At 2019-10-22 01:27:58, "Robert Haas" <robertmhaas@gmail.com> wrote:
>On Mon, Oct 21, 2019 at 4:13 AM Thunder <thunder1@126.com> wrote:
>> Can we fix this issue like the following patch?
>>
>> $git diff src/backend/access/transam/xlog.c
>> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
>> index 49ae97d4459..0fbdf6fd64a 100644
>> --- a/src/backend/access/transam/xlog.c
>> +++ b/src/backend/access/transam/xlog.c
>> @@ -8365,7 +8365,7 @@ CheckRecoveryConsistency(void)
>> * run? If so, we can tell postmaster that the database is consistent now,
>> * enabling connections.
>> */
>> - if (standbyState == STANDBY_SNAPSHOT_READY &&
>> + if ((standbyState == STANDBY_SNAPSHOT_READY || standbyState == STANDBY_SNAPSHOT_PENDING) &&
>> !LocalHotStandbyActive &&
>> reachedConsistency &&
>> IsUnderPostmaster)
>
>I think that the issue you've encountered is design behavior. In
>other words, it's intended to work that way.
>
>The comments for the code you propose to change say that we can allow
>connections once we've got a valid snapshot. So presumably the effect
>of your change would be to allow connections even though we don't have
>a valid snapshot.
>
>That seems bad.
>
>--
>Robert Haas
>EnterpriseDB: http://www.enterprisedb.com
>The Enterprise PostgreSQL Company


 



 
