Re: Syncrep and improving latency due to WAL throttling

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Syncrep and improving latency due to WAL throttling
Дата
Msg-id 20230126154912.okzh7psxox6aphdm@awork3.anarazel.de
обсуждение исходный текст
Ответ на Re: Syncrep and improving latency due to WAL throttling  (Jakub Wartak <jakub.wartak@enterprisedb.com>)
Ответы Re: Syncrep and improving latency due to WAL throttling  (Jakub Wartak <jakub.wartak@enterprisedb.com>)
Список pgsql-hackers
Hi,

On 2023-01-26 14:40:56 +0100, Jakub Wartak wrote:
> In summary:  Attached is a slightly reworked version of this patch.
> 1. Moved logic outside XLogInsertRecord() under ProcessInterrupts()
> 2. Flushes up to the last page boundary, still uses SyncRepWaitForLSN()
> 3. Removed GUC for now (always on->256kB); potentially to be restored

Huh? Why did you remove the GUC? Clearly this isn't something we can default
to on.


> diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
> index d85e313908..05d56d65f9 100644
> --- a/src/backend/access/transam/xact.c
> +++ b/src/backend/access/transam/xact.c
> @@ -2395,6 +2395,7 @@ CommitTransaction(void)
>  
>      XactTopFullTransactionId = InvalidFullTransactionId;
>      nParallelCurrentXids = 0;
> +    backendWalInserted = 0;
>  
>      /*
>       * done with commit processing, set current transaction state back to

I don't like the resets around like this. Why not just move it into
XLogFlush()?



> +/*
> + * Called from ProcessMessageInterrupts() to avoid waiting while being in critical section
> + */ 
> +void HandleXLogDelayPending()
> +{
> +    /* flush only up to the last fully filled page */
> +    XLogRecPtr     LastFullyWrittenXLogPage = XactLastRecEnd - (XactLastRecEnd % XLOG_BLCKSZ);
> +    XLogDelayPending = false;

Hm - wonder if it'd be better to determine the LSN to wait for in
XLogInsertRecord(). There'll be plenty cases where we'll insert multiple (but
bounded number of) WAL records before processing interrupts. No need to flush
more aggressively than necessary...


> +    //HOLD_INTERRUPTS();
> +
> +    /* XXX Debug for now */
> +    elog(WARNING, "throttling WAL down on this session (backendWalInserted=%d, LSN=%X/%X flushingTo=%X/%X)", 
> +        backendWalInserted, 
> +        LSN_FORMAT_ARGS(XactLastRecEnd),
> +        LSN_FORMAT_ARGS(LastFullyWrittenXLogPage));
> +
> +    /* XXX: refactor SyncRepWaitForLSN() to have different waitevent than default WAIT_EVENT_SYNC_REP  */
> +    /* maybe new WAIT_EVENT_SYNC_REP_BIG or something like that */
> +    XLogFlush(LastFullyWrittenXLogPage);
> +    SyncRepWaitForLSN(LastFullyWrittenXLogPage, false);

SyncRepWaitForLSN() has this comment:
 * 'lsn' represents the LSN to wait for.  'commit' indicates whether this LSN
 * represents a commit record.  If it doesn't, then we wait only for the WAL
 * to be flushed if synchronous_commit is set to the higher level of
 * remote_apply, because only commit records provide apply feedback.


> +    elog(WARNING, "throttling WAL down on this session - end");
> +    backendWalInserted = 0;
> +
> +    //RESUME_INTERRUPTS();
> +}

I think we'd want a distinct wait event for this.



> diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
> index 1b1d814254..8ed66b2eae 100644
> --- a/src/backend/utils/init/globals.c
> +++ b/src/backend/utils/init/globals.c
> @@ -37,6 +37,7 @@ volatile sig_atomic_t IdleSessionTimeoutPending = false;
>  volatile sig_atomic_t ProcSignalBarrierPending = false;
>  volatile sig_atomic_t LogMemoryContextPending = false;
>  volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
> +volatile sig_atomic_t XLogDelayPending = false;
>  volatile uint32 InterruptHoldoffCount = 0;
>  volatile uint32 QueryCancelHoldoffCount = 0;
>  volatile uint32 CritSectionCount = 0;

I don't think the new variable needs to be volatile, or even
sig_atomic_t. We'll just manipulate it from outside signal handlers.


Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Syncrep and improving latency due to WAL throttling
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Syncrep and improving latency due to WAL throttling