Re: [HACKERS] subscription worker signalling wal writer too much

From Jeff Janes
Subject Re: [HACKERS] subscription worker signalling wal writer too much
Msg-id CAMkU=1xycMxkVV3ccwdxSF+HgJ1d7YwHf4Y52-A+iDJ5Cmg8Cg@mail.gmail.com
In response to Re: [HACKERS] subscription worker signalling wal writer too much  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] subscription worker signalling wal writer too much  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Jun 14, 2017 at 3:20 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-06-14 15:08:49 -0700, Jeff Janes wrote:
> > On Wed, Jun 14, 2017 at 11:55 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> >
> > > If I publish a pgbench workload and subscribe to it, the subscription
> > > worker is signalling the wal writer thousands of times a second, once for
> > > every async commit.  This has a noticeable performance cost.
> > >
> >
> > I've used a local variable to avoid waking up the wal writer more than once
> > for the same page boundary.  This reduces the number of wake-ups by about
> > 7/8.
>
> Maybe I'm missing something here, but isn't that going to reduce our
> guarantees about when asynchronously committed xacts are flushed out?
> You can easily fit a number of commits into the same page... As this
> isn't specific to logical-rep, I don't think that's ok.

The guarantee is based on wal_writer_delay, not on SIGUSR1, so I don't think this changes that. (Also, it isn't really a guarantee: the fsync can take many seconds to complete once we do initiate it, and there is absolutely nothing we can do about that, other than doing the fsync synchronously in the first place.)

The reason for kicking the wal writer at page boundaries is so that hint bits can get set earlier than they otherwise could. But I don't think kicking it multiple times per page boundary can help in that effort.
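
To make the idea concrete, here is a minimal standalone sketch of remembering the last page boundary that triggered a wake-up. It is not the patch itself and not PostgreSQL code: WakeupState, should_wake_walwriter and SKETCH_BLCKSZ are hypothetical names invented for illustration, and the 8192-byte page size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed WAL page size for this sketch; not taken from the thread. */
#define SKETCH_BLCKSZ 8192u

/* Hypothetical per-process state: the last WAL page we already nudged for. */
typedef struct WakeupState
{
    uint64_t      last_signalled_page; /* page number of the last nudge */
    bool          valid;               /* false until the first nudge */
    unsigned long wakeups;             /* nudges actually issued */
} WakeupState;

/*
 * Return true if the caller should nudge the wal writer for an async
 * commit whose record ends at commit_lsn, i.e. only the first time a
 * commit lands on a given WAL page.
 */
static bool
should_wake_walwriter(WakeupState *st, uint64_t commit_lsn)
{
    uint64_t page;

    assert(st != NULL);
    page = commit_lsn / SKETCH_BLCKSZ;

    if (st->valid && page == st->last_signalled_page)
        return false;           /* same page as last time: skip the nudge */

    st->last_signalled_page = page;
    st->valid = true;
    st->wakeups++;
    return true;
}
```

With this shape, several small commits that land on the same page produce a single nudge, and the next nudge happens only once a commit crosses into a new page. The timed flush driven by wal_writer_delay is unaffected, since it does not depend on the nudge.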
 

> Have you chased down why there's that many wakeups?  Normally I'd have
> expected that a number of the SetLatch() calls get consolidated
> together, but I guess walwriter is "too quick" in waking up and
> resetting the latch?

I'll have to dig into that some more. The 7/8 reduction I cited was just in calls to SetLatch from that part of the code. I didn't measure whether SetLatch actually called kill(owner_pid, SIGUSR1) when I determined that reduction, so it wasn't truly wake-ups that I measured. Actual wake-ups were measured only indirectly, via the impact on performance. I'll need to figure out how to instrument that without distorting the performance too much in the process.
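
As a rough illustration of the difference between counting SetLatch calls and counting signals, here is a toy model of a latch with two counters. It is a sketch only, not the PostgreSQL latch implementation: ToyLatch, toy_set_latch and toy_reset_latch are hypothetical names, and the model assumes that a set on an already-set latch returns without signalling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy latch: counts every set call and every set that would signal. */
typedef struct ToyLatch
{
    bool          is_set;
    unsigned long set_calls;     /* all calls to toy_set_latch() */
    unsigned long signals_sent;  /* calls that would have reached kill() */
} ToyLatch;

static void
toy_set_latch(ToyLatch *latch)
{
    assert(latch != NULL);
    latch->set_calls++;

    if (latch->is_set)
        return;                  /* already set: consolidated, no signal */

    latch->is_set = true;
    latch->signals_sent++;       /* stand-in for kill(owner_pid, SIGUSR1) */
}

/* Called by the latch owner after it wakes up and before it does its work. */
static void
toy_reset_latch(ToyLatch *latch)
{
    assert(latch != NULL);
    latch->is_set = false;
}
```

In this model, a burst of sets with no reset in between costs one signal, while an owner that resets after every set turns each call into a signal. Comparing the two counters is one cheap way to see which of those regimes a workload is in.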

Cheers,

Jeff
