Re: [HACKERS] subscription worker signalling wal writer too much

From Andres Freund
Subject Re: [HACKERS] subscription worker signalling wal writer too much
Msg-id 20170614232922.igl2qhdeqdp77niq@alap3.anarazel.de
In response to Re: [HACKERS] subscription worker signalling wal writer too much  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: [HACKERS] subscription worker signalling wal writer too much  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On 2017-06-14 16:24:27 -0700, Jeff Janes wrote:
> On Wed, Jun 14, 2017 at 3:20 PM, Andres Freund <andres@anarazel.de> wrote:
> 
> > On 2017-06-14 15:08:49 -0700, Jeff Janes wrote:
> > > On Wed, Jun 14, 2017 at 11:55 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> > >
> > > > If I publish a pgbench workload and subscribe to it, the subscription
> > > > worker is signalling the wal writer thousands of times a second, once for
> > > > every async commit.  This has a noticeable performance cost.
> > > >
> > >
> > > I've used a local variable to avoid waking up the wal writer more than once
> > > for the same page boundary.  This reduces the number of wake-ups by about
> > > 7/8.
> >
> > Maybe I'm missing something here, but isn't that going to reduce our
> > guarantees about when asynchronously committed xacts are flushed out?
> > You can easily fit a number of commits into the same page...   As this
> > isn't specific to logical-rep, I don't think that's ok.
> >
> 
> The guarantee is based on wal_writer_delay not on SIGUSR1, so I don't think
> this changes that. (Also, it isn't really a guarantee, the fsync can take
> many seconds to complete once we do initiate it, and there is absolutely
> nothing we can do about that, other than do the fsync synchronously in the
> first place).

Well, wal_writer_delay doesn't work if walwriter is in sleep mode, and
this afaics would allow wal writer to go into sleep mode with half a
page filled, and it'd not be woken up again until the page is filled.
No?
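
To make the concern concrete, here is a toy model, not PostgreSQL code.
Everything in it is an assumption for illustration: the names, the 8192-byte
page size, the fixed commit record size, and the simplification that a woken
wal writer flushes everything written so far and then immediately goes back
to sleep. It wakes the writer only when a commit lands on a page it has not
yet been woken for, and reports how much WAL is left unflushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 8192      /* assumed WAL page size for this model */

typedef struct {
    bool     have_woken;        /* have we woken the writer at all yet? */
    uint64_t last_woken_page;   /* page we last woke the writer for */
    unsigned wakeups;           /* number of wake-ups issued */
} ToyWaker;

/* Wake the (hypothetical) writer only once per page boundary. */
static bool toy_maybe_wake(ToyWaker *w, uint64_t lsn)
{
    uint64_t page = lsn / TOY_PAGE_SIZE;

    if (w->have_woken && page == w->last_woken_page)
        return false;           /* same page as last time: stay quiet */
    w->have_woken = true;
    w->last_woken_page = page;
    w->wakeups++;
    return true;
}

/*
 * Run ncommits async commits of rec_size bytes each. Returns the number
 * of wake-ups and stores the bytes still unflushed in *unflushed.
 * Assumption: the writer sleeps again right after each flush.
 */
static unsigned toy_simulate(unsigned ncommits, uint64_t rec_size,
                             uint64_t *unflushed)
{
    ToyWaker w = {false, 0, 0};
    uint64_t lsn = 0;
    uint64_t flushed = 0;

    assert(rec_size > 0 && unflushed != NULL);
    for (unsigned i = 0; i < ncommits; i++) {
        lsn += rec_size;        /* commit record appended to WAL */
        if (toy_maybe_wake(&w, lsn))
            flushed = lsn;      /* woken writer flushes up to here */
    }
    *unflushed = lsn - flushed;
    return w.wakeups;
}
```

With 12 commits of 1024 bytes, this model issues 2 wake-ups and ends with
4096 bytes (four commits) unflushed. In the model they stay that way until
some later commit crosses into the next page.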


> > Have you chased down why there's that many wakeups?  Normally I'd have
> > expected that a number of the SetLatch() calls get consolidated
> > together, but I guess walwriter is "too quick" in waking up and
> > resetting the latch?

> I'll have to dig into that some more.  The 7/8 reduction I cited was just
> in calls to SetLatch from that part of the code, I didn't measure whether
> the SetLatch actually called kill(owner_pid, SIGUSR1) or not when I
> determined that reduction, so it wasn't truly wake ups I measured.  Actual
> wake ups were measured only indirectly via the impact on performance.  I'll
> need to figure out how to instrument that without distorting the
> performance too much in the process..

I'd suspect that just measuring the number of kill() calls should be
doable, if measured via perf or something like that.
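
The difference between SetLatch() calls and actual kill() calls can be
sketched with a toy latch. This is a simplified model, not the real latch
code, and all names in it are made up. It only mirrors the basic idea that
setting an already-set latch returns early without signalling, so calls are
consolidated unless the waiter resets the latch in between. The reset_every
parameter is a hypothetical stand-in for how quickly the waiter wakes up and
resets.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool     is_set;
    unsigned set_calls;     /* how often toy_set_latch() was called */
    unsigned signals;       /* how often we would have called kill() */
} ToyLatch;

static void toy_set_latch(ToyLatch *l)
{
    l->set_calls++;
    if (l->is_set)
        return;             /* already set: no signal, calls consolidate */
    l->is_set = true;
    l->signals++;           /* stands in for kill(owner_pid, SIGUSR1) */
}

static void toy_reset_latch(ToyLatch *l)
{
    l->is_set = false;
}

/*
 * Issue `commits` set calls; the waiter resets the latch after every
 * `reset_every` of them. Returns the number of signals sent.
 */
static unsigned toy_count_signals(unsigned commits, unsigned reset_every)
{
    ToyLatch l = {false, 0, 0};

    assert(reset_every > 0);
    for (unsigned i = 1; i <= commits; i++) {
        toy_set_latch(&l);
        if (i % reset_every == 0)
            toy_reset_latch(&l);
    }
    return l.signals;
}
```

If the waiter resets after every single set (reset_every = 1), 1000 set
calls produce 1000 signals. If it only gets around to resetting after every
8th, the same 1000 calls produce 125. Which regime walwriter is actually in
is exactly what counting kill() would tell us.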


Greetings,

Andres Freund


