Re: Refactoring the checkpointer's fsync request queue

From: Robert Haas
Subject: Re: Refactoring the checkpointer's fsync request queue
Msg-id: CA+Tgmobtym=iXUV8Lb-ifnxqpfKTiJBPSWuqryutFqUpuqL7AQ@mail.gmail.com
In reply to: Re: Refactoring the checkpointer's fsync request queue  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses: Re: Refactoring the checkpointer's fsync request queue  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers

On Sun, Nov 11, 2018 at 9:59 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> There is one major problem with this patch

If there's only one, you're doing great!  Although admittedly this
seems like a big one...

> 1.  Go back to the current pressure-valve strategy: make the sending
> side perform the fsync(), if it can't immediately write to the pipe.

As you say, this will happen significantly more often without
deduplication.  That deduplication logic got added in response to a
real need.  Before that, you could cause an individual backend to
start doing its own fsyncs() with something as simple as a bulk load.
The queue would absorb most of them, but not all, and the performance
ramifications were noticeable.
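
To make the pressure-valve behaviour concrete, here is a minimal
sketch of a bounded request queue that deduplicates when it fills up
and tells the sender to do its own fsync() only if that frees nothing.
It is a toy model under stated assumptions, not the PostgreSQL code:
RequestQueue, compact_queue(), forward_fsync_request() and the integer
file ids are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4    /* deliberately tiny, for illustration only */

typedef struct
{
    int    file_ids[QUEUE_CAPACITY];    /* hypothetical stand-in for a file tag */
    size_t count;
} RequestQueue;

/* Drop duplicate entries in place; return how many slots were freed. */
static size_t
compact_queue(RequestQueue *q)
{
    size_t kept = 0;

    for (size_t i = 0; i < q->count; i++)
    {
        bool seen = false;

        for (size_t j = 0; j < kept; j++)
        {
            if (q->file_ids[j] == q->file_ids[i])
            {
                seen = true;
                break;
            }
        }
        if (!seen)
            q->file_ids[kept++] = q->file_ids[i];
    }

    size_t freed = q->count - kept;

    q->count = kept;
    return freed;
}

/*
 * Try to hand an fsync request to the queue.  Returns false when the queue
 * is full even after compaction, meaning the caller has to fsync() the
 * file itself (the "pressure valve").
 */
static bool
forward_fsync_request(RequestQueue *q, int file_id)
{
    assert(q->count <= QUEUE_CAPACITY);

    if (q->count == QUEUE_CAPACITY && compact_queue(q) == 0)
        return false;

    q->file_ids[q->count++] = file_id;
    return true;
}
```

With repeated requests for the same file, compaction keeps the sender
out of the fsync() path; once every slot holds a distinct file, the
sender has to pay for the fsync() itself.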

> 2.  Offload the BufferSync() work to bgwriter, so the checkpointer can
> keep draining the pipe.  Communication between checkpointer and
> bgwriter can be fairly easily multiplexed with the pipe draining work.

That sounds a little like you are proposing to go back to the way
things were before 806a2aee3791244bf0f916729bfdb5489936e068 (and,
belatedly, bf405ba8e460051e715d0a91442b579e590328ce) although I guess
the division of labor wouldn't be quite the same.

> 3.  Multiplex the checkpointer's work:  Use LWLockConditionalAcquire()
> when locking buffers, and if that fails, try to drain the pipe, and
> then fall back to a LWLockTimedAcquire(), drain pipe, repeat loop.  I
> can hear you groan already; that doesn't seem particularly elegant,
> and there are portability problems implementing LWLockTimedAcquire():
> semtimedop() and sem_timedwait() are not available on all platforms
> (e.g. macOS).  Maybe pthread_cond_timedwait() could help (!).

You don't really need to invent LWLockTimedAcquire().  You could just
keep retrying LWLockConditionalAcquire() in a delay loop.  I agree
that doesn't seem particularly elegant, though.
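
The retry-in-a-delay-loop idea could look roughly like the sketch
below.  It is self-contained rather than PostgreSQL code: a C11
atomic_flag stands in for the buffer lock, try_acquire() plays the
role of LWLockConditionalAcquire(), and drain_pipe(),
acquire_with_drain(), the delay and the attempt limit are all
hypothetical.  A real loop would presumably not give up after a fixed
number of attempts.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L     /* for nanosleep() under strict -std */
#endif
#include <assert.h>
#include <poll.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Stand-in for a conditional lock acquire: never blocks. */
static bool
try_acquire(atomic_flag *lock)
{
    return !atomic_flag_test_and_set_explicit(lock, memory_order_acquire);
}

static void
release_lock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Read whatever is in the pipe right now without blocking; return bytes read. */
static size_t
drain_pipe(int fd)
{
    char          buf[256];
    size_t        total = 0;
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    while (poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN))
    {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n <= 0)
            break;              /* EOF or error: stop draining */
        total += (size_t) n;
    }
    return total;
}

/*
 * Returns the attempt number on which the lock was obtained, or -1 if it
 * was still busy after max_attempts.  *drained accumulates the number of
 * bytes pulled from the pipe while waiting.
 */
static int
acquire_with_drain(atomic_flag *lock, int pipe_fd, long delay_usec,
                   int max_attempts, size_t *drained)
{
    assert(max_attempts > 0 && delay_usec >= 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++)
    {
        if (try_acquire(lock))
            return attempt;

        /* Lock is busy: do useful work, then back off briefly. */
        *drained += drain_pipe(pipe_fd);

        struct timespec ts = {
            .tv_sec = delay_usec / 1000000,
            .tv_nsec = (delay_usec % 1000000) * 1000
        };

        nanosleep(&ts, NULL);
    }
    return -1;
}
```

The cost of this shape is visible in the loop: the waiter wakes up on a
timer rather than when the lock is released, so the delay trades
latency against wasted wakeups.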

I still feel like this whole pass-the-fds-to-the-checkpointer thing is
a bit of a fool's errand, though.  I mean, there's no guarantee that
the first FD that gets passed to the checkpointer is the first one
opened, or even the first one written, is there?  It seems like if you
wanted to make this work reliably, you'd need to do it the other way
around: have the checkpointer (or some other background process) open
all the FDs, and anybody else who wants to have one open get it from
the checkpointer.
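
For illustration only, the "other way around" arrangement might be
sketched like this: one owning process opens the file, keeps its own
descriptor, and hands a duplicate to whoever asked, using SCM_RIGHTS
over a Unix-domain socket.  This shows the mechanism, not a proposed
design; open_and_share(), send_fd(), recv_fd() and FdControl are
hypothetical names, and the request protocol, error handling and
descriptor bookkeeping are all left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
typedef union
{
    struct cmsghdr align;
    char           buf[CMSG_SPACE(sizeof(int))];
} FdControl;

/* Send one descriptor over a Unix-domain socket; 0 on success, -1 on error. */
static int
send_fd(int sock, int fd)
{
    char          byte = 'F';   /* one data byte travels with the descriptor */
    struct iovec  iov = { .iov_base = &byte, .iov_len = 1 };
    FdControl     ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns it, or -1 on error. */
static int
recv_fd(int sock)
{
    char          byte;
    struct iovec  iov = { .iov_base = &byte, .iov_len = 1 };
    FdControl     ctrl;
    struct msghdr msg;
    int           fd;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/*
 * Owner side: open the file first, send a duplicate to the requester, and
 * keep our own descriptor.  Returns the owner's descriptor, or -1 on error.
 */
static int
open_and_share(int sock, const char *path, int flags)
{
    assert(path != NULL);

    int fd = open(path, flags);

    if (fd < 0)
        return -1;
    if (send_fd(sock, fd) != 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}
```

Because the owner's open() always happens before any requester can
write through its copy, the ordering question raised above goes away
in this arrangement, at the price of a round trip for every open.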

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

