Re: Checkpoint Tuning Question

From	Tom Lane
Subject	Re: Checkpoint Tuning Question
Date	12 July 2009, 17:10:27
Msg-id	2479.1247418610@sss.pgh.pa.us
In reply to	Re: Checkpoint Tuning Question (Simon Riggs <simon@2ndQuadrant.com>)
Responses	Re: Checkpoint Tuning Question; Re: Checkpoint Tuning Question
List	pgsql-general

Simon Riggs <simon@2ndQuadrant.com> writes:
> This causes us to queue for the WALInsertLock twice at exactly the time
> when every caller needs to calculate the CRC for complete blocks. So we
> queue twice when the lock-hold-time is consistently high, causing queue
> lengths to go ballistic.

You keep saying that, and it keeps not being true, because the CRC
calculation is *not* done while holding the lock.

It is true that the very first XLogInsert call in each backend after
a checkpoint starts will have to go back and redo its CRC calculation,
but that's a one-time waste of CPU.  It's hard to see how it could have
continuing effects over several seconds, especially in a system that
has CPU to spare.

What I think might be the cause is that just after a checkpoint starts,
quite a large proportion of XLogInserts will include full-page buffer
copies, thus leading to an overall higher rate of WAL creation.  That
means longer hold times for WALInsertLock due to spending more time
copying data into the WAL buffers, and it also means more WAL that has
to be synced to disk before a transaction can commit.  I'm still
convinced that Dan's problem ultimately comes down to inadequate disk
bandwidth, so I think the latter point is probably the key.
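
To put rough numbers on the WAL-volume argument, here is a back-of-the-envelope sketch. It is not a measurement: the 100-byte record size and the two full-page fractions are made-up illustrative values, `avg_wal_bytes` is a hypothetical helper, and the model ignores details such as record headers and any space saved by skipping unused parts of a page. The only grounded figure is the default 8192-byte page size.

```python
# Simplified model: average WAL bytes written per XLogInsert when some
# fraction of inserts also carry a full-page image of the modified buffer.
BLOCK_SIZE = 8192  # default PostgreSQL page size in bytes


def avg_wal_bytes(record_bytes, full_page_fraction, block_size=BLOCK_SIZE):
    """Average WAL bytes per insert; a fraction of inserts also copy a whole page."""
    return record_bytes + full_page_fraction * block_size


# Illustrative (assumed) workloads: few full-page images in steady state,
# many just after a checkpoint starts, when most pages are touched for the
# first time since the checkpoint.
steady = avg_wal_bytes(record_bytes=100, full_page_fraction=0.02)
after_checkpoint = avg_wal_bytes(record_bytes=100, full_page_fraction=0.50)

print(f"steady state:     {steady:.0f} bytes/insert")
print(f"after checkpoint: {after_checkpoint:.0f} bytes/insert")
print(f"amplification:    {after_checkpoint / steady:.1f}x")
```

Under these assumed inputs the per-insert WAL volume grows by roughly an order of magnitude, and all of it has to be copied into the WAL buffers and synced before commits can complete, which is why limited disk bandwidth would show up right after a checkpoint begins.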

So this thought leads to a couple of other things Dan could test.
First, see if turning off full_page_writes makes the hiccup go away.
If so, we know the problem is in this area (though still not exactly
which of the two effects is responsible); if not, we need another idea.
Turning it off is not a good permanent fix though, since it reduces
crash safety.  The other knobs to
experiment with are synchronous_commit and wal_sync_method.  If the
stalls are due to commits waiting for additional xlog to get written,
then async commit should stop them.  I'm not sure if changing
wal_sync_method can help, but it'd be worth experimenting with.
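
The experiments suggested above are easiest to interpret when only one setting changes per run. The following sketch prints one candidate set of postgresql.conf overrides per experiment. The parameter names are the real ones from the message; `BASELINE`, `EXPERIMENTS`, and `render_conf` are hypothetical names for illustration, and the two wal_sync_method values shown are only examples, since the available methods and the default depend on the platform.

```python
# Baseline uses the PostgreSQL defaults for the two boolean settings.
# wal_sync_method is left unset because its default is platform-dependent.
BASELINE = {"full_page_writes": "on", "synchronous_commit": "on"}

# One experiment per knob: change a single setting, rerun the workload,
# and check whether the post-checkpoint stall is still there.
EXPERIMENTS = [
    ("full_page_writes off (diagnostic only: reduces crash safety)",
     {"full_page_writes": "off"}),
    ("asynchronous commit", {"synchronous_commit": "off"}),
    ("alternative sync method (example value)", {"wal_sync_method": "fdatasync"}),
    ("alternative sync method (example value)", {"wal_sync_method": "open_sync"}),
]


def render_conf(overrides):
    """Return postgresql.conf-style lines for the baseline plus one set of overrides."""
    settings = dict(BASELINE)
    settings.update(overrides)
    return "\n".join(f"{name} = {value}" for name, value in sorted(settings.items()))


for label, overrides in EXPERIMENTS:
    print(f"# {label}")
    print(render_conf(overrides))
    print()
```

Changing one knob at a time keeps the result readable: if the stall disappears only with full_page_writes off, the extra WAL volume is implicated; if it disappears with synchronous_commit off, commits waiting on WAL flushes are the likely cause.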

            regards, tom lane
