Re: Spread checkpoint sync

From Robert Haas
Subject Re: Spread checkpoint sync
Date
Msg-id AANLkTimOe70iC1-DG-cDOt_CR=OxkFeF2ajFdYH=fwQ5@mail.gmail.com
In reply to Re: Spread checkpoint sync  (Itagaki Takahiro <itagaki.takahiro@gmail.com>)
Responses Re: Spread checkpoint sync  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Mon, Jan 31, 2011 at 3:04 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> On Mon, Jan 31, 2011 at 13:41, Robert Haas <robertmhaas@gmail.com> wrote:
>> 1. Absorb fsync requests a lot more often during the sync phase.
>> 2. Still try to run the cleaning scan during the sync phase.
>> 3. Pause for 3 seconds after every fsync.
>>
>> So if we want the checkpoint
>> to finish in, say, 20 minutes, we can't know whether the write phase
>> needs to be finished by minute 10 or 15 or 16 or 19 or only by 19:59.
>
> We probably need deadline-based scheduling, as is already used in the
> write() phase. If we want to sync 100 files in 20 minutes, each file
> should be sync'ed within 12 seconds, assuming each fsync takes the same
> time. If we had a better estimation algorithm (file size? dirty ratio?),
> each fsync could have some weight factor.  But deadline-based scheduling
> would still be needed then.
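
The arithmetic in the quoted paragraph can be sketched as follows. This is
only an illustration, not PostgreSQL code: the function names, the
per-file weights, and the idea of sleeping until the end of each slot are
assumptions made for the example.

```python
def sync_deadlines(weights, budget_seconds):
    """Return the cumulative deadline (seconds from sync start) for each file.

    `weights` is a hypothetical per-file cost estimate (e.g. file size or
    dirty ratio); equal weights reproduce the "same time per fsync" case.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    deadlines = []
    elapsed = 0.0
    for w in weights:
        # Each file gets a share of the budget proportional to its weight.
        elapsed += budget_seconds * w / total
        deadlines.append(elapsed)
    return deadlines


def pause_after_fsync(deadline, now):
    """Seconds to sleep after an fsync that finished at `now`.

    If the fsync overran its slot, do not sleep at all.
    """
    return max(0.0, deadline - now)


# 100 equally weighted files in a 20-minute budget: 12-second slots.
equal = sync_deadlines([1] * 100, 20 * 60)

# Two files where the first is estimated to cost three times as much.
weighted = sync_deadlines([3, 1], 40)
```

With equal weights the first deadline falls at 12 seconds and the last at
1200 seconds. An fsync that finishes early sleeps out the rest of its slot,
and one that runs late simply eats into the slack of the files after it.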

Right.  I think the problem is balancing the write and sync phases.
For example, if your operating system is very aggressively writing out
dirty pages to disk, then you want the write phase to be as long as
possible and the sync phase can be very short because there won't be
much work to do.  But if your operating system is caching lots of
stuff in memory and writing dirty pages out to disk only when
absolutely necessary, then the write phase could be relatively quick
without much hurting anything, but the sync phase will need to be long
to keep from crushing the I/O system.  The trouble is, we don't really
have an a priori way to know which it's doing.  Maybe we could try to
tune based on the behavior of previous checkpoints, but I'm wondering
if we oughtn't to take the cheesy path first and split
checkpoint_completion_target into checkpoint_write_target and
checkpoint_sync_target.  That's another parameter to set, but I'd
rather add a parameter that people have to play with to find the right
value than impose an arbitrary rule that creates unavoidable bad
performance in certain environments.
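
A rough sketch of how such a split might divide the time budget.
checkpoint_write_target and checkpoint_sync_target are only the proposed
names from the paragraph above, not existing settings, and the semantics
used here (each is the fraction of the checkpoint interval by which that
phase should be finished) are an assumption for illustration.

```python
def phase_budgets(interval_seconds, write_target, sync_target):
    """Return (write_budget, sync_budget) in seconds.

    Assumed semantics: writes should finish by write_target * interval,
    and syncs should finish by sync_target * interval.
    """
    if not 0.0 < write_target <= sync_target <= 1.0:
        raise ValueError("need 0 < write_target <= sync_target <= 1")
    write_deadline = interval_seconds * write_target
    sync_deadline = interval_seconds * sync_target
    # The sync phase gets whatever lies between the two deadlines.
    return write_deadline, sync_deadline - write_deadline


# OS writes dirty pages out aggressively: long write phase, short sync phase.
aggressive_os = phase_budgets(20 * 60, 0.75, 1.0)

# OS caches heavily: short write phase, long sync phase.
lazy_os = phase_budgets(20 * 60, 0.25, 1.0)
```

For a 20-minute interval the first setting leaves 15 minutes for writes
and 5 for syncs, and the second reverses that. The right choice depends on
the operating system's writeback behavior, which is why it would have to
be tuned by hand.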

> BTW, we should not sleep in a full-speed checkpoint. The CHECKPOINT
> command, shutdown, pg_start_backup(), and some checkpoints during
> recovery might not want to sleep.

Yeah, I think that's understood.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

