Re: postgresql latency & bgwriter not doing its job

From: Andres Freund
Subject: Re: postgresql latency & bgwriter not doing its job
Date:
Msg-id: 20140830184558.GA31166@awork2.anarazel.de
In reply to: Re: postgresql latency & bgwriter not doing its job  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: postgresql latency & bgwriter not doing its job  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List: pgsql-hackers
On 2014-08-30 14:16:10 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-08-30 13:50:40 -0400, Tom Lane wrote:
> >> A possible compromise is to sort a limited number of
> >> buffers --- say, collect a few thousand dirty buffers, then sort, dump and
> >> fsync them, repeat as needed.
> 
> > Yea, that's what I suggested nearby. But I don't really like it, because
> > it robs us of the chance to fsync() a relfilenode immediately after
> > having synced all its buffers.
> 
> Uh, how so exactly?  You could still do that.  Yeah, you might fsync a rel
> once per sort-group and not just once per checkpoint, but it's not clear
> that that's a loss as long as the group size isn't tiny.

Because it would no longer have the benefit of syncing the minimal
amount of data. If lots of other relfilenodes have been synced in
between, the amount of newly dirtied pages in the OS's buffer cache
(written by backends and the bgwriter) for an individual relfilenode is
much higher.
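
To make concrete what I mean, here's a minimal sketch of the write loop
I have in mind (the DirtyBuf layout and the write_buffer() /
fsync_relfilenode() helpers are made up for illustration, not the real
bufmgr API): sort all dirty buffers once by (relfilenode, block), and
fsync each relfilenode the moment its last buffer has been written,
before anything else has had much chance to re-dirty that file:

/* illustration only -- not the actual checkpointer code */
#include <stdlib.h>

typedef struct DirtyBuf
{
    unsigned    relfilenode;    /* file the page belongs to */
    unsigned    blockno;        /* block within that file */
} DirtyBuf;

static int
dirtybuf_cmp(const void *a, const void *b)
{
    const DirtyBuf *da = a;
    const DirtyBuf *db = b;

    if (da->relfilenode != db->relfilenode)
        return da->relfilenode < db->relfilenode ? -1 : 1;
    if (da->blockno != db->blockno)
        return da->blockno < db->blockno ? -1 : 1;
    return 0;
}

/* stand-ins for the real buffer write and file sync calls */
extern void write_buffer(const DirtyBuf *buf);
extern void fsync_relfilenode(unsigned relfilenode);

static void
checkpoint_write_sorted(DirtyBuf *bufs, size_t nbufs)
{
    qsort(bufs, nbufs, sizeof(DirtyBuf), dirtybuf_cmp);

    for (size_t i = 0; i < nbufs; i++)
    {
        write_buffer(&bufs[i]);

        /* last buffer of this relfilenode: sync it immediately */
        if (i + 1 == nbufs || bufs[i + 1].relfilenode != bufs[i].relfilenode)
            fsync_relfilenode(bufs[i].relfilenode);
    }
}

Sorting in limited batches breaks exactly that property: the same
relfilenode can show up in several batches, and each batch's fsync()
then also has to flush whatever got dirtied since the previous one.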

An fsync() on a file with dirty data often causes *serious* latency
spikes; we should try hard to avoid superfluous calls.

As an example: calling fsync() on pgbench_accounts's underlying files,
from outside postgres, *before* postgres has even started its first
checkpoint, does this:
progress: 72.0 s, 4324.9 tps, lat 41.481 ms stddev 40.567
progress: 73.0 s, 4704.9 tps, lat 38.465 ms stddev 35.436
progress: 74.0 s, 4448.5 tps, lat 40.058 ms stddev 32.634
progress: 75.0 s, 4634.5 tps, lat 39.229 ms stddev 33.463
progress: 76.8 s, 2753.1 tps, lat 48.693 ms stddev 75.309
progress: 77.1 s, 126.6 tps, lat 773.433 ms stddev 222.667
progress: 78.0 s, 183.7 tps, lat 786.401 ms stddev 395.954
progress: 79.1 s, 170.3 tps, lat 975.949 ms stddev 596.751
progress: 80.0 s, 2116.6 tps, lat 168.608 ms stddev 398.933
progress: 81.0 s, 4436.1 tps, lat 40.313 ms stddev 34.198
progress: 82.0 s, 4383.9 tps, lat 41.811 ms stddev 37.241

Note the dip from 4k tps to 130 tps.
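
For reference, the external fsync above can be provoked with something
as trivial as the following standalone program, run against the
relation's segment files (base/<dboid>/<relfilenode>*) while pgbench is
going -- a hypothetical reproducer written for this example, not
anything from the tree:

/* fsync every file named on the command line, e.g.
 *   ./fsyncfiles $PGDATA/base/<dboid>/<relfilenode>*
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
    {
        int         fd = open(argv[i], O_RDONLY);

        if (fd < 0)
        {
            perror(argv[i]);
            continue;
        }
        if (fsync(fd) < 0)
            perror(argv[i]);
        close(fd);
    }
    return 0;
}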

We can get a handle on that (on some platforms at least) for writes
issued during the buffer sync by forcing the kernel to write out the
pages in small increments; but I doubt we want to do that for writes by
backends themselves.
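
On Linux, one way to issue those incremental writebacks is
sync_file_range(SYNC_FILE_RANGE_WRITE). A minimal sketch, assuming an
arbitrary 256kB flush interval (the constant and the helper are purely
illustrative):

/* after every FLUSH_CHUNK bytes written to fd, start asynchronous
 * writeback of that range so the final fsync() has little left to do;
 * Linux-specific. */
#define _GNU_SOURCE
#include <fcntl.h>

#define FLUSH_CHUNK (256 * 1024)    /* example value only */

static void
flush_in_increments(int fd, off_t written, off_t *last_flushed)
{
    if (written - *last_flushed >= FLUSH_CHUNK)
    {
        (void) sync_file_range(fd, *last_flushed,
                               written - *last_flushed,
                               SYNC_FILE_RANGE_WRITE);
        *last_flushed = written;
    }
}

SYNC_FILE_RANGE_WRITE only initiates writeback and doesn't wait for it,
so the cost per call stays small; the point is to keep the amount of
dirty data the eventual fsync() has to flush bounded.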

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


