Re: some longer, larger pgbench tests with various performance-related patches

From: Kevin Grittner
Subject: Re: some longer, larger pgbench tests with various performance-related patches
Msg-id: 4F2C40820200002500044D96@gw.wicourts.gov
List: pgsql-hackers
Robert Haas wrote:
> A couple of things stand out at me from these graphs. First, some
> of these transactions had really long latency. Second, there are a
> remarkable number of seconds in all of the tests during which no
> transactions at all manage to complete, sometimes several seconds
> in a row. I'm not sure why. Third, all of the tests initially start
> off processing transactions very quickly, and get slammed down very
> hard, probably because the very high rate of transaction processing
> early on causes a checkpoint to occur around 200 s.
The amazing performance at the very start of all of these tests
suggests that there is a write-back cache (presumably battery-backed)
which is absorbing writes until the cache becomes full, at which
point actual disk writes become a bottleneck.  The problems you
mention here, where no transactions complete, sound like the usual
problem that many people have complained about on the lists, where
the controller cache becomes so overwhelmed that activity seems to
cease while the controller catches up.  Greg, and to a lesser degree
I, have written about this for years.
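One way to quantify those stalls is to count the seconds in which nothing completed.  The following is only a rough sketch: it assumes the per-transaction log written by pgbench -l, with whitespace-separated fields and the completion time (Unix epoch seconds) in the fifth column.  The function name and the epoch_field parameter are made up for illustration, so check the column position against the pgbench version in use.

```python
from collections import Counter


def stalled_seconds(log_lines, epoch_field=4):
    """Return the whole seconds, between the first and last logged
    transaction, in which no transaction completed.

    Assumption: each line is whitespace-separated and the field at index
    `epoch_field` is the completion time in Unix epoch seconds.
    """
    per_second = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) <= epoch_field:
            continue  # skip blank or truncated lines
        per_second[int(fields[epoch_field])] += 1

    if not per_second:
        return []

    first, last = min(per_second), max(per_second)
    # Any second in the covered range with no entry is a stalled second.
    return [s for s in range(first, last + 1) if s not in per_second]


# Made-up sample lines, not data from the tests discussed here.
sample = [
    "0 1 5200 0 100 1",
    "1 1 6100 0 100 2",
    "0 2 900000 0 103 5",
]
print(stalled_seconds(sample))  # [101, 102]
```

Runs of consecutive values in the result correspond to "several seconds in a row" with no completed transactions.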
On the nofpw graph, I wonder whether the lower write rate just takes
that much longer to fill the controller cache.  I don't think it's
out of the question that it could take 700 seconds instead of 200
seconds, depending on whether full pages are being fsynced to WAL.
This effect is precisely why I think that on such machines the
double-write (DW) feature may be a huge help.  If one small file is
being written to and fsynced repeatedly, it stays "fresh" enough not
to actually be written to the disk (it will stay in OS or controller
cache), and the disks are freed up to write everything else, helping
to keep the controller cache from being overwhelmed.  (Whether the
patches to date are effective at achieving this is a separate
question -- I'm convinced the concept is sound for certain important
workloads.)
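To illustrate the fill-time reasoning for the nofpw graph: if the cache fills at the difference between the incoming write rate and the rate at which the controller drains to disk, then a modest drop in write rate can stretch the fill time a lot.  This is a back-of-the-envelope model only.  The cache size and rates below are invented numbers, picked so that the results land on 200 and 700 seconds; they are not measurements from these tests.

```python
def seconds_to_fill(cache_mb, write_mb_per_s, drain_mb_per_s):
    """Seconds until a write-back cache of `cache_mb` fills, given a
    steady incoming write rate and a steady drain rate to disk.

    Returns None when the drain keeps up and the cache never fills.
    Simplifying assumption: both rates are constant.
    """
    net_rate = write_mb_per_s - drain_mb_per_s
    if net_rate <= 0:
        return None
    return cache_mb / net_rate


# Hypothetical values chosen purely for illustration.
CACHE_MB = 700.0
DRAIN_MB_PER_S = 20.0

with_fpw = seconds_to_fill(CACHE_MB, 23.5, DRAIN_MB_PER_S)
without_fpw = seconds_to_fill(CACHE_MB, 21.0, DRAIN_MB_PER_S)

print(with_fpw)     # 200.0
print(without_fpw)  # 700.0
```

In this toy model the write rate drops by only about 11% (23.5 to 21.0 MB/s), yet the time to fill the cache grows 3.5 times, because what matters is the margin over the drain rate rather than the write rate itself.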
> I didn't actually log when the checkpoints were occurring,
It would be good to have that information if you can get it for
future tests.
-Kevin

