Re: Slow Query / Check Point Segments

From: Greg Smith
Subject: Re: Slow Query / Check Point Segments
Date:
Msg-id: 4B5A5AC6.1090302@2ndquadrant.com
In reply to: Re: Slow Query / Check Point Segments  (John R Pierce <pierce@hogranch.com>)
Responses: Re: Slow Query / Check Point Segments  (John R Pierce <pierce@hogranch.com>)
List: pgsql-general
John R Pierce wrote:
> we're having a similar problem with a very update intensive database
> that is part of a 24/7 manufacturing operation (no breathing time
> unless there's an unusual line down situtation)

Your problem is actually a bit different.

> 2010-01-23 01:08:13 MYTLOG:  checkpoint complete: wrote 7535 buffers
> (2.9%); 0 transaction log file(s) added, 0 removed, 20 recycled;
> write=86.510 s, sync=0.260 s, total=86.791 s
> 2010-01-23 01:13:22 MYTLOG:  checkpoint complete: wrote 12786 buffers
> (4.9%); 0 transaction log file(s) added, 0 removed, 17 recycled;
> write=95.028 s, sync=0.135 s, total=95.182 s
> 2010-01-23 01:16:22 MYTLOG:  checkpoint complete: wrote 11720 buffers
> (4.5%); 0 transaction log file(s) added, 0 removed, 19 recycled;
> write=41.854 s, sync=0.640 s, total=42.518 s
> 2010-01-23 01:19:11 MYTLOG:  checkpoint complete: wrote 10436 buffers
> (4.0%); 0 transaction log file(s) added, 0 removed, 20 recycled;
> write=50.330 s, sync=0.510 s, total=50.861 s
>

Your interval between checkpoints isn't too bad--3 minutes on the worst
of these.  And the amount written per checkpoint isn't that high either:
12786 buffers at 8kB each is only about 100MB.  Your problem is that the
sync on that 100MB write is taking an awfully long time.  Spacing your
checkpoints out further would help, sure, but I wouldn't be surprised to
find it doesn't do much for you.  Worth a shot, and unlikely to hurt
anything; I'm just not as optimistic about it being the right fix as I
was on the other message.
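
If you do want to experiment with spacing checkpoints out, the usual
knobs are in postgresql.conf.  The values below are just a sketch to show
the kind of range people try, not numbers tuned for your system:

    checkpoint_segments = 64              # default 3; allow more WAL between checkpoints
    checkpoint_timeout = 15min            # default 5min
    checkpoint_completion_target = 0.9    # 8.3+; spread checkpoint writes over more of the interval

Bumping checkpoint_segments/checkpoint_timeout makes each checkpoint
bigger but less frequent; the completion target just smooths the write
phase, which is already not your bottleneck here.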

> This is running on a 16 core UltrasparcIV server, solaris 10, zfs,
> tons of disks on SAN with multiple tablespaces on different ZFS's,
> with the block sizes tuned appropriately

My guess is that there's something wrong with your config such that
writes followed by fsync are taking longer than they should.  When I see
"sync=0.640 s" going to a SAN, where that sync operation should be nearly
instant, I'd be looking for issues in the ZFS intent log setup, how much
data you've set up ZFS to write cache, and the write cache policy on the
SAN hardware.  There's something weird going on here; that sync should
be near instant on your hardware with such a small write volume.
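
One generic thing worth checking on a ZFS-on-SAN setup like this (a
guess from here, not a diagnosis): ZFS issues a cache-flush to the device
after synchronous writes, and some arrays honor that by flushing their
battery-backed cache, which turns every fsync into a slow operation.
Something along these lines is where I'd start looking; treat the
tunable as an example to research, not a setting to copy blindly:

    # is there a separate intent log device on the pool?
    zpool status <yourpool>

    # /etc/system -- only consider this if the array cache is truly non-volatile
    set zfs:zfs_nocacheflush = 1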

Unfortunately, you may have to take the server down to find out exactly
what's going on, which doesn't sound practical for your environment.
(And people wonder why I have a near-religious zeal about testing disk
hardware before systems go into production.)
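
For what it's worth, the kind of test I mean doesn't have to be fancy.
Here's a bare-bones sketch (assumes a POSIX system and a throwaway file;
the file name and loop count are arbitrary) that times repeated 8kB
write+fsync cycles, which is roughly the pattern your WAL sync is doing:

    /* fsync_timer.c -- rough sketch: time LOOPS write+fsync cycles of 8kB each.
     * Build: cc -o fsync_timer fsync_timer.c
     * Run it on the same filesystem the WAL lives on. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define BLOCK 8192
    #define LOOPS 1000

    int main(void)
    {
        char buf[BLOCK];
        struct timeval start, end;
        double secs;
        int i;
        int fd = open("fsync_test.dat", O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd < 0) { perror("open"); return 1; }
        memset(buf, 'x', BLOCK);

        gettimeofday(&start, NULL);
        for (i = 0; i < LOOPS; i++) {
            if (write(fd, buf, BLOCK) != BLOCK) { perror("write"); return 1; }
            if (fsync(fd) != 0) { perror("fsync"); return 1; }
        }
        gettimeofday(&end, NULL);

        secs = (end.tv_sec - start.tv_sec) +
               (end.tv_usec - start.tv_usec) / 1e6;
        printf("%d write+fsync cycles in %.3f s (%.3f ms each)\n",
               LOOPS, secs, secs * 1000.0 / LOOPS);
        close(fd);
        unlink("fsync_test.dat");
        return 0;
    }

On hardware with a working non-volatile write cache in the path, each
cycle should come back in well under a millisecond; numbers in the tens
of milliseconds would point at the same cache-flush problem your
checkpoint sync times suggest.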

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com

