Re: Amazon High I/O instances

From: Sébastien Lorion
Subject: Re: Amazon High I/O instances
Msg-id: CAGa5y0OBKu1gJkeckQBGNgHGWezkK=JVQfjj+FOi-PGm-8-dLg@mail.gmail.com
In reply to: Re: Amazon High I/O instances (Sébastien Lorion <sl@thestrangefactory.com>)
List: pgsql-general
pgbench initialization has been going on for almost 5 hours now and is still stuck before vacuum starts .. something is definitely wrong, as I don't remember it taking this long the first time I created the db. Here are the current stats:

iostat (xbd13-14 are WAL zpool)

 device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
xbd8     161.3 109.8  1285.4  3450.5    0  12.5  19
xbd7     159.5 110.6  1272.3  3450.5    0  11.4  14
xbd6     161.1 108.8  1284.4  3270.6    0  10.9  14
xbd5     159.5 109.0  1273.1  3270.6    0  11.6  15
xbd14      0.0   0.0     0.0     0.0    0   0.0   0
xbd13      0.0   0.0     0.0     0.0    0   0.0   0
xbd12    204.6 110.8  1631.3  3329.2    0   9.1  15
xbd11    216.0 111.2  1722.5  3329.2    1   8.6  16
xbd10    197.2 109.4  1573.5  3285.8    0   9.8  15
xbd9     195.0 109.4  1557.1  3285.8    0   9.9  15

zpool iostat (db pool)
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
db           143G   255G  1.40K  1.53K  11.2M  12.0M

vmstat

procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr ad0 xb8   in   sy   cs us sy id
 0 0 0   5634M    28G     7   0   0   0  7339   0   0 245 2091 6358 20828  2  5 93
 0 0 0   5634M    28G    10   0   0   0  6989   0   0 312 1993 6033 20090  1  4 95
 0 0 0   5634M    28G     7   0   0   0  6803   0   0 292 1974 6111 22763  2  5 93
 0 0 0   5634M    28G    10   0   0   0  7418   0   0 339 2041 6170 20838  2  4 94
 0 0 0   5634M    28G   123   0   0   0  6980   0   0 282 1977 5906 19961  2  4 94

top

 last pid:  2430;  load averages:  0.72,  0.73,  0.69         up 0+04:56:16  04:52:53
32 processes:  1 running, 31 sleeping
CPU:  1.8% user,  0.0% nice,  5.3% system,  1.4% interrupt, 91.5% idle
Mem: 1817M Active, 25M Inact, 36G Wired, 24K Cache, 699M Buf, 28G Free
Swap:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1283 pgsql       1  34    0  3967M  1896M zio->i  5  80:14 21.00% postgres
 1282 pgsql       1  25    0 25740K  3088K select  2  10:34  0.00% pgbench
 1274 pgsql       1  20    0  2151M 76876K select  1   0:09  0.00% postgres
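
The postgres backend sitting in the "zio->i" state (waiting on ZFS I/O) suggests it is blocked on disk rather than spinning. A quick way to confirm what it is actually doing (a sketch; the column names assume 9.2's pg_stat_activity, since 9.1 and earlier used procpid/current_query instead):

    psql -c "SELECT pid, state, waiting, query FROM pg_stat_activity;"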

On Wed, Sep 12, 2012 at 9:16 PM, Sébastien Lorion <sl@thestrangefactory.com> wrote:
I recreated the DB and WAL pools, and launched pgbench -i -s 10000. Here are the stats during the load (still running):

iostat (xbd13-14 are WAL zpool)
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
xbd8       0.0 471.5     0.0 14809.3   40  67.9  84
xbd7       0.0 448.1     0.0 14072.6   39  62.0  74
xbd6       0.0 472.3     0.0 14658.6   39  61.3  77
xbd5       0.0 464.7     0.0 14433.1   39  61.4  76
xbd14      0.0   0.0     0.0     0.0    0   0.0   0
xbd13      0.0   0.0     0.0     0.0    0   0.0   0
xbd12      0.0 460.1     0.0 14189.7   40  63.4  78
xbd11      0.0 462.9     0.0 14282.8   40  61.8  76
xbd10      0.0 477.0     0.0 14762.1   38  61.2  77
xbd9       0.0 477.6     0.0 14796.2   38  61.1  77

zpool iostat (db pool)
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
db          11.1G   387G      0  6.62K      0  62.9M


vmstat
procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr ad0 xb8   in   sy   cs us sy id
 0 0 0   3026M    35G   126   0   0   0 29555   0   0 478 2364 31201 26165 10  9 81

top
last pid:  1333;  load averages:  1.89,  1.65,  1.08      up 0+01:17:08  01:13:45
32 processes:  2 running, 30 sleeping
CPU: 10.3% user,  0.0% nice,  7.8% system,  1.2% interrupt, 80.7% idle
Mem: 26M Active, 19M Inact, 33G Wired, 16K Cache, 25M Buf, 33G Free
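
For scale, -s 10000 means pgbench builds 10000 x 100,000 = 1,000,000,000 rows in pgbench_accounts, i.e. roughly 15MB per scale unit, or on the order of 150GB on disk once the index is built (a rule of thumb, not an exact figure). That lines up with the ~143G "alloc" shown for the db pool in the newer stats at the top of this message.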



On Wed, Sep 12, 2012 at 9:02 PM, Sébastien Lorion <sl@thestrangefactory.com> wrote:
>
> One more question .. I could not set wal_sync_method to anything other than fsync .. is that expected, or should other choices also be available? I am not sure how SSD cache flushing is handled on EC2, but I hope the whole cache is flushed on every sync .. As a side note, when I first ran my tests I got corrupted databases (errors about pg_xlog directories not found, etc.), and I suspect it was because of vfs.zfs.cache_flush_disable=1, though I cannot prove it for sure.
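>
> To see exactly which values this build accepts (a sketch; pg_settings exposes the platform-dependent choices in its enumvals column):
>
>     psql -c "SELECT name, setting, enumvals FROM pg_settings WHERE name = 'wal_sync_method';"
>
>     # and to double-check the ZFS setting mentioned above:
>     sysctl vfs.zfs.cache_flush_disable
>
> Options such as fdatasync or open_datasync only appear on platforms whose libc supports the underlying calls, so a short list on FreeBSD would not be surprising.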
>
> Sébastien
>
>
> On Wed, Sep 12, 2012 at 8:49 PM, Sébastien Lorion <sl@thestrangefactory.com> wrote:
>>
>> Is dedicating 2 drives to the WAL too much? Since my whole array is made up of SSD drives, should I just put the WAL in the main pool?
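>>
>> For reference, a layout consistent with the iostat numbers above (assuming mirrored pairs, inferred from the matching per-pair write rates; the actual vdev layout was not shown):
>>
>>     zpool create db  mirror xbd5 xbd6  mirror xbd7 xbd8 \
>>                      mirror xbd9 xbd10 mirror xbd11 xbd12
>>     zpool create wal mirror xbd13 xbd14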
>>
>> Sébastien
>>
>>
>> On Wed, Sep 12, 2012 at 8:28 PM, Sébastien Lorion <sl@thestrangefactory.com> wrote:
>>>
>>> Ok, makes sense .. I will update that as well and report back. Thank you for your advice.
>>>
>>> Sébastien
>>>
>>>
>>> On Wed, Sep 12, 2012 at 8:04 PM, John R Pierce <pierce@hogranch.com> wrote:
>>>>
>>>> On 09/12/12 4:49 PM, Sébastien Lorion wrote:
>>>>>
>>>>> You set shared_buffers way below what is suggested in Greg Smith's book (25% or more of RAM) .. what is the rationale behind that rule of thumb? The other values are more or less what I set, though I could lower effective_cache_size and vfs.zfs.arc_max and see how it goes.
>>>>
>>>>
>>>> I think those 25% rules of thumb date from when RAM was typically no more than 4-8GB.
>>>>
>>>> For our highly transactional workload, at least, too large a shared_buffers seems to slow us down, perhaps due to the higher overhead of managing that many 8k buffers. I've heard that read-mostly workloads, such as data warehousing, can take advantage of larger buffer counts.
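>>>>
>>>> As a rough sketch, on a box with ~60GB of RAM like this one, that reasoning leads to postgresql.conf values along these lines (illustrative numbers, not a recommendation):
>>>>
>>>>     shared_buffers = 4GB          # well below the 25%-of-RAM rule
>>>>     effective_cache_size = 32GB   # planner hint; roughly ZFS ARC + shared_buffers
>>>>     checkpoint_segments = 64      # spread out checkpoint I/O during bulk loads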
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> john r pierce                            N 37, W 122
>>>> santa cruz ca                         mid-left coast
>>>>
>>>
>>>
>>
>
