Re: Intel 710 pgbench write latencies

From Merlin Moncure
Subject Re: Intel 710 pgbench write latencies
Msg-id CAHyXU0xYU8XVpkDqMta1mQX07UbPo1ZqN-1Cti630h50JUESng@mail.gmail.com
In reply to Re: Intel 710 pgbench write latencies  (Yeb Havinga <yebhavinga@gmail.com>)
List pgsql-performance
On Wed, Nov 2, 2011 at 10:16 AM, Yeb Havinga <yebhavinga@gmail.com> wrote:
> On 2011-11-02 15:26, Merlin Moncure wrote:
>>
>> On Wed, Nov 2, 2011 at 8:05 AM, Yeb Havinga<yebhavinga@gmail.com>  wrote:
>>>
>>> Hello list,
>>>
>>> An OCZ Vertex 2 PRO and an Intel 710 SSD, both 100GB, in a software raid 1
>>> setup. I was pretty convinced this was the perfect solution to run
>>> PostgreSQL on SSDs without an IO controller with BBU. No worries about
>>> strange firmware bugs because of two different drives, good write
>>> endurance of the 710. Access to the smart attributes. Complete control
>>> over the disks: nothing hidden by a hardware raid IO layer.
>>>
>>> Then I did a pgbench test:
>>> - bigger than RAM test (~30GB database with 24GB ram)
>>> - during the test I removed the Intel 710, then 10 minutes later
>>> inserted it again and added it to the array.
>>>
>>> The pgbench transaction latency graph is here: http://imgur.com/JSdQd
>>>
>>> With only the OCZ, latencies are acceptable, but with two drives there
>>> are latencies up to 3 seconds! (and 11 seconds at disk remove time) Is
>>> this due to software raid, or is it the Intel 710? To figure that out I
>>> repeated the test, but now removing the OCZ; latency graph at:
>>> http://imgur.com/DQa59
>>> (The 12 seconds maximum was at disk remove time.)
>>>
>>> So the Intel 710 kind of sucks latency wise. Is it because it is also
>>> heavily reading, and maybe WAL should not be put on it?
>>>
>>> I did another test, same as before but
>>> * with 5GB database completely fitting in RAM (24GB)
>>> * put WAL on a ramdisk
>>> * started on the mirror
>>> * during the test mdadm --fail on the Intel SSD
>>>
>>> Latency graph is at: http://imgur.com/dY0Rk
>>>
>>> So still: with the Intel 710 participating in writes (beginning of
>>> graph), some latencies are over 2 seconds; with only the OCZ, max write
>>> latencies are near 300ms.
>>>
>>> I'm now contemplating not using the 710 at all. Why should I not buy two
>>> 6Gbps SSDs without supercap (e.g. Intel 510 and OCZ Vertex 3 Max IOPS)
>>> with an IO controller+BBU?
>>>
>>> Benefits: should be faster for all kinds of reads and writes.
>>> Concerns: TRIM becomes impossible (which was already impossible with md
>>> raid1; lvm / dm based mirroring could work), but is TRIM important for a
>>> PostgreSQL io load, without e.g. routine TRUNCATEs? Also the write
>>> endurance of these drives is probably a lot less than the previous setup.
>>
>> Software RAID (mdadm) is currently blocking TRIM.  The only way to
>> get TRIM in a raid-ish environment is through LVM mirroring/striping
>> or w/btrfs raid (which is not production ready afaik).
>>
>> Given that, if you do use software raid, it's not a good idea to
>> partition the entire drive because the very first thing the raid
>> driver does is write to the entire device.
>
> If that is bad because of a decreased lifetime, I don't think this number
> of writes is significant - in a few hours of pgbenching the GBs written
> are more than 10 times the GB sizes of the drives. Or do you suggest this
> because then the disk firmware can operate assuming a smaller IDEMA
> capacity, thereby prolonging the drive life? (i.e. the Intel 710 200GB has
> 200GB IDEMA capacity but 320GB raw flash).

It's bad because the controller thinks all the data is 'live' -- that
is, important.  When all the data on the drive is live, the fancy
tricks the controller pulls to do intelligent wear leveling and to get
fast write times become much more difficult, which in turn leads to
more write amplification and early burnout.  Supposedly, the 710 has
extra space anyway, which is probably there specifically to ameliorate
the raid issue as well as extend lifespan, but I'm still curious how
this works out.
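
To put rough numbers on the 'live' data point, here is a back-of-the-envelope
sketch in Python. It uses the 200GB IDEMA / 320GB raw flash figures quoted
above; the 160GB partition size and the spare_ratio helper are made up for
illustration, and it assumes the unpartitioned area is never written (or has
been trimmed) so the controller can really treat it as free. It only
computes spare-area ratios and says nothing about the resulting write
amplification, which depends on the firmware.

```python
def spare_ratio(raw_gb: float, live_gb: float) -> float:
    """Spare flash as a fraction of the space the controller must keep live."""
    if live_gb <= 0 or live_gb > raw_gb:
        raise ValueError("need 0 < live_gb <= raw_gb")
    return (raw_gb - live_gb) / live_gb


RAW_GB = 320.0    # raw flash of the 200GB Intel 710, figure quoted above
IDEMA_GB = 200.0  # advertised (IDEMA) capacity, also quoted above

# Whole device given to md: the initial sync writes every LBA,
# so all 200GB look live to the controller.
full = spare_ratio(RAW_GB, IDEMA_GB)

# Hypothetical: only a 160GB partition is in the array and the
# remaining 40GB is never written.
partial = spare_ratio(RAW_GB, 160.0)

print(f"whole drive in the array: {full:.0%} spare")
print(f"160GB partition only:     {partial:.0%} spare")
```

The more spare area relative to live data, the more room the controller has
for garbage collection and wear leveling, which is the reason for leaving
part of the drive unpartitioned when TRIM is not available.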

merlin
