Discussion: SSD performance


SSD performance

From
david@lang.hm
Date:
I spotted an interesting new SSD review. It's a $379 5.25" drive-bay device
that holds up to 8 DDR2 DIMMs (up to 8G per DIMM) and appears to the
system as a SATA drive (or a pair of SATA drives that you can RAID-0 to
get past the 300MB/s SATA bottleneck).
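
For reference, a minimal sketch of what striping the two exported SATA
channels looks like with Linux software RAID; the device names and mount
point are assumptions, not something the review specifies:

    # the unit shows up as two SATA drives, assumed here to be /dev/sdb and /dev/sdc
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
    mkfs.ext3 /dev/md0                 # any filesystem will do
    mount /dev/md0 /mnt/ramdrive       # placeholder mount point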

The best review I've seen only ran it on Windows (and on a relatively old
hardware platform at that); I suspect its performance would be even
better under Linux and with a top-notch controller card (especially with
the RAID option).

It has a battery backup (good for 4 hours or so) and a CF card slot that it
can back the RAM up to (~20 min to save 32G and ~15 min to restore, so not
something you want to rely on routinely, but a good safety net).

The review also includes the Intel X-25E and X-25M drives (along with a
variety of SCSI and SATA drives):

http://techreport.com/articles.x/16255/1

Equipped with 16G, the street price should be ~$550; with 32G it should be
~$1200, and with 64G even more expensive, but the performance is very good.
There are times when the X-25E matches it or edges it out in these tests,
so there is room for additional improvement, but as I noted above it may
do better with a better controller and a non-Windows OS.

Power consumption is slightly higher than that of normal hard drives, at
about 12W (_much_ higher than the X-25).

They also have a review of the X-25E vs. the X-25M:

http://techreport.com/articles.x/15931/1

One thing that both of these reviews show is that if you are doing a
significant amount of writing, the X-25M is no better than a normal hard
drive (and much of the time it sits in the middle to bottom of the pack
compared to normal hard drives).

David Lang

Re: SSD performance

From
Glyn Astill
Date:
> I spotted a new interesting SSD review. it's a $379
> 5.25" drive bay device that holds up to 8 DDR2 DIMMS
> (up to 8G per DIMM) and appears to the system as a SATA
> drive (or a pair of SATA drives that you can RAID-0 to get
> past the 300MB/s SATA bottleneck)
>

Sounds very similar to the Gigabyte iRam drives of a few years ago

http://en.wikipedia.org/wiki/I-RAM






Re: SSD performance

From
david@lang.hm
Date:
On Fri, 23 Jan 2009, Glyn Astill wrote:

>> I spotted a new interesting SSD review. it's a $379
>> 5.25" drive bay device that holds up to 8 DDR2 DIMMS
>> (up to 8G per DIMM) and appears to the system as a SATA
>> drive (or a pair of SATA drives that you can RAID-0 to get
>> past the 300MB/s SATA bottleneck)
>>
>
> Sounds very similar to the Gigabyte iRam drives of a few years ago
>
> http://en.wikipedia.org/wiki/I-RAM

Similar concept, but there are some significant differences.

The iRam was limited to 4G, used DDR RAM, and used a PCI slot for power
(which can be in short supply nowadays).

This new drive can go to 64G, uses DDR2 RAM (cheaper than DDR nowadays),
gets powered like a normal SATA drive, can use two SATA channels (to get
past the throughput limits of a single SATA interface), and has a CF card
slot to back the data up to if the system powers down.

Plus, the performance appears to be significantly better (even without
using the second SATA interface).

David Lang


Re: SSD performance

From
Luke Lonergan
Date:
Why not simply plug your server into a UPS and get 10-20x the performance using the same approach (with OS IO cache)?

In fact, with the server it's more robust, as you don't have to transit several intervening physical devices to get to
the RAM.
 

If you want a file interface, declare a RAMDISK.

Cheaper/faster/improved reliability.
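
For concreteness, a minimal sketch of the ramdisk approach being suggested,
assuming a Linux tmpfs mount; the mount point and size are placeholders,
not anything spelled out above:

    mount -t tmpfs -o size=32g tmpfs /mnt/ramdisk   # contents vanish on any reboot or crash
    # PostgreSQL temp space (or, if you accept the risk, a whole tablespace)
    # could then be pointed at /mnt/ramdisk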

- Luke

----- Original Message -----
From: pgsql-performance-owner@postgresql.org <pgsql-performance-owner@postgresql.org>
To: Glyn Astill <glynastill@yahoo.co.uk>
Cc: pgsql-performance@postgresql.org <pgsql-performance@postgresql.org>
Sent: Fri Jan 23 04:39:07 2009
Subject: Re: [PERFORM] SSD performance

On Fri, 23 Jan 2009, Glyn Astill wrote:

>> I spotted a new interesting SSD review. it's a $379
>> 5.25" drive bay device that holds up to 8 DDR2 DIMMS
>> (up to 8G per DIMM) and appears to the system as a SATA
>> drive (or a pair of SATA drives that you can RAID-0 to get
>> past the 300MB/s SATA bottleneck)
>>
>
> Sounds very similar to the Gigabyte iRam drives of a few years ago
>
> http://en.wikipedia.org/wiki/I-RAM

Similar concept, but there are some significant differences.

The iRam was limited to 4G, used DDR RAM, and used a PCI slot for power
(which can be in short supply nowadays).

This new drive can go to 64G, uses DDR2 RAM (cheaper than DDR nowadays),
gets powered like a normal SATA drive, can use two SATA channels (to get
past the throughput limits of a single SATA interface), and has a CF card
slot to back the data up to if the system powers down.

Plus, the performance appears to be significantly better (even without
using the second SATA interface).

David Lang



Re: SSD performance

From
david@lang.hm
Date:
On Fri, 23 Jan 2009, Luke Lonergan wrote:

> Why not simply plug your server into a UPS and get 10-20x the
> performance using the same approach (with OS IO cache)?
>
> In fact, with the server it's more robust, as you don't have to transit
> several intervening physical devices to get to the RAM.
>
> If you want a file interface, declare a RAMDISK.
>
> Cheaper/faster/improved reliability.

You can also disable fsync so you don't have to wait for your disks, if you
trust your system to never go down. Personally, I don't trust any system
not to go down.
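
For anyone who wants to see what that looks like, a minimal sketch; the data
directory path is a placeholder, and with fsync off any crash can lose or
corrupt committed data:

    # in postgresql.conf:
    #   fsync = off        # stop waiting for WAL writes to reach stable storage
    pg_ctl reload -D /path/to/data    # fsync can be changed with a reload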

If you have a system crash or reboot, your RAMDISK will lose its contents;
this device won't.

Also, you are limited by how many DIMMs you can put on your motherboard
(for the dual-socket systems I am buying nowadays, I'm limited to 32G of
RAM), and going to a different motherboard that can support additional RAM
can be quite expensive.

This isn't for everyone, but for people who need both the performance and
the data reliability, this looks like a very interesting option.

David Lang

> - Luke
>
> ----- Original Message -----
> From: pgsql-performance-owner@postgresql.org <pgsql-performance-owner@postgresql.org>
> To: Glyn Astill <glynastill@yahoo.co.uk>
> Cc: pgsql-performance@postgresql.org <pgsql-performance@postgresql.org>
> Sent: Fri Jan 23 04:39:07 2009
> Subject: Re: [PERFORM] SSD performance
>
> On Fri, 23 Jan 2009, Glyn Astill wrote:
>
>>> I spotted a new interesting SSD review. it's a $379
>>> 5.25" drive bay device that holds up to 8 DDR2 DIMMS
>>> (up to 8G per DIMM) and appears to the system as a SATA
>>> drive (or a pair of SATA drives that you can RAID-0 to get
>>> past the 300MB/s SATA bottleneck)
>>>
>>
>> Sounds very similar to the Gigabyte iRam drives of a few years ago
>>
>> http://en.wikipedia.org/wiki/I-RAM
>
> similar concept, but there are some significant differences
>
> the iRam was limited to 4G, used DDR ram, and used a PCI slot for power
> (which can be in
> short supply nowdays)
>
> this new drive can go to 64G, uses DDR2 ram (cheaper than DDR nowdays),
> gets powered like a normal SATA drive, can use two SATA channels (to be
> able to get past the throughput limits of a single SATA interface), and
> has a CF card slot to backup the data to if the system powers down.
>
> plus the performance appears to be significantly better (even without
> using the second SATA interface)
>
> David Lang
>
>
>

Re: SSD performance

From
Matthew Wakeling
Date:
On Fri, 23 Jan 2009, Luke Lonergan wrote:
> Why not simply plug your server into a UPS and get 10-20x the
> performance using the same approach (with OS IO cache)?
>
> In fact, with the server it's more robust, as you don't have to transit
> several intervening physical devices to get to the RAM.
>
> If you want a file interface, declare a RAMDISK.
>
> Cheaper/faster/improved reliability.

I'm sure we have gone over that one before. With that method, your data is
at the mercy of the *entire system*. Any fault in any part of the computer
(hardware or software) will result in the loss of all your data. In
contrast, a RAM-based SSD is isolated from such failures, especially if it
backs up to another device on power fail. You can completely trash the
computer, remove the SSD and put it into another machine, and boot it up
as normal.

Computers break. Nothing is going to stop that from happening. Except VMS
maybe.

I'm not disputing that your method is faster, though.

Matthew

--
 "Finger to spiritual emptiness underlying everything."
        -- How a foreign C manual referred to a "pointer to void."

Re: SSD performance

From
Luke Lonergan
Date:
Hmm - I wonder what OS it runs ;-)

- Luke

----- Original Message -----
From: david@lang.hm <david@lang.hm>
To: Luke Lonergan
Cc: glynastill@yahoo.co.uk <glynastill@yahoo.co.uk>; pgsql-performance@postgresql.org
<pgsql-performance@postgresql.org>
Sent: Fri Jan 23 04:52:27 2009
Subject: Re: [PERFORM] SSD performance

On Fri, 23 Jan 2009, Luke Lonergan wrote:

> Why not simply plug your server into a UPS and get 10-20x the
> performance using the same approach (with OS IO cache)?
>
> In fact, with the server it's more robust, as you don't have to transit
> several intervening physical devices to get to the RAM.
>
> If you want a file interface, declare a RAMDISK.
>
> Cheaper/faster/improved reliability.

You can also disable fsync so you don't have to wait for your disks, if you
trust your system to never go down. Personally, I don't trust any system
not to go down.

If you have a system crash or reboot, your RAMDISK will lose its contents;
this device won't.

Also, you are limited by how many DIMMs you can put on your motherboard
(for the dual-socket systems I am buying nowadays, I'm limited to 32G of
RAM), and going to a different motherboard that can support additional RAM
can be quite expensive.

This isn't for everyone, but for people who need both the performance and
the data reliability, this looks like a very interesting option.

David Lang

> - Luke
>
> ----- Original Message -----
> From: pgsql-performance-owner@postgresql.org <pgsql-performance-owner@postgresql.org>
> To: Glyn Astill <glynastill@yahoo.co.uk>
> Cc: pgsql-performance@postgresql.org <pgsql-performance@postgresql.org>
> Sent: Fri Jan 23 04:39:07 2009
> Subject: Re: [PERFORM] SSD performance
>
> On Fri, 23 Jan 2009, Glyn Astill wrote:
>
>>> I spotted a new interesting SSD review. it's a $379
>>> 5.25" drive bay device that holds up to 8 DDR2 DIMMS
>>> (up to 8G per DIMM) and appears to the system as a SATA
>>> drive (or a pair of SATA drives that you can RAID-0 to get
>>> past the 300MB/s SATA bottleneck)
>>>
>>
>> Sounds very similar to the Gigabyte iRam drives of a few years ago
>>
>> http://en.wikipedia.org/wiki/I-RAM
>
> similar concept, but there are some significant differences
>
> the iRam was limited to 4G, used DDR ram, and used a PCI slot for power
> (which can be in
> short supply nowdays)
>
> this new drive can go to 64G, uses DDR2 ram (cheaper than DDR nowdays),
> gets powered like a normal SATA drive, can use two SATA channels (to be
> able to get past the throughput limits of a single SATA interface), and
> has a CF card slot to backup the data to if the system powers down.
>
> plus the performance appears to be significantly better (even without
> using the second SATA interface)
>
> David Lang
>
>
>

Re: SSD performance

From
Craig Ringer
Date:
Luke Lonergan wrote:
> Why not simply plug your server into a UPS and get 10-20x the performance using the same approach (with OS IO cache)?

A big reason is that your machine may already have as much RAM as is
currently economical to install. Hardware with LOTS of RAM slots can
cost quite a bit.

Another reason is that these devices won't lose data because of an
unexpected OS reboot. If they're fitted with a battery backup and CF
media for emergency write-out, they won't lose data if your UPS runs out
of juice either.

I'd be much more confident with something like those devices than I
would with an OS ramdisk plus startup/shutdown scripts to initialize it
from a file and write it out to a file. Wouldn't it be a pain if the UPS
didn't give the OS enough warning to write the RAM disk out before
losing power...

In any case, you're very rarely better off dedicating host memory to a
ramdisk rather than using the normal file system and letting the host
cache it. A ramdisk really only seems to help when you're really using
it to bypass safeties like the effects of fsync() and ordered
journaling. There are other ways to avoid those if you really don't care
about your data.

These devices would be interesting for a few uses, IMO. One is temp
table space and sort space in Pg. Another is scratch space for apps
(like Photoshop) that do their own VM management. There's also potential
for use as 1st priority OS swap space, though at least on Linux I think
the CPU overhead involved in swapping is so awful you wouldn't benefit
from it much.
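
As a minimal sketch of the temp-space idea, assuming the device is mounted
at a made-up path and an 8.3-era setup (temp_tablespaces is the relevant
knob):

    mkdir -p /mnt/ssd/pgtemp && chown postgres /mnt/ssd/pgtemp   # placeholder path
    psql -c "CREATE TABLESPACE fast_temp LOCATION '/mnt/ssd/pgtemp'"
    # then set temp_tablespaces = 'fast_temp' in postgresql.conf and reload,
    # so sorts and temp tables spill to the fast device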

I've been hoping this sort of thing would turn up again in a new
incarnation with battery backup and CF/SD for safety when the battery goes
flat.

--
Craig Ringer

Re: SSD performance

From
Florian Weimer
Date:
* Craig Ringer:

> I'd be much more confident with something like those devices than I
> would with an OS ramdisk plus startup/shutdown scripts to initialize it
> from a file and write it out to a file. Wouldn't it be a pain if the UPS
> didn't give the OS enough warning to write the RAM disk out before
> losing power...

The cache warm-up time can also be quite annoying.  Of course, with
flash-backed DRAM, this is a concern as long as you use the cheaper,
slower variants for the backing storage.

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99

Re: SSD performance

From
Merlin Moncure
Date:
On 1/23/09, david@lang.hm <david@lang.hm> wrote:
>  the review also includes the Intel X-25E and X-25M drives (along with a
> variety of SCSI and SATA drives)
>

The x25-e is a game changer for database storage.  It's still a little
pricey for what it does but who can argue with these numbers?
http://techreport.com/articles.x/15931/9

merlin

Re: SSD performance

From
david@lang.hm
Date:
On Fri, 23 Jan 2009, Merlin Moncure wrote:

> On 1/23/09, david@lang.hm <david@lang.hm> wrote:
>>  the review also includes the Intel X-25E and X-25M drives (along with a
>> variety of SCSI and SATA drives)
>>
>
> The x25-e is a game changer for database storage.  It's still a little
> pricey for what it does but who can argue with these numbers?
> http://techreport.com/articles.x/15931/9

Take a look at this RAM-based drive; specifically, look at the numbers here:
http://techreport.com/articles.x/16255/9

They are about as far above the X25-E as the X25-E is above normal
drives.

David Lang


Re: SSD performance

From
"M. Edward (Ed) Borasky"
Date:
david@lang.hm wrote:
> On Fri, 23 Jan 2009, Luke Lonergan wrote:
>
>> Why not simply plug your server into a UPS and get 10-20x the
>> performance using the same approach (with OS IO cache)?
>>
>> In fact, with the server it's more robust, as you don't have to
>> transit several intervening physical devices to get to the RAM.
>>
>> If you want a file interface, declare a RAMDISK.
>>
>> Cheaper/faster/improved reliability.
>
> you can also disable fsync to not wait for your disks if you trust your
> system to never go down. personally I don't trust any system to not go
> down.
>
> if you have a system crash or reboot your RAMDISK will loose it's
> content, this device won't.
>
> also you are limited to how many DIMMS you can put on your motherboard
> (for the dual-socket systems I am buying nowdays, I'm limited to 32G of
> ram) going to a different motherboard that can support additional ram
> can be quite expensive.
>
> this isn't for everyone, but for people who need the performance, data
> reliability, this looks like a very interesting option.
>
> David Lang
>
>> - Luke
>>
>> ----- Original Message -----
>> From: pgsql-performance-owner@postgresql.org
>> <pgsql-performance-owner@postgresql.org>
>> To: Glyn Astill <glynastill@yahoo.co.uk>
>> Cc: pgsql-performance@postgresql.org <pgsql-performance@postgresql.org>
>> Sent: Fri Jan 23 04:39:07 2009
>> Subject: Re: [PERFORM] SSD performance
>>
>> On Fri, 23 Jan 2009, Glyn Astill wrote:
>>
>>>> I spotted a new interesting SSD review. it's a $379
>>>> 5.25" drive bay device that holds up to 8 DDR2 DIMMS
>>>> (up to 8G per DIMM) and appears to the system as a SATA
>>>> drive (or a pair of SATA drives that you can RAID-0 to get
>>>> past the 300MB/s SATA bottleneck)
>>>>
>>>
>>> Sounds very similar to the Gigabyte iRam drives of a few years ago
>>>
>>> http://en.wikipedia.org/wiki/I-RAM
>>
>> similar concept, but there are some significant differences
>>
>> the iRam was limited to 4G, used DDR ram, and used a PCI slot for power
>> (which can be in
>> short supply nowdays)
>>
>> this new drive can go to 64G, uses DDR2 ram (cheaper than DDR nowdays),
>> gets powered like a normal SATA drive, can use two SATA channels (to be
>> able to get past the throughput limits of a single SATA interface), and
>> has a CF card slot to backup the data to if the system powers down.
>>
>> plus the performance appears to be significantly better (even without
>> using the second SATA interface)
>>
>> David Lang
>>
>>
>>
>

Can I call a time out here? :) There are "always" going to be memory
hierarchies -- registers on the processors, multiple levels of caches,
RAM used for programs / data / I/O caches, and non-volatile rotating
magnetic storage. And there are "always" going to be new hardware
technologies cropping up at various levels in the hierarchy.

There are always going to be cost / reliability / performance
trade-offs, leading to "interesting" though perhaps not really
business-relevant "optimizations". The equations are there for anyone to
use should they want to optimize for a given workload at a given point
in time with given business / service level constraints. See

http://www.amazon.com/Storage-Network-Performance-Analysis-Huseyin/dp/076451685X

for all the details.

I question, however, whether there's much point in seeking an optimum.
As was noted long ago by Nobel laureate Herbert Simon, in actual fact
managers / businesses rarely optimize. Instead, they satisfice. They do
what is "good enough", not what is best. And my own personal opinion in
the current context -- PostgreSQL running on an open-source operating
system -- is that

* large-capacity inexpensive rotating disks,
* a hardware RAID controller containing a battery-backed cache,
* as much RAM as one can afford and the chassis will hold, and
* enough cores to keep the workload from becoming processor-bound

are good enough. And given that, a moderate amount of software tweaking
and balancing will get you close to a local optimum.

--
M. Edward (Ed) Borasky

I've never met a happy clam. In fact, most of them were pretty steamed.

Re: SSD performance

From
"Joshua D. Drake"
Date:
On Fri, 2009-01-23 at 09:22 -0800, M. Edward (Ed) Borasky wrote:

> I question, however, whether there's much point in seeking an optimum.
> As was noted long ago by Nobel laureate Herbert Simon, in actual fact
> managers / businesses rarely optimize. Instead, they satisfice. They do
> what is "good enough", not what is best. And my own personal opinion in
> the current context -- PostgreSQL running on an open-source operating
> system -- is that

This community is notorious for "optimum". MySQL is notorious for "satisfy".

Which one would you rather store your financial information in?

I actually agree with you to a degree. A loud faction of this community
spends a little too much time mentally masturbating but without that we
wouldn't have a lot of the very interesting features we have now.


There is no correct in left.
There is no correct in right.
Correctness is the result of friction caused by the mingling of the two.

Sincerely,

Joshua D. Drake


--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


Re: SSD performance

From
Matthew Wakeling
Date:
On Fri, 23 Jan 2009, M. Edward (Ed) Borasky wrote:
> * large-capacity inexpensive rotating disks,
> * a hardware RAID controller containing a battery-backed cache,
> * as much RAM as one can afford and the chassis will hold, and
> * enough cores to keep the workload from becoming processor-bound
>
> are good enough. And given that, a moderate amount of software tweaking
> and balancing will get you close to a local optimum.

That's certainly the case for very large-scale (in terms of data quantity)
databases. However, these solid state devices do have quite an advantage
when what you want to scale is the performance, rather than the data
quantity.

The thing is, it isn't just a matter of the storage hierarchy. There's the
volatility issue as well. What you have in these SSDs is a device
which is non-volatile, like a disc, but fast, like RAM.

Matthew

--
 Anyone who goes to a psychiatrist ought to have his head examined.

Re: SSD performance

From
"M. Edward (Ed) Borasky"
Date:
Joshua D. Drake wrote:
> This community is notorious for "optimum". MySQL is notorious for "satisfy".

Within *this* community, MySQL is just plain notorious. Let's face it --
we are *not* dolphin-safe.

<ducking>

>
> Which one would you rather store your financial information in?

The one that had the best data integrity, taking into account the RDBMS
*and* the hardware and other software.

> I actually agree with you to a degree. A loud faction of this community
> spends a little too much time mentally masturbating but without that we
> wouldn't have a lot of the very interesting features we have now.

Yes -- you will never hear *me* say "Premature optimization is the root
of all evil." I don't know why Hoare or Dijkstra or Knuth or Wirth or
whoever coined that phrase, but it's been used too many times as an
excuse for not doing any performance engineering, forcing the deployed
"solution" to throw hardware at performance issues.


>
>
> There is no correct in left.
> There is no correct in right.
> Correctness is the result of friction caused by the mingling of the two.

"The only good I/O is a dead I/O" -- Mark Friedman

--
M. Edward (Ed) Borasky

I've never met a happy clam. In fact, most of them were pretty steamed.

Re: SSD performance

From
Greg Smith
Date:
On Fri, 23 Jan 2009, david@lang.hm wrote:

> take a look at this ram based drive, specificly look at the numbers here
> http://techreport.com/articles.x/16255/9
> they are about as much above the X25-e as the X25-e is above normal drives.

They're so close to having a killer product with that one.  All they need
to do is make the backup to the CF card automatic once the battery backup
power drops low (but not so low there's not enough power to do said
backup) and it would actually be a reasonable solution.  The whole
battery-backed cache approach is risky enough when the battery is expected
to last a day or two; with this product only giving 4 hours, it's not hard
to imagine situations where you'd lose everything on there.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: SSD performance

From
david@lang.hm
Date:
On Sun, 25 Jan 2009, Greg Smith wrote:

> On Fri, 23 Jan 2009, david@lang.hm wrote:
>
>> take a look at this ram based drive, specificly look at the numbers here
>> http://techreport.com/articles.x/16255/9
>> they are about as much above the X25-e as the X25-e is above normal drives.
>
> They're so close to having a killer product with that one.  All they need to
> do is make the backup to the CF card automatic once the battery backup power
> drops low (but not so low there's not enough power to do said backup) and it
> would actually be a reasonable solution.  The whole battery-backed cache
> approach is risky enough when the battery is expected to last a day or two;
> with this product only giving 4 hours, it not hard to imagine situations
> where you'd lose everything on there.

They currently have it do a backup immediately on power loss (which is a
safe choice, as the contents won't be changing without power), but it then
powers off (which is not good for startup time afterwards).

David Lang

Re: SSD performance

From
Gregory Stark
Date:
david@lang.hm writes:

> they currently have it do a backup immediatly on power loss (which is a safe
> choice as the contents won't be changing without power), but it then powers off
> (which is not good for startup time afterwords)

So if you have a situation where it's power cycling rapidly, each iteration
drains the battery by the time it takes to save the state but only charges
it for the time the power is on. I wonder how many iterations that gives you.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

Re: SSD performance

From
david@lang.hm
Date:
On Sun, 25 Jan 2009, Gregory Stark wrote:

> david@lang.hm writes:
>
>> they currently have it do a backup immediatly on power loss (which is a safe
>> choice as the contents won't be changing without power), but it then powers off
>> (which is not good for startup time afterwords)
>
> So if you have a situation where it's power cycling rapidly each iteration
> drains the battery of the time it takes to save the state but only charges it
> for the time the power is on. I wonder how many iterations that gives you.

Good question.

Assuming that it's smart enough not to start a save if it didn't finish
doing a restore, and going from the timings in the article (~20 min save,
~15 min load, and 4-hour battery life), you would get ~12 cycles from the
initial battery, plus whatever you could get from the battery charging
(~3 hours of charging during the initial battery time).

If the battery could be fully charged in 3 hours, it could keep doing this
indefinitely.

If it takes 6 hours it would only get a half charge each time, so
12+6+3+1 = 22 cycles.

But even the initial 12 cycles is long enough that you should probably be
taking action by then.
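
As a quick sanity check of that first figure (the ~20-minute save time and
4-hour battery life come from the review; the rest is just arithmetic):

    echo $(( (4 * 60) / 20 ))    # 4 hours of battery / ~20 min per save pass = 12 cycles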

In most situations you are going to have a UPS on your system anyway, and
it will have the same type of problem (but usually with _much_ less than 4
hours' worth of operation to start with).

So while you could lose data from intermittent power, I think you would
be far more likely to lose data due to a defective battery, or the CF card
not being fully seated, or something like that.

David Lang

Re: SSD performance

From
James Mansion
Date:
Craig Ringer wrote:
> These devices would be interesting for a few uses, IMO. One is temp
> table space and sort space in Pg. Another is scratch space for apps
> (like Photoshop) that do their own VM management. There's also potential
>
Surely temp tables and sort space aren't subject to fsync and won't gain
that much, since they should stay in the OS cache?  The device will surely
help seek- or sync-bound tasks.

Doesn't that make it a good candidate for WAL and hot tables?

James


Re: SSD performance

From
david@lang.hm
Date:
On Tue, 27 Jan 2009, James Mansion wrote:

> Craig Ringer wrote:
>> These devices would be interesting for a few uses, IMO. One is temp
>> table space and sort space in Pg. Another is scratch space for apps
>> (like Photoshop) that do their own VM management. There's also potential
>>
> Surely temp tables and sort space isn't subject to fsync and won't gain that
> much since they
> should stay in the OS cache?  The device will surely help seek- or sync-bound
> tasks.
>
> Doesn't that make it a good candidate for WAL and hot tables?

It doesn't just gain on fsync speed, but also on raw transfer speed.

If everything stays in the OS buffers then you are right, but once you
start to exceed those buffers, fast storage like this becomes very
useful.

David Lang

Re: SSD performance

From
Scott Carey
Date:



On 1/23/09 3:35 AM, "david@lang.hm" <david@lang.hm> wrote:
http://techreport.com/articles.x/15931/1

One thing that both of these reviews show is that if you are doing a
significant amount of writing, the X-25M is no better than a normal hard
drive (and much of the time it sits in the middle to bottom of the pack
compared to normal hard drives).

David Lang


The X-25-M may not have sustained write transfer rates that are high compared to normal disks, but for write latency it is FAR superior to a normal disk, and for random writes it will demolish most small and medium-sized RAID arrays by itself.  It will push 30MB to 60MB/sec of random 8k writes, or ~2,000 to 12,000 8k fsyncs/sec.  The X-25-E is definitely a lot better, but the X-25-M can get you pretty far.

For any Postgres installation where you don't expect to write to the WAL at more than 30MB/sec (the vast majority), it is good enough to use (mirrored) as a WAL device, without a battery backup, with very good performance.  A normal disk cannot do that.
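
For reference, a minimal sketch of relocating the WAL onto such a mirror;
the mount point and data directory are placeholders, and the server has to
be stopped while pg_xlog is moved:

    pg_ctl stop -D /path/to/data
    mv /path/to/data/pg_xlog /mnt/ssd/pg_xlog     # /mnt/ssd = the mirrored SSD pair
    ln -s /mnt/ssd/pg_xlog /path/to/data/pg_xlog
    pg_ctl start -D /path/to/data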

Also, it can be used very well for OS swap and other temp space, to prevent swap storms from severely impacting the system.

For anyone worried about the X-25-M's ability to withstand lots of write cycles: calculate how long it would take you to write 800TB to the drive at a typical rate.  For most use cases that's going to be > 5 years.  For the 160GB version, it will take 2x as much data and time to wear it down.
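
As a rough illustration of that calculation (the ~5MB/sec average write
rate is just an assumed figure, not something from Intel):

    # 800TB of total writes at an average of ~5MB/sec, expressed in years
    echo "800 * 1024 * 1024 / 5 / 3600 / 24 / 365" | bc -l    # ~5.3 years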

Samsung, SanDisk, Toshiba, Micron, and several others are expected to ship low-random-write-latency, next-gen SSDs this year.  A few of these are claiming > 150MB/sec for writes, even for MLC-based drives.

A RAM-based device is intriguing, but an ordinary SSD will be enough to make most Postgres databases CPU-bound, and with those there is no concern about data loss on power failure.  The Intel X25 series does not even use its RAM for write cache! (It uses some SRAM on the controller chip for that, and it's fsync-safe.)  The RAM is working memory for the controller chip to cache the LBA-to-physical flash block mappings and other data needed for wear leveling, contrary to what many reviews may claim.

Re: SSD performance

From
Jeff
Date:
I somehow managed to convince the powers that be to let me get a
couple of X25-Es.
I tossed them in my Mac Pro (8 cores), fired up Ubuntu 8.10, and did
some testing.

Raw numbers are very impressive. I was able to get 3700 random
seek+reads a second. In an R1 config it stayed at 3700, but if I added
another process it went up to 7000, and eventually settled into the
4000s.  If I added in some random writing with fsyncs, it
settled at 2200 (to be specific, I had 3 instances going - 2 read-only
and 1 read-20%-write - to get that).  These numbers were obtained
running a slightly modified version of pgiosim (which is on
pgFoundry) - it randomly seeks to a "block" in a file and reads 8kB
of data, optionally writing the block back out.

Now, moving into reality, I compiled 8.3.latest and gave it a whirl.
Running against a software RAID-1 of the two X25-Es I got the following
pgbench results:
(note config tweaks: work_mem => 4MB, shared_buffers => 1GB; I should
probably have tweaked checkpoint_segments, as it was emitting lots of
notices about that, but I didn't).

(multiple runs, avg tps)

Scalefactor 50, 10 clients: 1700tps

At that point I realized write caching on the drives was ON. So I
turned it off at this point:

Scalefactor 50, 10 clients: 900tps

At scalefactor 50 the dataset fits well within memory, so I scaled it
up.

Scalefactor 1500: 10 clients: 420tps


While some of us have arrays that can smash those numbers, that is
crazy impressive for a plain old mirror pair.   I also did not do much
tweaking of PG itself.
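
For anyone who wants to run something similar, a hedged sketch of the setup
described above; the device names, paths, and transaction counts are
assumptions, not what was actually used:

    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc   # software RAID-1
    mkfs.ext3 /dev/md0 && mount /dev/md0 /mnt/ssd    # put the PG data directory here
    createdb pgbench_test
    pgbench -i -s 50 pgbench_test                    # initialize at scale factor 50
    pgbench -c 10 -t 10000 pgbench_test              # 10 clients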

While I'm in the testing mood, are there some other tests folks would
like me to try out?

--
Jeff Trout <jeff@jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




Re: SSD performance

From
David Rees
Date:
On Tue, Feb 3, 2009 at 9:54 AM, Jeff <threshar@torgo.978.org> wrote:
> Scalefactor 50, 10 clients: 900tps
>
> At scalefactor 50 the dataset fits well within memory, so I scaled it up.
>
> Scalefactor 1500: 10 clients: 420tps
>
> While some of us have arrays that can smash those numbers, that is crazy
> impressive for a plain old mirror pair.   I also did not do much tweaking of
> PG itself.
>
> While I'm in the testing mood, are there some other tests folks would like
> me to try out?

How do the same benchmarks fare on regular rotating discs on the same
system?  Ideally we'd have numbers for 7.2k and 10k disks to give us
some sort of idea of exactly how much faster we're talking here.  Hey,
since you asked, right? ;-)

-Dave

Re: SSD performance

From
Scott Carey
Date:
I don't think write caching on the disks is a risk to data integrity if you are configured correctly.
Furthermore, these drives don't use the RAM for write cache; they only use a bit of SRAM on the controller chip for that (and respect fsync), so write caching should be fine.

Confirm that NCQ is on (a quick check in dmesg); I have seen degraded performance when the wrong SATA driver is in use on some Linux configs, but your results indicate it's probably fine.

How much RAM is in that machine?  

Some suggested tests if you are looking for more things to try :D
-- What effect does the following tuning have:

Turn the I/O scheduler to 'noop' (echo noop > /sys/block/<device>/queue/scheduler).  I'm assuming the current one was cfq; deadline may also be interesting, and anticipatory would have comically horrible results.
Tune upward the readahead value (blockdev --setra <value> /dev/<device>) -- try 16384 (8MB).  This probably won't help that much for a pgbench tune; it's more for large sequential scans in other workload types, and more important for rotating media.
Generally speaking, with SSDs tuning the above values does less than with hard drives.

File system effects would also be interesting.  If you're in need of more tests to try, compare XFS to ext3 (I am assuming the below is ext3).
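
Collected into a runnable form (device names are placeholders; this is just
a sketch of the commands being suggested):

    # assume the SSDs are /dev/sdb and /dev/sdc
    echo noop > /sys/block/sdb/queue/scheduler   # switch the I/O scheduler
    echo noop > /sys/block/sdc/queue/scheduler
    blockdev --setra 16384 /dev/sdb              # readahead = 8MB
    blockdev --setra 16384 /dev/sdc
    cat /sys/block/sdb/queue/scheduler           # verify: [noop] shown in brackets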

On 2/3/09 9:54 AM, "Jeff" <threshar@torgo.978.org> wrote:

I somehow managed to convince the powers that be to let me get a
couple of X25-Es.
I tossed them in my Mac Pro (8 cores), fired up Ubuntu 8.10, and did
some testing.

Raw numbers are very impressive. I was able to get 3700 random
seek+reads a second. In an R1 config it stayed at 3700, but if I added
another process it went up to 7000, and eventually settled into the
4000s.  If I added in some random writing with fsyncs, it
settled at 2200 (to be specific, I had 3 instances going - 2 read-only
and 1 read-20%-write - to get that).  These numbers were obtained
running a slightly modified version of pgiosim (which is on
pgFoundry) - it randomly seeks to a "block" in a file and reads 8kB
of data, optionally writing the block back out.

Now, moving into reality, I compiled 8.3.latest and gave it a whirl.
Running against a software RAID-1 of the two X25-Es I got the following
pgbench results:
(note config tweaks: work_mem => 4MB, shared_buffers => 1GB; I should
probably have tweaked checkpoint_segments, as it was emitting lots of
notices about that, but I didn't).

(multiple runs, avg tps)

Scalefactor 50, 10 clients: 1700tps

At that point I realized write caching on the drives was ON. So I
turned it off at this point:

Scalefactor 50, 10 clients: 900tps

At scalefactor 50 the dataset fits well within memory, so I scaled it
up.

Scalefactor 1500: 10 clients: 420tps


While some of us have arrays that can smash those numbers, that is
crazy impressive for a plain old mirror pair.   I also did not do much
tweaking of PG itself.

While I'm in the testing mood, are there some other tests folks would
like me to try out?

--
Jeff Trout <jeff@jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




Re: SSD performance

From
Scott Marlowe
Date:
On Tue, Feb 3, 2009 at 10:54 AM, Jeff <threshar@torgo.978.org> wrote:

> Now, moving into reality I compiled 8.3.latest and gave it a whirl.  Running
> against a software R1 of the 2 x25-e's  I got the following pgbench results:
> (note config tweaks: work_mem=>4mb, shared_buffers=>1gb, should probably
> have tweaked checkpoint_segs, as it was emitting lots of notices about that,
> but I didn't).

You may find you get better numbers with a lower shared_buffers value,
and definitely try cranking up the number of checkpoint segments to
something in the 50 to 100 range.
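
A minimal sketch of those two changes (the values and data directory path
are placeholders):

    # in postgresql.conf:
    #   shared_buffers      = 512MB   # example of a "lower" value
    #   checkpoint_segments = 64      # in the suggested 50-100 range
    pg_ctl restart -D /path/to/data   # shared_buffers needs a full restart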

> (multiple runs, avg tps)
>
> Scalefactor 50, 10 clients: 1700tps
>
> At that point I realized write caching on the drives was ON. So I turned it
> off at this point:
>
> Scalefactor 50, 10 clients: 900tps
>
> At scalefactor 50 the dataset fits well within memory, so I scaled it up.
>
> Scalefactor 1500: 10 clients: 420tps
>
>
> While some of us have arrays that can smash those numbers, that is crazy
> impressive for a plain old mirror pair.   I also did not do much tweaking of
> PG itself.

On a scale factor of 100, my 12-disk 15k.5 Seagate SAS drives on an
Areca get somewhere in the 2800 to 3200 tps range on sustained tests
for anywhere from 8 to 32 or so concurrent clients.  I get similar
performance falloffs as I increase the test db scaling factor.

But for a pair of disks in a mirror with no caching controller, that's
impressive.  I've already told my boss our next servers will likely
have Intel's SSDs in them.

> While I'm in the testing mood, are there some other tests folks would like
> me to try out?

How about varying the number of clients with a static scale factor?

--
When fascism comes to America, it will be the intolerant selling
fascism as diversity.

Re: SSD performance

From
david@lang.hm
Date:
On Wed, 4 Feb 2009, Jeff wrote:

> On Feb 3, 2009, at 1:43 PM, Scott Carey wrote:
>
>> I don't think write caching on the disks is a risk to data integrity if you
>> are configured correctly.
>> Furthermore, these drives don't use the RAM for write cache, they only use
>> a bit of SRAM on the controller chip for that (and respect fsync), so write
>> caching should be fine.
>>
>> Confirm that NCQ is on (a quick check in dmesg),  I have seen degraded
>> performance when the wrong SATA driver is in use on some linux configs, but
>> your results indicate its probably fine.
>>
>
> As it turns out, there's a bug/problem/something with the controller in the
> macpro vs the ubuntu drives where the controller goes into "works, but not as
> super as it could" mode, so NCQ is effectively disabled, haven't seen a
> workaround yet. Not sure if this problem exists on other distros (used ubuntu
> because I just wanted to try a live).  I read some stuff from Intel on the
> NCQ and in a lot of cases it won't make that much difference because the
> thing can respond so fast.

Actually, what I've heard is that NCQ is a win on the Intel drives because
it avoids having the drive wait while the OS prepares and sends the next
write.

>> Some suggested tests if you are looking for more things to try :D
>> -- What affect does the following tuning have:
>>
>> Turn the I/O scheduler to 'noop'  ( echo noop >
>> /sys/block/<devices>/queue/scheduler)  I'm assuming the current was cfq,
>> deadline may also be interesting, anticipatory would have comically
>> horrible results.
>
> I only tested noop, if you think about it, it is the most logical one as an
> SSD really does not need an elevator at all. There is no rotational latency
> or moving of the arm that the elevator was designed to cope with.

You would think so, but that isn't necessarily the case. Here's a post
where noop lost to CFQ by ~24% when there were multiple processes competing
for the drive (not on Intel drives):

http://www.alphatek.info/2009/02/02/io-scheduler-and-ssd-part-2/

David Lang

Re: SSD performance

From
Jeff
Date:
On Feb 3, 2009, at 1:43 PM, Scott Carey wrote:

> I don’t think write caching on the disks is a risk to data integrity
> if you are configured correctly.
> Furthermore, these drives don’t use the RAM for write cache, they
> only use a bit of SRAM on the controller chip for that (and respect
> fsync), so write caching should be fine.
>
> Confirm that NCQ is on (a quick check in dmesg),  I have seen
> degraded performance when the wrong SATA driver is in use on some
> linux configs, but your results indicate its probably fine.
>

As it turns out, there's a bug/problem/something with the controller
in the Mac Pro vs the Ubuntu drivers where the controller goes into
"works, but not as super as it could" mode, so NCQ is effectively
disabled; I haven't seen a workaround yet. Not sure if this problem
exists on other distros (I used Ubuntu because I just wanted to try a
live CD).  I read some stuff from Intel on NCQ, and in a lot of cases
it won't make that much difference because the thing can respond so
fast.


> How much RAM is in that machine?
>

8GB

> Some suggested tests if you are looking for more things to try :D
> -- What affect does the following tuning have:
>
> Turn the I/O scheduler to ‘noop’  ( echo noop > /sys/block/<devices>/
> queue/scheduler)  I’m assuming the current was cfq, deadline may
> also be interesting, anticipatory would have comically horrible
> results.

I only tested noop; if you think about it, it is the most logical one,
as an SSD really does not need an elevator at all. There is no
rotational latency or arm movement that the elevator was designed
to cope with.

But, here are the results:
scale 50, 100 clients, 10x txns: 1600tps (a noticeable improvement!)
scale 1500, 100 clients, 10x txns: 434tps

I'm going to try to get some results for Raptors too; there was
another post earlier today that got higher (but not ridiculously
higher) tps, but it required 14 15k disks instead of 2.

>
> Tune upward the readahead value ( blockdev —setra <value> /dev/
> <device>)  -- try 16384 (8MB)  This probably won’t help that much
> for a pgbench tune, its more for large sequential scans in other
> workload types, and more important for rotating media.
> Generally speaking with SSD’s, tuning the above values does less
> than with hard drives.
>

Yeah, I don't think RA will help pgbench, and for my workloads it is
rather useless as they tend to be tons of random IO.

I've got some Raptors here too; I'll post numbers Wed or Thu.

--
Jeff Trout <jeff@jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




Re: SSD performance

From
Matthew Wakeling
Date:
On Fri, 30 Jan 2009, Scott Carey wrote:
> For anyone worried about the X 25–M’s ability to withstand lots of write
> cycles ... Calculate how long it would take you to write 800TB to the
> drive at a typical rate.  For most use cases that’s going to be > 5
> years.  For the 160GB version, it will take 2x as much data and time to
> wear it down.   

This article just came out:
http://www.theregister.co.uk/2009/02/20/intel_x25emmental/

and

http://www.pcper.com/article.php?aid=669

It seems that the performance of the X25-M degrades over time, as the
wear-levelling algorithm fragments the device into little bits,
especially under database-like access patterns.

Matthew

--
 I quite understand I'm doing algebra on the blackboard and the usual response
 is to throw objects...  If you're going to freak out... wait until party time
 and invite me along                     -- Computer Science Lecturer