Discussion: SSDs - SandForce or not?
Hi,

I'm wondering which type of SSD would be better for use with PostgreSQL.

Background: At the moment, SSDs fall into two categories: those that use internal compression on the SandForce controller, which gives very fast speeds for compressible data, and those that don't.

In benchmarks, the compressing drives do extremely well at random writes as long as the data is at least semi-compressible. They still do well with incompressible data, just usually not quite as well as the competitors. When it comes to reading data, there's no real difference.

So I just wondered how this might apply to PostgreSQL's workload? I think the on-disk workload is going to consist of a lot of random reads and writes, with what I suspect is data that *does* compress quite well. (At least on my data sets, that is. If I use gzip or lzma on the postgres data directly, it gets MUCH smaller.)

So on the face of it, I think the SandForce-based drives are probably a winner here, so I should look at the Intel 520s for evaluation, and whatever the enterprise equivalent is for production.

I wondered if anyone wiser than I has thought about this yet, though. Are there any downsides to that combination?

cheers,
Toby
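The gzip observation above compresses a whole file at once, which can flatter the result. A minimal sketch of a stricter check follows: it compresses each 8 kB block on its own, on the assumption that a drive controller sees small independent writes rather than a whole relation file. The function name `block_compression_ratio`, the 8192-byte block size (PostgreSQL's default page size) and the use of zlib are illustrative assumptions; the SandForce compression scheme is proprietary, so this is only a rough indication.

```python
import zlib

# Assumption: 8192 bytes is PostgreSQL's default page size; adjust if your
# build uses a different block size.
BLOCK_SIZE = 8192


def block_compression_ratio(path, block_size=BLOCK_SIZE, level=1):
    """Return compressed_bytes / raw_bytes with every block compressed alone.

    A value near 1.0 means the data is effectively incompressible; a value
    well below 1.0 means a compressing controller has something to work with.
    """
    raw = 0
    packed = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            raw += len(block)
            # Each block is compressed independently, so redundancy that
            # spans blocks (which whole-file gzip can exploit) is ignored.
            packed += len(zlib.compress(block, level))
    # An empty file has nothing to compress; report it as ratio 1.0.
    return packed / raw if raw else 1.0
```

Run it against a copy of a few relation files rather than a live data directory, and compare the result with the whole-file gzip ratio to see how much of the saving survives at block granularity.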
On 11/14/2012 01:11 AM, Toby Corkindale wrote:
> I'm wondering which type of SSDs would be better for use with
> PostgreSQL.

A few things:

1. While the controller may or may not have an impact, the presence of an on-board super-capacitor will have more. SSDs should be considered malignant devices that will go out of their way to destroy your data, unless they have one of these.

2. Workload on a compressible system like PG is generally dependent on your data sets. If you have lots of TOAST data, which is already compressed, you get no benefit. If your use case doesn't show a lot of random writes, optimizing for them is of questionable value.

3. SSDs also exist as effectively raw NVRAM, in the form of PCIe cards. These cards come in several varieties, and these days can be mounted in external PCIe chassis in hot-swap bays, much like more conventional drive enclosures. Some of these use a kernel-level driver over a proprietary controller, using neither SandForce nor anything else. They are also close to an order of magnitude faster than a conventional SSD because they discard the SATA/SCSI bus entirely.

4. SSDs do have limited write cycles, and whether it's wear leveling or drive compression that reduces writes on the actual NVRAM chips, if you honestly have a high write load, you're better off with whatever card reports the highest longevity of the relatively scarce write cycles per cell.

5. You're more likely to get performance improvements pursuing SLC (single-level cell) versus cheaper MLC (multi-level cell) flash for writing, because each SLC cell stores a single bit and the controller doesn't have to program several voltage levels per cell.

Basically, there's way more involved here than SandForce vs. others, or even compressible vs. not. SSDs are still a pretty Wild West kind of thing, and you've got a lot more variables to consider than with standard spindles.

-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com
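The write-cycle point above comes down to simple arithmetic, sketched here under loud assumptions: perfect wear leveling, a constant daily write volume, and a single write-amplification factor that folds in both controller overhead and any on-drive compression (a factor below 1.0 would mean compression is shrinking what reaches the flash). The function name and every number in the usage note are hypothetical, not figures from any data sheet.

```python
def estimated_lifetime_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                             write_amplification=1.0):
    """Rough years until the rated program/erase cycles are used up.

    Assumes perfect wear leveling, so the total write budget is simply
    capacity multiplied by the rated cycles per cell.
    """
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write rate and write amplification must be positive")
    total_flash_writes_gb = capacity_gb * pe_cycles
    # Flash sees the host writes scaled by the write-amplification factor.
    flash_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_flash_writes_gb / flash_writes_gb_per_day / 365.0
```

For example, calling `estimated_lifetime_years(200, 3000, 500, write_amplification=2.0)` and then again with `write_amplification=0.5` shows how strongly the answer depends on that one factor, which is why the longevity a vendor actually reports is worth more than a guess at the cell type.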
On 15/11/12 01:42, Shaun Thomas wrote:
> On 11/14/2012 01:11 AM, Toby Corkindale wrote:
>
>> I'm wondering which type of SSDs would be better for use with
>> PostgreSQL.

Hi Shaun, thanks for your info.

I should probably have made it clear that I was curious to know how the compression stuff affected the situation, aside from the other variables. I'm aware of the other issues you've mentioned, but I'm sure it's helpful for other people reading this list to see them.

You make a good point about the TOAST tables; I hadn't thought of that. (My data is mostly numeric here, though.)

thanks,
Toby
On 11/14/12 2:11 AM, Toby Corkindale wrote:
> So on the face of it, I think the Sandforce-based drives are probably a
> winner here, so I should look at the Intel 520s for evaluation, and
> whatever the enterprise equivalent are for production.

As far as I know, the 520 series drives fail the requirements outlined at http://wiki.postgresql.org/wiki/Reliable_Writes and you can expect occasional data corruption after a crash when using them. As such, any performance results you get back are fake. You can't trust that the same results will come back from drives that do handle writes correctly. I'm not aware of any SSD with one of these compressing SandForce controllers on the market right now that does this correctly; they're all broken for database use. The quick rule of thumb is that if the manufacturer doesn't brag about the capacitors on the drive, it doesn't have any and isn't reliable for PostgreSQL.

The safe Intel SSD models state very clearly in the specifications how they write data in case of a crash. The data sheet for the 320 series drives, for example, says "To reduce potential data loss, the Intel® SSD 320 Series also detects and protects from unexpected system power loss by saving all cached data in the process of being written before shutting down". The other models I've deployed and know are safe are the 710 series, which are the same basic drive but with different quality flash and tuning for longevity. See http://blog.2ndquadrant.com/intel_ssds_lifetime_and_the_32/ for details. The 710 series drives are quite a bit more expensive than Intel's other models.

Intel's recently released DC S3700 drives also look to have the right battery backup system to be reliable for PostgreSQL. Those are expected to be significantly cheaper than the 710 models, while having the same reliability characteristics. I haven't been able to get one yet, though, so I don't really know for sure how well they perform.

-- 
Greg Smith   2ndQuadrant US   greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
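One quick sanity check related to the point about untrustworthy benchmark results is to measure how many fsync calls per second a volume completes. The sketch below is a hypothetical helper, not a substitute for the tests described on the Reliable Writes page: a timing test can only raise suspicion (a rate far higher than the hardware could plausibly sustain hints that writes are being acknowledged from a volatile cache), and only a test that actually cuts power can show whether data survives. PostgreSQL also ships `pg_test_fsync`, which performs a more thorough version of this kind of measurement.

```python
import os
import time


def fsyncs_per_second(path, count=200, payload=b"x" * 8192):
    """Write and fsync `count` small blocks; return completed fsyncs per second.

    `path` should live on the volume under test. The file is created or
    truncated, so do not point this at anything you care about.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            # Ask the OS to flush this file's data to the device. Whether
            # the device then makes it durable is exactly what a timing
            # test cannot prove.
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")
```

Comparing the figure across candidate drives gives a rough sense of which ones are really waiting for the flash and which are answering from cache.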
On Mon, Dec 10, 2012 at 02:05:09AM -0500, Greg Smith wrote:
> [...]
> The quick rule of thumb is that if the manufacturer doesn't brag about the
> capacitors on the drive, it doesn't have any and isn't reliable for
> PostgreSQL.
> [...]

It looks like the newer Intel 330 SSD also lacks a capacitor.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
On Tue, Dec 11, 2012 at 10:48 AM, Bruce Momjian <bruce@momjian.us> wrote:
> It looks like the newer Intel 330 SSD also lacks a capacitor.

I believe that's the case. The only choices today are the 320, 710, and the upcoming DC S3700 (which looks explicitly designed for database use).

merlin