Обсуждение: partioning tips?
Hello, I'm about to embark on a partitioning project to improve read performance on some of our tables: db=# select relname,n_live_tup,pg_size_pretty(pg_relation_size(relid)) from pg_stat_all_tables where schemaname = 'public'order by n_live_tup desc limit 10; relname | n_live_tup | pg_size_pretty -------------------------------------+------------+---------------- objects | 125255895 | 11 GB papers | 124213085 | 14 GB stats | 124202261 | 9106 MB exclusions | 53090902 | 3050 MB marks | 42467477 | 4829 MB student_class | 31491181 | 1814 MB users | 19906017 | 3722 MB view_stats | 12031074 | 599 MB highlights | 10884380 | 629 MB Problem is, I have foreign keys that link almost all of our tables together (as a business requirement/IT policy). However,I know (er, I have a gut feeling) that many people out there have successfully deployed table partitioning, so I'mhoping to solicit some advice with respect to this. I've looked at documentation, tried creating a prototype, etc...lookslike foreign keys have to go. But do they? What have other people out there done to get their tables partitioned? Any input would be much appreciated. Thanks! --Richard
"Because it's policy" is rarely a good design decision :-) Lose the FK constraints, and make up for them with integrity checking queries.
I just did a major refactor and shard on our PG schema and the performance improvement was dramatic ... a big plus for PG, if it is e.g. time-series data is to shard by time and make the tables write-once. The same applies to any record id that doesn't get re-used. PG doesn't do in-place record updates, so tables with lots of row changes can get order-fragmented.
If not, also check out the "cluster table on index" command.
Cheers
Dave
I just did a major refactor and shard on our PG schema and the performance improvement was dramatic ... a big plus for PG, if it is e.g. time-series data is to shard by time and make the tables write-once. The same applies to any record id that doesn't get re-used. PG doesn't do in-place record updates, so tables with lots of row changes can get order-fragmented.
If not, also check out the "cluster table on index" command.
Cheers
Dave
On Wed, May 5, 2010 at 3:25 PM, Richard Yen <dba@richyen.com> wrote:
Hello,
I'm about to embark on a partitioning project to improve read performance on some of our tables:
db=# select relname,n_live_tup,pg_size_pretty(pg_relation_size(relid)) from pg_stat_all_tables where schemaname = 'public' order by n_live_tup desc limit 10;
relname | n_live_tup | pg_size_pretty
-------------------------------------+------------+----------------
objects | 125255895 | 11 GB
papers | 124213085 | 14 GB
stats | 124202261 | 9106 MB
exclusions | 53090902 | 3050 MB
marks | 42467477 | 4829 MB
student_class | 31491181 | 1814 MB
users | 19906017 | 3722 MB
view_stats | 12031074 | 599 MB
highlights | 10884380 | 629 MB
Problem is, I have foreign keys that link almost all of our tables together (as a business requirement/IT policy). However, I know (er, I have a gut feeling) that many people out there have successfully deployed table partitioning, so I'm hoping to solicit some advice with respect to this. I've looked at documentation, tried creating a prototype, etc...looks like foreign keys have to go. But do they? What have other people out there done to get their tables partitioned?
Any input would be much appreciated.
Thanks!
--Richard
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
On 05/05/2010 01:25 PM, Richard Yen wrote: > Problem is, I have foreign keys that link almost all of our tables together (as a business requirement/IT policy). However,I know (er, I have a gut feeling) that many people out there have successfully deployed table partitioning, so I'mhoping to solicit some advice with respect to this. I've looked at documentation, tried creating a prototype, etc...lookslike foreign keys have to go. But do they? What have other people out there done to get their tables partitioned? Well, it's possible to work around the limitation on FKs, but probably not worth it. In general, the reasons you want to partition (being able to cheaply drop segments, no scans against the whole table, ever) are reasons why you wouldn't want an FK to a partition table in any case. The specific cases where it works to have FKs anyway are: 1) if you're making FKs between two partitioned tables whose partition ranges match exactly. In this case, you can just FK the individual partitions (there is a TODO, and some draft code from Aster, to make this happen automatically). 2) If the partitioned table has very wide rows, and it's large for that reason rather than because of having many rows. In this case, you can create an FK join table containing only the SKs for creating FKs to, just like a many-to-many join table. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Now that it's time to buy a new computer, Dell has changed their RAID models from the Perc6 to Perc H200 and such. Doesanyone know what's inside these? I would hope they've stuck with the Megaraid controller... Also, I can't find any info on Dell's site about how these devices can be configured. I was thinking of ten disks, as OS: RAID1 WAL: RAID1 Database: RAID10 using 6 disks But it's not clear to me if these RAID controllers can handle multible arrays, or if you need a separate controller for eacharray. We're a small shop and I only get to do this every year or so, and everything changes in between purchases! Thanks, Craig
Craig James wrote: > Now that it's time to buy a new computer, Dell has changed their RAID > models from the Perc6 to Perc H200 and such. Does anyone know what's > inside these? I would hope they've stuck with the Megaraid controller... The H700 and H800 are both based on the LSI 2180 chipset: http://support.dell.com/support/edocs/storage/Storlink/H700H800/en/UG/HTML/chapterb.htm I'm not sure what's in the H200, but since it does not have a write cache you don't want one of those anyway. Note that early versions of these cards shipped such that you could not use non-Dell drives with them. Customer feedback was so overwhelmingly negative that last month they announced that the next firmware update will remove that restriction: http://en.community.dell.com/support-forums/servers/f/906/p/19324790/19689719.aspx#19689719 If I were you, I'd tell Dell that you refuse to make your purchase until that firmware release is actually available, such that your system ships without that restriction. That's the right thing to do for the protection of your company, and it sends the right message to their sales team too: this sort of nonsense only reduces their sales. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
On Sun, May 9, 2010 at 3:20 PM, Greg Smith <greg@2ndquadrant.com> wrote: > Craig James wrote: >> >> Now that it's time to buy a new computer, Dell has changed their RAID >> models from the Perc6 to Perc H200 and such. Does anyone know what's inside >> these? I would hope they've stuck with the Megaraid controller... > > The H700 and H800 are both based on the LSI 2180 chipset: > http://support.dell.com/support/edocs/storage/Storlink/H700H800/en/UG/HTML/chapterb.htm > > I'm not sure what's in the H200, but since it does not have a write cache > you don't want one of those anyway. > > Note that early versions of these cards shipped such that you could not use > non-Dell drives with them. Customer feedback was so overwhelmingly negative > that last month they announced that the next firmware update will remove > that restriction: > http://en.community.dell.com/support-forums/servers/f/906/p/19324790/19689719.aspx#19689719 > > If I were you, I'd tell Dell that you refuse to make your purchase until > that firmware release is actually available, such that your system ships > without that restriction. That's the right thing to do for the protection > of your company, and it sends the right message to their sales team too: > this sort of nonsense only reduces their sales. Well, it's the attitude that really matters, and Dell has shown how little they think of the people who buy their machines with this move. I gave up on them when they screwed me royally over 8 quad core cpus that they couldn't even tell me which of my 1950's could take them, and they have all the config codes for them. At least if I buy something with generic mobos in it I can go look it up myself. Also, going back to the PE16xx series and the adaptec based Perc3 (DI? not sure which one was) had lockup problems in windows AND linux. Dell never would take responsibility and ship us different RAID controllers for some 300 machines we bought. We wound up buying a handful of LSI based Perc 3 (DC? still not sure of the name) and just pulling the RAID controller on all the rest to get reliable machines. Never. Again.