Re: Slow queries on big table
From        | Scott Marlowe
Subject     | Re: Slow queries on big table
Date        |
Msg-id      | 464E0036.3030804@g2switchworks.com
In reply to | Slow queries on big table  ("Tyrrill, Ed" <tyrrill_ed@emc.com>)
Responses   | Re: Slow queries on big table  (Tom Lane <tgl@sss.pgh.pa.us>)
List        | pgsql-performance
Tyrrill, Ed wrote:
> I have a two column table with over 160 million rows in it.  As the size
> of the table grows queries on this table get exponentially slower.  I am
> using version 8.1.5 32-bit on Red Hat Enterprise Linux 3.  The hardware
> is an Intel 3 Ghz Xeon with 4GB RAM, and 6 disks in a RAID 5
> configuration.  For current testing I am running a single database
> connection with no other applications running on the machine, and the
> swap is not being used at all.
>
> Here is the table definition:
>
> mdsdb=# \d backup_location
>     Table "public.backup_location"
>   Column   |  Type   | Modifiers
> -----------+---------+-----------
>  record_id | bigint  | not null
>  backup_id | integer | not null
> Indexes:
>     "backup_location_pkey" PRIMARY KEY, btree (record_id, backup_id)
>     "backup_location_rid" btree (record_id)
> Foreign-key constraints:
>     "backup_location_bfk" FOREIGN KEY (backup_id) REFERENCES
>       backups(backup_id) ON DELETE CASCADE
>
> Here is the table size:
>
> mdsdb=# select count(*) from backup_location;
>    count
> -----------
>  162101296
> (1 row)
>
> And here is a simple query on this table that takes nearly 20 minutes to
> return less then 3000 rows.  I ran an analyze immediately before I ran
> this query:
>
> mdsdb=# explain analyze select record_id from backup_location where
>   backup_id = 1070;
>
>                              QUERY PLAN
> ------------------------------------------------------------------------
>  Index Scan using backup_location_pkey on backup_location
>    (cost=0.00..1475268.53 rows=412394 width=8) (actual
>    time=3318.057..1196723.915 rows=2752 loops=1)
>    Index Cond: (backup_id = 1070)
>  Total runtime: 1196725.617 ms
> (3 rows)

I've got a few points.  Firstly, is your data amenable to partitioning?
If so, that might be a big winner.  Secondly, it might be more efficient
for the planner to choose the backup_location_rid index than the
combination primary key index.
You can test this theory with this cool pg trick:

    begin;
    alter table backup_location drop constraint backup_location_pkey;
    explain analyze select ....
    rollback;

to see if it's faster.

> Obviously at this point the application is not usable.  If possible we
> would like to grow this table to the 3-5 billion row range, but I don't
> know if that is realistic.
>
> Any guidance would be greatly appreciated.

Without knowing more about your usage patterns, it's hard to say.  But
partitioning seems like your best choice at the moment.
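For what it's worth, on 8.1 "partitioning" means table inheritance plus
constraint exclusion; there is no declarative partition syntax.  A minimal
sketch, assuming backup_id makes a sensible range key (the child-table
names and range boundaries below are illustrative, not from the thread):

```sql
-- Child tables inherit the parent's columns; each carries a CHECK
-- constraint describing the backup_id range it holds.  Boundaries here
-- are assumptions for illustration.
CREATE TABLE backup_location_1000 (
    CHECK (backup_id >= 1000 AND backup_id < 2000)
) INHERITS (backup_location);

-- Index each child on the partition key so scans within a child are fast.
CREATE INDEX backup_location_1000_bid
    ON backup_location_1000 (backup_id);

-- With constraint exclusion enabled (a GUC added in 8.1), the planner
-- can skip any child whose CHECK constraint rules out the predicate:
SET constraint_exclusion = on;

SELECT record_id FROM backup_location WHERE backup_id = 1070;
```

Rows would need to be routed into the correct child at insert time (e.g.
by the application or a rule), and a query like the one above should then
touch only the one child whose range admits backup_id = 1070 instead of
the full 160M-row table.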