Discussion: Index performance
Hi,

I have a table with a primary key made of two columns. One of these has about 150 distinct values, which are unique IDs; the other has over 3 million almost-unique data values. Rows are added to this table in real time, at least 10 per second.

If I do a select which uses the pkey index, with an equality condition on the ID column and a greater-than condition on the other, which should return about 1500 rows, it sometimes takes half a minute to return, and other times takes only seconds.

Is it the number of rows being added in real time that is maybe causing the index to be locked?

I tried creating a partial index with the equality condition on the ID column, and that sped it up considerably and consistently, but this is a headache to maintain.

Any ideas?

Thanks
Brian

--
Brian Modra
Land line: +27 23 5411 462
Mobile: +27 79 183 8059
6 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa
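To make the setup concrete, here is a minimal sketch of the kind of schema, query, and partial-index workaround being described. The table and column names are invented for illustration; the original post does not give them.

```sql
-- Hypothetical schema matching the description above.
CREATE TABLE readings (
    device_id  integer     NOT NULL,  -- ~150 distinct ID values
    reading_ts timestamptz NOT NULL,  -- ~3 million almost-unique values
    value      double precision,
    PRIMARY KEY (device_id, reading_ts)
);

-- The query that is sometimes slow, sometimes fast:
-- equality on the first pkey column, range scan on the second.
SELECT *
FROM readings
WHERE device_id = 42
  AND reading_ts > now() - interval '1 hour';

-- The partial-index workaround: one index per ID value,
-- which is why it is a headache to maintain.
CREATE INDEX readings_dev42_ts ON readings (reading_ts)
WHERE device_id = 42;
```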
On Wed, Jan 02, 2008 at 05:53:35PM +0200, Brian Modra wrote:
> This table is added to in real time, at least 10 rows per second.

[. . .]

> If I do a select which uses the pkey index, where equal to the ID
> column, and greater than one of the values, which should return about
> 1500 rows, it sometimes takes 1/2 minute to return, and other times
> takes only seconds.
>
> Is it the number of rows being added in real time, that is maybe
> causing the index to be locked?

No, it's probably a bad plan. A minimum of 10 rows/second is probably just making the statistics for the table look bad. You likely want to SET STATISTICS wider on the 1st (~150 distinct values) column, and then run ANALYSE on the table very frequently. Are you updating or deleting at all? If so, that will also affect things: you need to perform very frequent VACUUM on that table in that case.

Aside from that generic advice, it's impossible to say more without EXPLAIN ANALYSE output for the slow and fast examples.

A
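The advice above can be sketched as the following statements. The table and column names are illustrative (carried over from no specific schema in the thread); 500 is just an example statistics target, not a recommendation from the post.

```sql
-- Widen the statistics target on the ~150-distinct-value column so the
-- planner's sample captures its value distribution more accurately.
ALTER TABLE readings ALTER COLUMN device_id SET STATISTICS 500;

-- Refresh the statistics; with heavy insert traffic this needs to run often.
ANALYZE readings;

-- What the reply asks for: actual vs. estimated rows and timings,
-- captured once for a slow run and once for a fast run.
EXPLAIN ANALYZE
SELECT *
FROM readings
WHERE device_id = 42
  AND reading_ts > now() - interval '1 hour';
```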
Thanks, I think you have me on the right track. I'm testing a vacuum analyse now to see how long it takes, and then I'll set it up to run automatically every night (so that it has a chance to complete before about 6am).

On 02/01/2008, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> On Wed, Jan 02, 2008 at 05:53:35PM +0200, Brian Modra wrote:
> > This table is added to in real time, at least 10 rows per second.
>
> [. . .]
>
> > If I do a select which uses the pkey index, where equal to the ID
> > column, and greater than one of the values, which should return about
> > 1500 rows, it sometimes takes 1/2 minute to return, and other times
> > takes only seconds.
> >
> > Is it the number of rows being added in real time, that is maybe
> > causing the index to be locked?
>
> No, it's probably a bad plan. A minimum 10 rows/second is probably just
> making the statistics for the table look bad. You likely want to SET
> STATISTICS wider on the 1st (~150 distinct values) column, and then run
> ANALYSE on the table very frequently. Are you updating or deleting at all?
> If so, that will also affect things: you need to perform very frequent
> VACUUM on that table in that case.
>
> Aside from that generic advice, it's impossible to say more without EXPLAIN
> ANALYSE output for the slow and fast examples.
>
> A

--
Brian Modra
Land line: +27 23 5411 462
Mobile: +27 79 183 8059
6 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa
On Thu, Jan 03, 2008 at 07:11:07AM +0200, Brian Modra wrote:
> Thanks, I think you have me on the right track. I'm testing a vacuum
> analyse now to see how long it takes, and then I'll set it up to
> automatically run every night (so that it has a chance to complete
> before about 6am.)

Note that "VACUUM ANALYSE" and "ANALYSE" are not identical: the former also performs a vacuum. On a table that is not updated that often but that is expanding rapidly, you may not need that extra I/O. ANALYSE on its own performs just the statistical sampling. If you're not creating dead tuples with UPDATE, DELETE, or ROLLBACK, that might be enough most of the time.

A
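The distinction above can be checked rather than guessed: PostgreSQL's statistics views report accumulated dead tuples per table (the `n_dead_tup` counter in `pg_stat_user_tables` is available in modern releases). The table name below is illustrative.

```sql
-- Does this table actually accumulate dead tuples?
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'readings';

-- Insert-only workload: a statistics refresh alone is enough.
ANALYZE readings;

-- Workload with UPDATE/DELETE/ROLLBACK: reclaim dead tuples too.
VACUUM ANALYZE readings;
```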