Обсуждение: When/if to Reindex
On Wed, Jul 18, 2007 at 01:08:30PM -0400, Steven Flatt wrote: > We're using Postgres 8.2.4. > > I'm trying to decide whether it's worthwhile to implement a process that > does periodic reindexing. In a few ad hoc tests, where I've tried to set up > data similar to how our application does it, I've noticed decent performance > increases after doing a reindex as well as the planner being more likely to > choose an index scan. > > Some background: we make extensive use of partitioned tables. In fact, I'm > really only considering reindexing partitions that have "just closed". In > our simplest/most general case, we have a table partitioned by a timestamp > column, each partition 24 hours wide. The partition will have an index on > the timestamp column as well as a few other indexes including a primary key > index (all b-tree). Is there a programmatic way I can decide, upon the > "closing" of a partition, which, if any, of these indexes will benefit from > a reindex? Can I determine things like average node density, node depth, or > any other indication as to the quality of an index? Will pg_class.relpages > be any help here? Looking at that stuff will help determine if the index is bloated, or if it's just bigger than optimal. Once you're done writing to an index, it might be worth reindexing with a fillfactor of 100% to shrink things down a bit. > Is it a simple matter of running some queries, reindexing the table, then > running the queries again to determine overall performance change? If so, > what queries would exercise this best? > > Just trying to determine if the extra cost of reindexing newly closed > partitions will be worth the performance benefit of querying the data. > Reindexing a table with a day's worth of data is taking on the order of a > few hours (10s of millions of rows). > > The docs say that: > > "...for B-tree indexes a freshly-constructed index is somewhat faster to > access than one that has been updated many times, because logically adjacent > pages are usually also physically adjacent in a newly built index... It > might be worthwhile to reindex periodically just to improve access speed." That's the other consideration, though if you're seeing a big difference I suspect it's an issue of indexes fitting in cache or not. -- Jim Nasby decibel@decibel.org EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Вложения
On Jul 18, 2007, at 1:08 PM, Steven Flatt wrote: > Some background: we make extensive use of partitioned tables. In > fact, I'm > really only considering reindexing partitions that have "just > closed". In > our simplest/most general case, we have a table partitioned by a > timestamp > column, each partition 24 hours wide. The partition will have an > index on > the timestamp column as well as a few other indexes including a > primary key If all you ever did was insert into that table, then you probably don't need to reindex. If you did mass updates/deletes mixed with your inserts, then perhaps you do. Do some experiments comparing pg_class.relpages for your table and its indexes before and after a reindex. Decide if the number of pages you save on the index is worth the trouble. If it shaves off just a handful of pages, I'd vote no...
If all you ever did was insert into that table, then you probably
don't need to reindex. If you did mass updates/deletes mixed with
your inserts, then perhaps you do.
Do some experiments comparing pg_class.relpages for your table and
its indexes before and after a reindex. Decide if the number of
pages you save on the index is worth the trouble. If it shaves off
just a handful of pages, I'd vote no...
In response to "Steven Flatt" <steven.flatt@gmail.com>: > On 8/8/07, Vivek Khera <vivek@khera.org> wrote: > > > > If all you ever did was insert into that table, then you probably > > don't need to reindex. If you did mass updates/deletes mixed with > > your inserts, then perhaps you do. > > > > Do some experiments comparing pg_class.relpages for your table and > > its indexes before and after a reindex. Decide if the number of > > pages you save on the index is worth the trouble. If it shaves off > > just a handful of pages, I'd vote no... > > > What's interesting is that an insert-only table can benefit significantly > from reindexing after the table is fully loaded. I had done experiments > exactly as you suggest (looking at pg_class.relpages), and determined that > reindexing results in about a 30% space savings for all indexes except the > PK index. The PK index (integer based on a sequence) does not benefit at > all. By setting fillfactor=100 on the index prior to reindexing, I get > another 10% space savings on all the indexes. > > Not to mention the general performance improvements when reading from the > table... > > So, we decided that reindexing partitions after they're fully loaded *was* > worth it. I've had similar experience. One thing you didn't mention that I've noticed is that VACUUM FULL often bloats indexes. I've made it SOP that after application upgrades (which usually includes lots of ALTER TABLES and other massive schema and data changes) I VACUUM FULL and REINDEX (in that order). Lots of ALTER TABLEs seem to bloat the database size considerably, beyond what normal VACUUM seems to fix. A FULL seems to fix that, but it appears to bloat the indexes, thus a REINDEX helps. I would expect that setting fillfactor to 100 will encourage indexs to bloat faster, and would only be recommended if you didn't expect the index contents to change? -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023
On Wed, Aug 08, 2007 at 03:27:57PM -0400, Bill Moran wrote: > I've had similar experience. One thing you didn't mention that I've noticed > is that VACUUM FULL often bloats indexes. I've made it SOP that > after application upgrades (which usually includes lots of ALTER TABLES and > other massive schema and data changes) I VACUUM FULL and REINDEX (in that > order). You'd be better off with a CLUSTER in that case. It'll be faster, and you'll ensure that the table has optimal ordering. > Lots of ALTER TABLEs seem to bloat the database size considerably, beyond > what normal VACUUM seems to fix. A FULL seems to fix that, but it appears > to bloat the indexes, thus a REINDEX helps. Hrm, are you sure that's still true? I just did an ALTER TABLE ... TYPE and it created a new file, meaning no bloating. > I would expect that setting fillfactor to 100 will encourage indexs to bloat > faster, and would only be recommended if you didn't expect the index contents > to change? Yes. -- Decibel!, aka Jim Nasby decibel@decibel.org EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Вложения
Bill Moran <wmoran@collaborativefusion.com> writes: > In response to "Steven Flatt" <steven.flatt@gmail.com>: >> What's interesting is that an insert-only table can benefit significantly >> from reindexing after the table is fully loaded. > I've had similar experience. One thing you didn't mention that I've noticed > is that VACUUM FULL often bloats indexes. I've made it SOP that > after application upgrades (which usually includes lots of ALTER TABLES and > other massive schema and data changes) I VACUUM FULL and REINDEX (in that > order). Actually, if that is your intent then the best plan is: drop indexes, VACUUM FULL, create indexes from scratch. A huge proportion of VACUUM FULL's time goes into updating the indexes, and that work is basically wasted if you are going to reindex afterwards. CLUSTER is a good substitute for V.F. partly because it doesn't try to update the indexes incrementally, but just does the equivalent of REINDEX after it's reordered the heap. I'd make the same remark about Steven's case: if possible, don't create the indexes at all until you've loaded the table fully. regards, tom lane
In response to "Decibel!" <decibel@decibel.org>: > On Wed, Aug 08, 2007 at 03:27:57PM -0400, Bill Moran wrote: > > I've had similar experience. One thing you didn't mention that I've noticed > > is that VACUUM FULL often bloats indexes. I've made it SOP that > > after application upgrades (which usually includes lots of ALTER TABLES and > > other massive schema and data changes) I VACUUM FULL and REINDEX (in that > > order). > > You'd be better off with a CLUSTER in that case. It'll be faster, and > you'll ensure that the table has optimal ordering. Point taken. > > Lots of ALTER TABLEs seem to bloat the database size considerably, beyond > > what normal VACUUM seems to fix. A FULL seems to fix that, but it appears > > to bloat the indexes, thus a REINDEX helps. > > Hrm, are you sure that's still true? I just did an ALTER TABLE ... TYPE > and it created a new file, meaning no bloating. No, I'm not. This isn't something I've analyzed or investigated in detail. During upgrades, a lot happens: ATLER TABLES, tables are dropped, new tables are created, massive amounts of data may be altered in a short period, stored procedures are replaced, etc, etc. I don't remember what led me to believe that the ALTER TABLES were causing the worst of the problem, but it's entirely possible that I was off-base. (I seem to remember being concerned about too many DROP COLUMN and ADD COLUMNs) In any event, my original statement (that it's a good idea to REINDEX after VACUUM FULL) still seems to be correct. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023
I'd make the same remark about Steven's case: if possible, don't create
the indexes at all until you've loaded the table fully.
It makes more sense for us to have ~1 hour's worth of reindexing afterwards during which read performance on that partition is "compromised".
However I'm seeing that all readers of that table are blocked until the reindex finishes, even reads that do not attempt to use the index. Is this a problem with the docs or a bug?
"Steven Flatt" <steven.flatt@gmail.com> writes: > However I'm seeing that all readers of that table are blocked until the > reindex finishes, even reads that do not attempt to use the index. Is this > a problem with the docs or a bug? You'll have to describe in more detail what you're doing so we can see what's causing it to not work for you because "works for me": postgres=# create table test (i integer); CREATE TABLE postgres=# insert into test select generate_series(1,1000); INSERT 0 1000 postgres=# create or replace function slow(integer) returns integer as 'begin perform pg_sleep(0); return $1; end' languageplpgsql immutable strict; CREATE FUNCTION postgres=# create index slowi on test (slow(i)); CREATE INDEX postgres=# create or replace function slow(integer) returns integer as 'begin perform pg_sleep(1); return $1; end' languageplpgsql immutable strict; CREATE FUNCTION postgres=# reindex index slowi; While that's running I ran: postgres=# select count(*) from test; count ------- 1000 (1 row) > I'm considering creating a new index with the same definition as the first > (different name), so while that index is being created, read access to the > table, and the original index, is not blocked. When the new index is > created, drop the original index and rename the new index to the original, > and we've essentially accomplished the same thing. In fact, why isn't > reindex doing this sort of thing in the background anways? It is but one level lower down. But the locks which block people from using the index must be at this level. Consider for example that one of the operations someone might be doing is creating a foreign key which depends on this index. If we created a new index and then tried to drop this one the drop would fail because of the foreign key which needs it. It's possible these problems could all be worked out but it would still take quite a bit of work to do so. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
postgres=# create table test (i integer);
CREATE TABLE
postgres=# insert into test select generate_series(1,1000);
INSERT 0 1000
postgres=# create or replace function slow(integer) returns integer as 'begin perform pg_sleep(0); return $1; end' language plpgsql immutable strict;
CREATE FUNCTION
postgres=# create index slowi on test (slow(i));
CREATE INDEX
postgres=# create or replace function slow(integer) returns integer as 'begin perform pg_sleep(1); return $1; end' language plpgsql immutable strict;
CREATE FUNCTION
postgres=# reindex index slowi;
While that's running I ran:
postgres=# select count(*) from test;
count
-------
1000
(1 row)
version
--------------------------------------------------------------------------------
----------------
PostgreSQL 8.2.4 on i386-portbld-freebsd6.1, compiled by GCC cc (GCC) 3.4.4 [FreeBSD] 20050518
(1 row)
# select locktype,relation,mode,granted from pg_locks where not granted;
locktype | relation | mode | granted
----------+----------+-----------------+---------
relation | 69293 | AccessShareLock | f
(1 row)
# select relname from pg_class where oid = 69293;
relname
---------
slowi
(1 row)
# select locktype,relation,mode,granted from pg_locks where relation = 69293;
locktype | relation | mode | granted
----------+----------+---------------------+---------
relation | 69293 | AccessShareLock | f
relation | 69293 | AccessExclusiveLock | t
(2 rows)
"Steven Flatt" <steven.flatt@gmail.com> writes: > Interestingly enough, the example you've given does not work for me either. > The select count(*) from test blocks until the reindex completes. Are we > using the same pg version? Seems like a fair question, because Greg's example blocks for me too, in plancat.c where the planner is trying to acquire information on each index. This seems to be an unwanted side effect of this 8.2-era patch http://archives.postgresql.org/pgsql-committers/2006-07/msg00356.php specifically, note here http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/util/plancat.c.diff?r1=1.121;r2=1.122;f=h how the new planner coding takes at least AccessShareLock on each index, where the old coding took no lock at all. I think that the new coding rule of "you *must* take some lock when opening the relation" is essential for tables, but it might not be necessary for indexes if you've got a lock on the parent table. We don't allow any schema changes on an index to be made without holding exclusive lock on the parent, so plancat.c's basic purpose of finding out the properties of the index could be done safely without any index lock. The fly in the ointment is that after collecting the pg_index definition of the index, plancat.c also wants to know how big it is --- it calls RelationGetNumberOfBlocks. And that absolutely does look at the physical storage, which means it absolutely is unsafe to do in parallel with a REINDEX that will be dropping the old physical storage at some point. So maybe we are stuck and we have to say "that doesn't work anymore". But it feels like we might not be too far away from letting it still work. Thoughts, ideas? regards, tom lane
Tom Lane wrote: > > The fly in the ointment is that after collecting the pg_index definition > of the index, plancat.c also wants to know how big it is --- it calls > RelationGetNumberOfBlocks. And that absolutely does look at the > physical storage, which means it absolutely is unsafe to do in parallel > with a REINDEX that will be dropping the old physical storage at some > point. > > So maybe we are stuck and we have to say "that doesn't work anymore". > But it feels like we might not be too far away from letting it still > work. Thoughts, ideas? > A suggestion that seems a bit like a leap backwards in time - maybe just use the pg_class.relpages entry for the index size? I'm punting that with autovacuum being enabled by default now, the relpages entries for all relations will be more representative than they used to in previous releases. Cheers Mark
"Steven Flatt" <steven.flatt@gmail.com> writes: > On 8/22/07, Gregory Stark <stark@enterprisedb.com> wrote: > > Interestingly enough, the example you've given does not work for me either. > The select count(*) from test blocks until the reindex completes. Are we > using the same pg version? I was using CVS head but given Tom's explanation I wouldn't expect to see any different behaviour here. I just retried it and it did block. I can't think of anything I could have done wrong last time to make it appear not to block. If I had missed an error at some point along the way I would have expected the reindex to complete quickly or fail or something but it was definitely just blocked. I remember noting (much) later that it had finished. Strange. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Tom Lane wrote:
>
> The fly in the ointment is that after collecting the pg_index definition
> of the index, plancat.c also wants to know how big it is --- it calls
> RelationGetNumberOfBlocks. And that absolutely does look at the
> physical storage, which means it absolutely is unsafe to do in parallel
> with a REINDEX that will be dropping the old physical storage at some
> point.
A suggestion that seems a bit like a leap backwards in time - maybe just
use the pg_class.relpages entry for the index size?
"Steven Flatt" <steven.flatt@gmail.com> writes: >> The fly in the ointment is that after collecting the pg_index definition >> of the index, plancat.c also wants to know how big it is --- it calls >> RelationGetNumberOfBlocks. > Why do we even need to consider calling RelationGetNumberOfBlocks or looking > at the pg_class.relpages entry? My understanding of the expected behaviour > is that while a reindex is happening, all queries run against the parent > table are planned as though the index isn't there (i.e. it's unusable). Where in the world did you get that idea? If we had a REINDEX CONCURRENTLY it might work that way. A normal REINDEX cannot "mark" anything because it runs within a single transaction; there is no way that it can emit any catalog changes that will be visible before it's over. regards, tom lane
"Steven Flatt" <steven.flatt@gmail.com> writes:
> Why do we even need to consider calling RelationGetNumberOfBlocks or looking
> at the pg_class.relpages entry? My understanding of the expected behaviour
> is that while a reindex is happening, all queries run against the parent
> table are planned as though the index isn't there (i.e. it's unusable).
Where in the world did you get that idea?
If we had a REINDEX CONCURRENTLY it might work that way. A normal
REINDEX cannot "mark" anything because it runs within a single
transaction; there is no way that it can emit any catalog changes
that will be visible before it's over.
"Steven Flatt" <steven.flatt@gmail.com> writes: > So, can we simply trust what's in pg_class.relpages and ignore looking > directly at the index? No, we can't. In the light of morning I remember more about the reason for the aforesaid patch: it's actually unsafe to read the pg_class row at all if you have not got lock on the index. We are reading with SnapshotNow in order to be sure we see up-to-date info, and that means that a concurrent update of the row (eg, for REINDEX to report the new relfilenode) can have the following behavior: 1. REINDEX inserts the new modified version of the index's pg_class row. 2. Would-be reader process visits the new version of the pg_class row. It's not committed yet, so we ignore it and continue scanning. 3. REINDEX commits. 4. Reader process visits the old version of the pg_class row. It's now committed dead, so we ignore it and continue scanning. 5. Reader process bombs out with a complaint about no pg_class row for the index. So we really have to have the lock. > This is a fairly serious concern for us, that > reindex is blocking all readers of the parent table. I'm afraid you're kinda stuck: I don't see any fix that would be practical to put into 8.2, or even 8.3 considering that it's way too late to be thinking of implementing REINDEX CONCURRENTLY for 8.3. You might be able to work around it for now by faking such a reindex "by hand"; that is, create a duplicate new index under a different name using CREATE INDEX CONCURRENTLY, then exclusive-lock the table for just long enough to drop the old index and rename the new one to match. It's probably worth asking also how badly you really need routine reindexing. Are you certain your app still needs that with 8.2, or is it a hangover from a few releases back? Could more aggressive (auto)vacuuming provide a better solution? regards, tom lane
You might be able to work around it for now by faking such a reindex
"by hand"; that is, create a duplicate new index under a different
name using CREATE INDEX CONCURRENTLY, then exclusive-lock the table
for just long enough to drop the old index and rename the new one
to match.
It's probably worth asking also how badly you really need routine
reindexing. Are you certain your app still needs that with 8.2,
or is it a hangover from a few releases back? Could more aggressive
(auto)vacuuming provide a better solution?
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > "Steven Flatt" <steven.flatt@gmail.com> writes: >> So, can we simply trust what's in pg_class.relpages and ignore looking >> directly at the index? > > No, we can't. In the light of morning I remember more about the reason > for the aforesaid patch: it's actually unsafe to read the pg_class row > at all if you have not got lock on the index. We are reading with > SnapshotNow in order to be sure we see up-to-date info, and that means > that a concurrent update of the row (eg, for REINDEX to report the new > relfilenode) can have the following behavior: Should reindex be doing an in-place update? Don't we have to do in-place updates for other system catalogs which are read in snapshotnow for precisely the same reasons? Alternatively, why does the planner need access to the pg_class entry and not just the pg_index record? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes: > Should reindex be doing an in-place update? Not if you'd like it to be crash-safe. > Alternatively, why does the planner need access to the pg_class entry and not > just the pg_index record? For one thing, to find out how big the index is ... though if we could get around that problem, it might indeed be possible to treat the pg_index records as property of the parent table not the index itself, which would give us license to read them without locking the index. regards, tom lane
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --------------------------------------------------------------------------- Tom Lane wrote: > "Steven Flatt" <steven.flatt@gmail.com> writes: > > So, can we simply trust what's in pg_class.relpages and ignore looking > > directly at the index? > > No, we can't. In the light of morning I remember more about the reason > for the aforesaid patch: it's actually unsafe to read the pg_class row > at all if you have not got lock on the index. We are reading with > SnapshotNow in order to be sure we see up-to-date info, and that means > that a concurrent update of the row (eg, for REINDEX to report the new > relfilenode) can have the following behavior: > > 1. REINDEX inserts the new modified version of the index's pg_class row. > > 2. Would-be reader process visits the new version of the pg_class row. > It's not committed yet, so we ignore it and continue scanning. > > 3. REINDEX commits. > > 4. Reader process visits the old version of the pg_class row. It's > now committed dead, so we ignore it and continue scanning. > > 5. Reader process bombs out with a complaint about no pg_class row for > the index. > > So we really have to have the lock. > > > This is a fairly serious concern for us, that > > reindex is blocking all readers of the parent table. > > I'm afraid you're kinda stuck: I don't see any fix that would be > practical to put into 8.2, or even 8.3 considering that it's way too > late to be thinking of implementing REINDEX CONCURRENTLY for 8.3. > > You might be able to work around it for now by faking such a reindex > "by hand"; that is, create a duplicate new index under a different > name using CREATE INDEX CONCURRENTLY, then exclusive-lock the table > for just long enough to drop the old index and rename the new one > to match. > > It's probably worth asking also how badly you really need routine > reindexing. Are you certain your app still needs that with 8.2, > or is it a hangover from a few releases back? Could more aggressive > (auto)vacuuming provide a better solution? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +