Discussion: Concurrent Reindex on Primary Key for large table


Concurrent Reindex on Primary Key for large table

From:
rverghese
Date:
I have a large table with about 60 million rows; every day I add 3-4 million,
remove 3-4 million and update 1-2 million. I have a script that reindexes
concurrently a couple of times a week, since I see significant bloat. I have
autovacuum on and the settings are below. I can't concurrently reindex the
primary key, since there can be only one on a table, and I can't take the
table offline to do a drop/recreate. I assumed autovacuum would take care of
the bloat there, but when I checked the size of the primary key index and then
dropped and recreated it, the size went from 2.5 GB to 1.3 GB, and my daily
import into that table took almost half the time after recreating the primary
key.

I was wondering what the options are to reindex it. Should I not have a
primary key and just a unique index, so that I can reindex concurrently? Do I
lose anything by replacing the primary key with a unique index? I prefer to
have a primary key on the table for Slony, but I suppose I could make the
OID column the primary key just for Slony purposes.

Any thoughts about the best approach would be appreciated. Note that this
bloat on the primary key is from about 2-3 weeks of daily
deletes/updates/inserts. I have multiple tables with similar structure and
data size.
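
For reference, the index sizes can be checked from the statistics views
without dropping anything; a minimal sketch, with your_table standing in for
the actual table name:

-- list every index on the table with its on-disk size
select indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) as index_size
from pg_stat_user_indexes
where relname = 'your_table'
order by pg_relation_size(indexrelid) desc;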

Here are my autovac settings. I am on PostgreSQL 9.1.1

#autovacuum_max_workers = 3             # max number of autovacuum subprocesses
#autovacuum_naptime = 1min              # time between autovacuum runs
autovacuum_vacuum_threshold = 20000     # min number of row updates before vacuum (default 50)
autovacuum_analyze_threshold = 20000    # min number of row updates before analyze (default 50)
autovacuum_vacuum_scale_factor = 0.1    # fraction of table size before vacuum
autovacuum_analyze_scale_factor = 0.1   # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced vacuum
autovacuum_vacuum_cost_delay = 5ms
#autovacuum_vacuum_cost_limit = -1

Thanks
RV



Re: Concurrent Reindex on Primary Key for large table

From:
Thomas Kellerer
Date:
rverghese wrote on 08.02.2012 19:07:
> I have a large table with about 60 million rows, everyday I add 3-4 million,
> remove 3-4 million and update 1-2 million. I have a script that reindexes
> concurrently a couple of times a week, since I see significant bloat. I have
> autovac on and the settings are below. I can't concurrently reindex the
> primary key, since there can be only one on a table.

With 9.1 you can build a new unique index concurrently, then drop and re-create the primary key using that new index.

This still requires an exclusive lock on the table, but only for a very short moment:

The following example is more or less taken from the manual:
http://www.postgresql.org/docs/9.1/static/sql-altertable.html

create unique index concurrently new_index on your_table (your_pk);
alter table your_table drop constraint your_table_pkey;
alter table your_table add primary key using index new_index;

As the drop and re-add can be done in a single transaction (the concurrent index build itself cannot run inside a transaction block), it should be safe with regards to the primary key.
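
A minimal sketch of that transaction, reusing the placeholder names above
(your_table_pkey is assumed to be the default name of the existing primary key
constraint; the concurrent index build has to run outside any transaction
block):

begin;
alter table your_table drop constraint your_table_pkey;
-- attaching the index to a named constraint renames new_index to your_table_pkey
alter table your_table add constraint your_table_pkey primary key using index new_index;
commit;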

Thomas





Re: Concurrent Reindex on Primary Key for large table

From:
rverghese
Date:
Thanks! That worked. 

Any thoughts about containing index bloat? I thought autovacuum would clean
it up a bit more. Would any tweaks to my settings improve autovacuum
performance? I am still doing a couple of concurrent reindexes per week,
otherwise performance degrades over a couple of days.

Thanks
RV
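
One way to see whether autovacuum is keeping up on such a table is the
statistics collector; a minimal sketch, again with your_table as a
placeholder name:

-- dead-tuple count and the last (auto)vacuum / (auto)analyze times
select relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
from pg_stat_user_tables
where relname = 'your_table';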



Re: Concurrent Reindex on Primary Key for large table

From:
"Igor Neyman"
Date:
> -----Original Message-----
> From: rverghese [mailto:riyav@hotmail.com]
> Sent: Thursday, February 09, 2012 12:49 PM
> To: pgsql-sql@postgresql.org
> Subject: Re: Concurrent Reindex on Primary Key for large table
>
> Thanks! That worked.
>
> Any thoughts about containing index bloat? I thought autovacuum would
> clean it up a bit more. Would any tweaks to my settings improve autovacuum
> performance? I am still doing a couple of concurrent reindexes per week,
> otherwise performance degrades over a couple of days.
>
> Thanks
> RV
>

Did you try making autovacuum "more aggressive", for example by lowering
autovacuum_vacuum_threshold from 20000?

Regards,
Igor Neyman
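
Such settings can also be applied per table instead of cluster-wide, which
keeps the aggressive values confined to the few busy tables; a hedged sketch
with a hypothetical table name and purely illustrative numbers:

alter table your_table set (
    autovacuum_vacuum_threshold = 5000,
    autovacuum_vacuum_scale_factor = 0.02,
    autovacuum_analyze_threshold = 5000,
    autovacuum_analyze_scale_factor = 0.02
);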


Re: Concurrent Reindex on Primary Key for large table

From:
rverghese
Date:
Hi,
I had another question: what about when the primary key is referenced by a
foreign key in another table? Is the only option to drop the FK and re-create
it after the primary key has been re-created with the new index?

Thanks!
RV
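
If the FK does have to be dropped and re-created around the swap, a minimal
sketch of the whole sequence, assuming a hypothetical referencing table
child_table with column your_pk and default constraint names (NOT VALID and
VALIDATE CONSTRAINT are available as of 9.1):

-- new_index must already exist, built with create unique index concurrently
begin;
alter table child_table drop constraint child_table_your_pk_fkey;
alter table your_table drop constraint your_table_pkey;
alter table your_table add constraint your_table_pkey primary key using index new_index;
alter table child_table add constraint child_table_your_pk_fkey
    foreign key (your_pk) references your_table (your_pk) not valid;
commit;
-- the full-table check of existing rows is then done outside the swap transaction
alter table child_table validate constraint child_table_your_pk_fkey;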
