Обсуждение: Yet another Performance Question
Hi, I once read in Oracle Performance Tuning, that if one inserts or changes large amounts of data in a table, it might be better to drop indices before doing this and recreating them afterwards. Could someone give a hint on how this is in Postgres 7.1? Currently I am experiencing a massive slowdown in importing data. Konstantin -- Dipl-Inf. Konstantin Agouros aka Elwood Blues. Internet: elwood@agouros.de Otkerstr. 28, 81547 Muenchen, Germany. Tel +49 89 69370185 ---------------------------------------------------------------------------- "Captain, this ship will not sustain the forming of the cosmos." B'Elana Torres
>I once read in Oracle Performance Tuning, that if one inserts or changes large >amounts of data in a table, it might be better to drop indices before doing >this and recreating them afterwards. Could someone give a hint on how this >is in Postgres 7.1? Currently I am experiencing a massive slowdown in importing >data. Postgres "suffers" the same problem which is very logical if you think about it. Inserts must adjust the indexes for every record and don't really know that there are a lot of other rows comming. Also, due to the multiuser nature of Postgres, other users could be accessing data between your rows of inserts and that data must be valid at that time. A non-indexed version of a table is just as accessible as an indexed one (though not as fast) so you have to decide if it's better to slow down a query or two while you insert/index or spend much more time having a good index after each insert. What would be nice is a simple "disable indexes on this (these) tables" command. The enable indexes.... command would then do a vacuum analyze on the effected tables when you were done. This would make sure that ALL of the indexes got rebuilt (I occationally forget an index when doing it "by hand"). len morgan
On Wed, Apr 18, 2001 at 07:18:40AM -0500, Len Morgan wrote: > >I once read in Oracle Performance Tuning, that if one inserts or changes > large > >amounts of data in a table, it might be better to drop indices before doing > >this and recreating them afterwards. Could someone give a hint on how this > >is in Postgres 7.1? Currently I am experiencing a massive slowdown in > importing > >data. > > Postgres "suffers" the same problem which is very logical if you think about > it. Inserts must adjust the indexes for every record and don't really know > that there are a lot of other rows comming. Also, due to the multiuser > nature of Postgres, other users could be accessing data between your rows of > inserts and that data must be valid at that time. > > A non-indexed version of a table is just as accessible as an indexed one > (though not as fast) so you have to decide if it's better to slow down a > query or two while you insert/index or spend much more time having a good > index after each insert. Frankly what matters is the time it takes alltogether. I have a script that first does a few million inserts, and than queries on this data... these queries don't terminate in a reasonable (<10days) time, if I don't use indices. So I will add some drop/create index-commands to the script... Konstantin -- Dipl-Inf. Konstantin Agouros aka Elwood Blues. Internet: elwood@agouros.de Otkerstr. 28, 81547 Muenchen, Germany. Tel +49 89 69370185 ---------------------------------------------------------------------------- "Captain, this ship will not sustain the forming of the cosmos." B'Elana Torres
On Wed, Apr 18, 2001 at 12:55:07PM +0200, Konstantinos Agouros wrote: > I once read in Oracle Performance Tuning, that if one inserts or changes large > amounts of data in a table, it might be better to drop indices before doing > this and recreating them afterwards. Could someone give a hint on how this > is in Postgres 7.1? Currently I am experiencing a massive slowdown in importing > data. I use a little script that create two sql files for dropping and recreating indexes of a database. Sxript is for "es" shell, but it is easily adaptable to sh: ------------------------------------------------------------------------ #!/usr/bin/es # # David Espada 2000. # BD = multi # Name of database. QUERY = 'select indexdef from pg_indexes;' F_CREA = create_indices.sql F_BORRA = drop_indices.sql psql $BD -c $QUERY | grep -i 'create' | grep -v 'pg_.*_index' | awk '{print $0, ";"}' > $F_CREA awk '{ gsub(/CREATE.*INDEX/, "DROP INDEX"); print $1, $2, $3, ";" }' $F_CREA > $F_BORRA ------------------------------------------------------------------------- When executing script, you have two files that can use like: $ psql -f drop_indices.sql $ psql -f create_indices.sql I hope it helps you. Greets. David
Hi! I'm not 100% sure, but I think it would be much faster if you use COPY instead of INSERT when you read in a lot of data from a file. Regards, Patrik Kudo -- ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol Känns det oklart? Fråga på! On Wed, 18 Apr 2001, Konstantinos Agouros wrote: > Frankly what matters is the time it takes alltogether. I have a script that > first does a few million inserts, and than queries on this data... these queries > don't terminate in a reasonable (<10days) time, if I don't use indices. > > So I will add some drop/create index-commands to the script... > > Konstantin
On Wed, Apr 18, 2001 at 02:59:53PM +0200, Patrik Kudo wrote: > Hi! > > I'm not 100% sure, but I think it would be much faster if you use COPY > instead of INSERT when you read in a lot of data from a file. Well unless I am the database owner I only can do \copy and the script does some manipulation before it inserts... Konstantin > > Regards, > Patrik Kudo > > -- > ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol > Känns det oklart? Fråga på! > > On Wed, 18 Apr 2001, Konstantinos Agouros wrote: > > > Frankly what matters is the time it takes alltogether. I have a script that > > first does a few million inserts, and than queries on this data... these queries > > don't terminate in a reasonable (<10days) time, if I don't use indices. > > > > So I will add some drop/create index-commands to the script... > > > > Konstantin > -- Dipl-Inf. Konstantin Agouros aka Elwood Blues. Internet: elwood@agouros.de Otkerstr. 28, 81547 Muenchen, Germany. Tel +49 89 69370185 ---------------------------------------------------------------------------- "Captain, this ship will not sustain the forming of the cosmos." B'Elana Torres
Konstantinos Agouros <elwood@agouros.de> said: > On Wed, Apr 18, 2001 at 02:59:53PM +0200, Patrik Kudo wrote: > > Hi! > > > > I'm not 100% sure, but I think it would be much faster if you use COPY > > instead of INSERT when you read in a lot of data from a file. > Well unless I am the database owner I only can do copy and the script does > some manipulation before it inserts... Don't forget you can do \copy from stdin so you can pipe the output of your script to psql. Failing that, try batching inserts in transactions, a few hundred at a time. That can speed things up. You'll need to find what figure works best for you. - Richard Huxton
Have you tried running PostGres with the -F option so that disk flushes are not performed every time. Maybe this will make a difference to insert performance. NB In 7.1 this is replaced by WAL which I don't know much about MC -- NOTICE: The information contained in this electronic mail transmission is intended by Convergys Corporation for the use of the named individual or entity to which it is directed and may contain information that is privileged or otherwise confidential. If you have received this electronic mail transmission in error, please delete it from your system without copying or forwarding it, and notify the sender of the error by reply email or by telephone (collect), so that the sender's address records can be corrected.
In <OFFC4DA3D2.A3056D98-ON80256A32.00517540@cbis.com> martin.chantler@convergys.com writes: >Have you tried running PostGres with the -F option so that >disk flushes are not performed every time. Maybe this will make a >difference >to insert performance. In the DBI-connect I still have a -F in the options, but now I am running 7.1RC2 (didn't have the time to do the upgrade to final yet). If this is called differently now, please let me know >NB In 7.1 this is replaced by WAL which I don't know much about >MC >-- >NOTICE: The information contained in this electronic mail transmission is >intended by Convergys Corporation for the use of the named individual or >entity to which it is directed and may contain information that is >privileged or otherwise confidential. If you have received this electronic >mail transmission in error, please delete it from your system without >copying or forwarding it, and notify the sender of the error by reply email >or by telephone (collect), so that the sender's address records can be >corrected. >---------------------------(end of broadcast)--------------------------- >TIP 5: Have you checked our extensive FAQ? >http://www.postgresql.org/users-lounge/docs/faq.html -- Dipl-Inf. Konstantin Agouros aka Elwood Blues. Internet: elwood@agouros.de Otkerstr. 28, 81547 Muenchen, Germany. Tel +49 89 69370185 ---------------------------------------------------------------------------- "Captain, this ship will not sustain the forming of the cosmos." B'Elana Torres