Обсуждение: slow update on 1M rows (worse with indexes)

Поиск
Список
Период
Сортировка

slow update on 1M rows (worse with indexes)

От
Gabriel Biberian
Дата:
Hello,

I experience significant performance issues with postgresql and updates.
I have a table which contains ~1M rows.
Layout:
TOTO=# \d versions_9d;
             Table «public.versions_9d»
  Colonne   |          Type          | Modificateurs
------------+------------------------+---------------
 hash       | character(32)          |
 date       | integer                | default 0
 diff       | integer                | default 0
 flag       | integer                | default 0
 size       | bigint                 | default 0
 zip_size   | bigint                 | default 0
 jds        | integer                | default 0
 scanned    | integer                | default 0
 dead       | integer                | default 0

Test case:
Create a new DB and load a dump of the above database with 976009 rows,
then i perform updates on the whole table.  I recorded the time taken
for each full update and the amount of extra disk space used.  Each
consecutive update of the table is slower than the previous
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=2"
UPDATE 976009
real    0m41.542s
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=3"
UPDATE 976009
real    0m45.140s (+480M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=4"
UPDATE 976009
real    1m10.554s (+240M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=5"
UPDATE 976009
real    1m24.065s (+127M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=6"
UPDATE 976009
real    1m17.758s (+288M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=7"
UPDATE 976009
real    1m26.777s (+288M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=8"
UPDATE 976009
real    1m39.151s (+289M)

Then i tried adding an index to the table on the column date (int) that
stores unix timestamps.
TOTO=# CREATE INDEX versions_index ON versions_9d (date);
(-60M) disk space goes down on index creation
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=9"
UPDATE 976009
real    3m8.219s (+328M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=8"
UPDATE 976009
real    6m24.716s (+326M)
beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=10"
UPDATE 976009
real    8m25.274s (+321M)

As a sanity check, i loaded mysql5 and tried the same database and
updates.  With mysql, the update always lasts ~8s.
The conclusions I have come to is that update==insert+delete which seems
very heavy when index are present (and heavy disk wise on big tables).
Is there a switch i can flip to optimise this?

Thanks in advance,
Gabriel Biberian

Re: slow update on 1M rows (worse with indexes)

От
"Steinar H. Gunderson"
Дата:
On Thu, Feb 22, 2007 at 07:11:42PM +0100, Gabriel Biberian wrote:
> Create a new DB and load a dump of the above database with 976009 rows,
> then i perform updates on the whole table.  I recorded the time taken
> for each full update and the amount of extra disk space used.  Each
> consecutive update of the table is slower than the previous
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=2"
> UPDATE 976009
> real    0m41.542s

You're creating a huge amount of dead rows by this kind of procedure. Try a
VACUUM in-between, or enable autovacuum. (Adjusting your WAL and
checkpointing settings might help too.)

Apart from that, do you really have a scenario that requires updating _all_
rows in your table regularly?

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: slow update on 1M rows (worse with indexes)

От
"Joshua D. Drake"
Дата:
n i tried adding an index to the table on the column date (int) that
> stores unix timestamps.
> TOTO=# CREATE INDEX versions_index ON versions_9d (date);
> (-60M) disk space goes down on index creation
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=9"
> UPDATE 976009
> real    3m8.219s (+328M)
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=8"
> UPDATE 976009
> real    6m24.716s (+326M)
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=10"
> UPDATE 976009
> real    8m25.274s (+321M)
>
> As a sanity check, i loaded mysql5 and tried the same database and
> updates.  With mysql, the update always lasts ~8s.

Yes but with mysql did you use myisam or innodb?


> The conclusions I have come to is that update==insert+delete which seems
> very heavy when index are present (and heavy disk wise on big tables).
> Is there a switch i can flip to optimise this?
>
> Thanks in advance,
> Gabriel Biberian
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
>                http://www.postgresql.org/about/donate
>


--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: slow update on 1M rows (worse with indexes)

От
ismo.tuononen@solenovo.fi
Дата:
how about saying:

lock table versions_9d in EXCLUSIVE mode;
UPDATE versions_9d SET flag=2;
commit;

Ismo

On Thu, 22 Feb 2007, Gabriel Biberian wrote:

> Hello,
>
> I experience significant performance issues with postgresql and updates.
> I have a table which contains ~1M rows.
> Layout:
> TOTO=# \d versions_9d;
>             Table «public.versions_9d»
>  Colonne   |          Type          | Modificateurs
> ------------+------------------------+---------------
> hash       | character(32)          |
> date       | integer                | default 0
> diff       | integer                | default 0
> flag       | integer                | default 0
> size       | bigint                 | default 0
> zip_size   | bigint                 | default 0
> jds        | integer                | default 0
> scanned    | integer                | default 0
> dead       | integer                | default 0
>
> Test case:
> Create a new DB and load a dump of the above database with 976009 rows, then i
> perform updates on the whole table.  I recorded the time taken for each full
> update and the amount of extra disk space used.  Each consecutive update of
> the table is slower than the previous
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=2"
> UPDATE 976009
> real    0m41.542s
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=3"
> UPDATE 976009
> real    0m45.140s (+480M)
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=4"
> UPDATE 976009
> real    1m10.554s (+240M)
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=5"
> UPDATE 976009
> real    1m24.065s (+127M)
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=6"
> UPDATE 976009
> real    1m17.758s (+288M)
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=7"
> UPDATE 976009
> real    1m26.777s (+288M)
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=8"
> UPDATE 976009
> real    1m39.151s (+289M)
>
> Then i tried adding an index to the table on the column date (int) that stores
> unix timestamps.
> TOTO=# CREATE INDEX versions_index ON versions_9d (date);
> (-60M) disk space goes down on index creation
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=9"
> UPDATE 976009
> real    3m8.219s (+328M)
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=8"
> UPDATE 976009
> real    6m24.716s (+326M)
> beebox@evobrik01:~$ time psql TOTO -c "UPDATE versions_9d SET flag=10"
> UPDATE 976009
> real    8m25.274s (+321M)
>
> As a sanity check, i loaded mysql5 and tried the same database and updates.
> With mysql, the update always lasts ~8s.
> The conclusions I have come to is that update==insert+delete which seems very
> heavy when index are present (and heavy disk wise on big tables). Is there a
> switch i can flip to optimise this?
>
> Thanks in advance,
> Gabriel Biberian
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
>                http://www.postgresql.org/about/donate
>