slow DELETE on 12 M row table

From: Janet Jacobsen
Subject: slow DELETE on 12 M row table
Date:
Msg-id: 4A443367.306@lbl.gov
Replies: Re: slow DELETE on 12 M row table  (Greg Stark)
List: pgsql-performance

Discussion tree:

slow DELETE on 12 M row table  (Janet Jacobsen, )
 Re: slow DELETE on 12 M row table  (Greg Stark, )
  Re: slow DELETE on 12 M row table  (Richard Huxton, )
  Re: slow DELETE on 12 M row table  (Janet Jacobsen, )
   Re: slow DELETE on 12 M row table  (Marcin Stępnicki, )
    Re: slow DELETE on 12 M row table  (Janet Jacobsen, )
     Re: slow DELETE on 12 M row table  (Robert Haas, )
      Re: slow DELETE on 12 M row table  (Scott Carey, )
       Re: slow DELETE on 12 M row table  (Robert Haas, )
      Re: slow DELETE on 12 M row table  (Janet Jacobsen, )
   Re: slow DELETE on 12 M row table  (Scott Marlowe, )

Hi.  We are running Postgres 8.3.7 on an eight-processor Linux system.
Because the machine has very little local disk, the database files are on
a file system running GPFS.

The machine is mostly dedicated to processing images.  After the images
are processed, the image attributes and processing parameters are
written to the database.  This is repeated nightly.

Generally, unless image processing is taking place, queries are pretty
fast - milliseconds to seconds.  But deletes from the largest tables,
which currently have 12 M rows, are very slow: on the order of four
minutes for 60 rows.  We don't have to do a lot of deletions, but do need
to be able to do some from time to time, generally several thousand at a time.

We also don't have many users - generally no more than one to five
connections at a time.

While waiting for some deletions, I went to search.postgresql.org and
typed "slow delete" in the search field.

Based on what I read, I tried "explain analyze delete ...":
> subtest=> explain analyze delete from table1 where id > 11592550;
>                                 QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------
>  Index Scan using table1_pkey on table1  (cost=0.00..2565.25 rows=1136 width=6) (actual time=77.819..107.476 rows=50 loops=1)
>    Index Cond: (id > 11592550)
>  Trigger for constraint table2_table1_id_fkey: time=198484.158 calls=50
>  Total runtime: 198591.894 ms
> (4 rows)
The trigger line immediately showed me that I had forgotten about the
foreign key in another table that references the primary key of the table
I am trying to delete from: table2.table1_id -> table1.id.

The posts I read and the timing above suggest that I should create an
index on the foreign key column in table2, so I am waiting for that
index to be created.
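For readers following along, a minimal sketch of such an index (index name is illustrative; the table and column names are taken from the trigger line in the plan above):

```sql
-- Index the referencing column so the FK trigger's per-row lookup in
-- table2 can use an index scan instead of a sequential scan.
-- CONCURRENTLY (available in 8.3) builds the index without blocking
-- writes to the table, at the cost of a slower build.
CREATE INDEX CONCURRENTLY table2_table1_id_idx
    ON table2 (table1_id);
```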

My questions are:
(1) Is my interpretation of the posts correct, i.e., if I am deleting
rows from table1, where the pkey of table1 is a fkey in table2, do I
need to create an index on the fkey field in table2?
(2) Do you have any suggestions on how I can determine why it is taking
several hours to create an index on a field in a table with 12 M rows?
Does that seem like a reasonable amount of time?  I have
maintenance_work_mem set to 512MB - is that too low, or is that the
wrong config parameter to change?
[ps aux shows "CREATE INDEX waiting"; there is nothing (no image processing)
running on the machine at this time]
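On question (2): a "waiting" status on CREATE INDEX usually means the statement is blocked on a table lock, e.g. held by an idle or long-running transaction. A hedged sketch of how one might look for the blocker on 8.3 (pg_stat_activity.procpid is the 8.3-era column name; it became pid in 9.2):

```sql
-- List lock requests on table2 alongside what each session is running.
-- Rows with granted = false are waiting; other rows on the same relation
-- show which sessions currently hold locks on it.
SELECT l.pid, l.mode, l.granted, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE l.relation = 'table2'::regclass;
```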
(3) Would I be better off dropping the foreign keys?  In general, is it
workable to have foreign keys on tables with > 100 M rows (assuming I
create all of the 'right' indexes)?

Thank you,
Janet
