Обсуждение: slow query : very simple delete, 100% cpu, nearly no disk activity

Поиск
Список
Период
Сортировка

slow query : very simple delete, 100% cpu, nearly no disk activity

От
Vincent de Phily
Дата:
Hi list,

I've been running this simple delete since yesterday afternoon :
> db=# explain delete from message where datetime < '2009-03-03';
>  Seq Scan on message  (cost=0.00..34131.95 rows=133158 width=6)
>    Filter: (datetime < '2009-03-03 00:00:00'::timestamp without time zone)

There is no index on that column, so a seqscan is fine. But it really
shouldn't take > 15 hours to delete :

> db=# select count(*) from message where datetime < '2009-03-03';
> 184368
> Time: 751.721 ms
>
> db=# select count(*) from message;
> 1079463
> Time: 593.899 ms
>
> db=# select pg_size_pretty(pg_relation_size('message'));
> 161 MB
> Time: 96.062 ms
>
> db=# \o /dev/null
> db=# select * from message where datetime < '2009-03-03';
> Time: 4975.123 ms


Most of the time, there is no other connection to that database. This is on an
oldish laptop. atop reports 100% cpu and about 24KB/s of writes for postgres.
Machine is mostly idle (although I did run a multi-hours compile during the
night). Nothing looks wrong in postgres logs.

PostgreSQL 8.3.7 on i686-pc-linux-gnu, compiled by GCC i686-pc-linux-gnu-gcc
(Gentoo 4.3.2-r3 p1.6, pie-10.1.5) 4.3.2

postgresql.conf :
> max_connections = 100
> shared_buffers = 24MB
> max_fsm_pages = 153600
> log_destination = 'stderr'
> logging_collector = on
> log_directory = '/var/log/postgres/'
> log_filename = '%Y-%m-%d_%H%M%S.log'
> log_rotation_size = 100MB
> log_min_duration_statement = 30000
> log_line_prefix = '%t %d %p '
> datestyle = 'iso, mdy'
> lc_messages = 'C'
> lc_monetary = 'C'
> lc_numeric = 'C'
> lc_time = 'C'
> default_text_search_config = 'pg_catalog.english'


Not sure what to look at to debug this further (I could work around the
problem with pg_dump + grep, but that's beside the point). Any idea ?


Thanks.

--
Vincent de Phily
Mobile Devices
+33 (0) 666 301 306
+33 (0) 142 119 325

Warning
This message (and any associated files) is intended only for the use of its
intended recipient and may contain information that is confidential, subject
to copyright or constitutes a trade secret. If you are not the intended
recipient you are hereby notified that any dissemination, copying or
distribution of this message, or files associated with this message, is
strictly prohibited. If you have received this message in error, please
notify us immediately by replying to the message and deleting it from your
computer. Any views or opinions presented are solely those of the author
vincent.dephily@mobile-devices.fr and do not necessarily represent those of
the
company. Although the company has taken reasonable precautions to ensure no
viruses are present in this email, the company cannot accept responsibility
for any loss or damage arising from the use of this email or attachments.

Re: slow query : very simple delete, 100% cpu, nearly no disk activity

От
Tom Lane
Дата:
Vincent de Phily <vincent.dephily@mobile-devices.fr> writes:
> I've been running this simple delete since yesterday afternoon :
>> db=# explain delete from message where datetime < '2009-03-03';
>> Seq Scan on message  (cost=0.00..34131.95 rows=133158 width=6)
>> Filter: (datetime < '2009-03-03 00:00:00'::timestamp without time zone)

> There is no index on that column, so a seqscan is fine. But it really
> shouldn't take > 15 hours to delete :

99% of the time, the reason a delete takes way longer than it seems like
it should is trigger firing time.  In particular, foreign key triggers
where you don't have an index on the referencing column.  Are there
any foreign keys linking to this table?

            regards, tom lane

Re: slow query : very simple delete, 100% cpu, nearly no disk activity

От
Vincent de Phily
Дата:
On Monday 07 September 2009 03:25:23 Tom Lane wrote:
> Vincent de Phily <vincent.dephily@mobile-devices.fr> writes:
> > I've been running this simple delete since yesterday afternoon :
> >> db=# explain delete from message where datetime < '2009-03-03';
> >> Seq Scan on message  (cost=0.00..34131.95 rows=133158 width=6)
> >> Filter: (datetime < '2009-03-03 00:00:00'::timestamp without time zone)
> >
> > There is no index on that column, so a seqscan is fine. But it really
> > shouldn't take > 15 hours to delete :
>
> 99% of the time, the reason a delete takes way longer than it seems like
> it should is trigger firing time.  In particular, foreign key triggers
> where you don't have an index on the referencing column.  Are there
> any foreign keys linking to this table?

Yes, but they look fine to me (?). Only one FK references the table; it's an
internal reference :

                                     Table "public.message"
  Column   |            Type             |                      Modifiers
-----------+-----------------------------+------------------------------------------------------
 id        | integer                     | not null default
nextval('message_id_seq'::regclass)
 unitid    | integer                     | not null
 userid    | integer                     |
 refid     | integer                     |
(...)
Indexes:
    "message_pkey" PRIMARY KEY, btree (id)
    "message_unitid_fromto_status_idx" btree (unitid, fromto, status)
    "message_userid_idx" btree (userid)
Foreign-key constraints:
    "message_refid_fkey" FOREIGN KEY (refid) REFERENCES message(id) ON UPDATE
CASCADE ON DELETE CASCADE
    "message_unitid_fkey" FOREIGN KEY (unitid) REFERENCES units(id) ON UPDATE
CASCADE ON DELETE CASCADE
    "message_userid_fkey" FOREIGN KEY (userid) REFERENCES users(id) ON UPDATE
CASCADE ON DELETE CASCADE

                                      Table "public.units"
   Column    |            Type             |                     Modifiers
-------------+-----------------------------+----------------------------------------------------
 id          | integer                     | not null default
nextval('units_id_seq'::regclass)
(...)
Indexes:
    "units_pkey" PRIMARY KEY, btree (id)
    "units_modid_ukey" UNIQUE, btree (modid)
    "units_profileid_idx" btree (profileid)
Foreign-key constraints:
    "units_profileid_fkey" FOREIGN KEY (profileid) REFERENCES profiles(id) ON
UPDATE CASCADE ON DELETE RESTRICT

                                 Table "public.users"
  Column  |         Type          |                     Modifiers
----------+-----------------------+----------------------------------------------------
 id       | integer               | not null default
nextval('users_id_seq'::regclass)
(...)
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
    "users_login_ukey" UNIQUE, btree (login)


Table users has a handdull of rows, table units has around 40000. 43% of
message.refid is NULL.

The delete finished during the weekend (DELETE 184368). Nothing in the logs
except the duration time (103113291.307 ms). I took a db dump before the
delete finished, in order to be able to reproduce the issue (a 30min test
shows me it is still slow).

--
Vincent de Phily
Mobile Devices
+33 (0) 666 301 306
+33 (0) 142 119 325

Warning
This message (and any associated files) is intended only for the use of its
intended recipient and may contain information that is confidential, subject
to copyright or constitutes a trade secret. If you are not the intended
recipient you are hereby notified that any dissemination, copying or
distribution of this message, or files associated with this message, is
strictly prohibited. If you have received this message in error, please
notify us immediately by replying to the message and deleting it from your
computer. Any views or opinions presented are solely those of the author
vincent.dephily@mobile-devices.fr and do not necessarily represent those of
the
company. Although the company has taken reasonable precautions to ensure no
viruses are present in this email, the company cannot accept responsibility
for any loss or damage arising from the use of this email or attachments.

Re: slow query : very simple delete, 100% cpu, nearly no disk activity

От
Robert Haas
Дата:
On Mon, Sep 7, 2009 at 5:05 AM, Vincent de Phily
<vincent.dephily@mobile-devices.fr> wrote:
> On Monday 07 September 2009 03:25:23 Tom Lane wrote:
>> Vincent de Phily <vincent.dephily@mobile-devices.fr> writes:
>> > I've been running this simple delete since yesterday afternoon :
>> >> db=# explain delete from message where datetime < '2009-03-03';
>> >> Seq Scan on message  (cost=0.00..34131.95 rows=133158 width=6)
>> >> Filter: (datetime < '2009-03-03 00:00:00'::timestamp without time zone)
>> >
>> > There is no index on that column, so a seqscan is fine. But it really
>> > shouldn't take > 15 hours to delete :
>>
>> 99% of the time, the reason a delete takes way longer than it seems like
>> it should is trigger firing time.  In particular, foreign key triggers
>> where you don't have an index on the referencing column.  Are there
>> any foreign keys linking to this table?
>
> Yes, but they look fine to me (?). Only one FK references the table; it's an
> internal reference :
>
>                                     Table "public.message"
>  Column   |            Type             |                      Modifiers
> -----------+-----------------------------+------------------------------------------------------
>  id        | integer                     | not null default
> nextval('message_id_seq'::regclass)
>  unitid    | integer                     | not null
>  userid    | integer                     |
>  refid     | integer                     |
> (...)
> Indexes:
>    "message_pkey" PRIMARY KEY, btree (id)
>    "message_unitid_fromto_status_idx" btree (unitid, fromto, status)
>    "message_userid_idx" btree (userid)
> Foreign-key constraints:
>    "message_refid_fkey" FOREIGN KEY (refid) REFERENCES message(id) ON UPDATE
> CASCADE ON DELETE CASCADE
>    "message_unitid_fkey" FOREIGN KEY (unitid) REFERENCES units(id) ON UPDATE
> CASCADE ON DELETE CASCADE
>    "message_userid_fkey" FOREIGN KEY (userid) REFERENCES users(id) ON UPDATE
> CASCADE ON DELETE CASCADE
>
>                                      Table "public.units"
>   Column    |            Type             |                     Modifiers
> -------------+-----------------------------+----------------------------------------------------
>  id          | integer                     | not null default
> nextval('units_id_seq'::regclass)
> (...)
> Indexes:
>    "units_pkey" PRIMARY KEY, btree (id)
>    "units_modid_ukey" UNIQUE, btree (modid)
>    "units_profileid_idx" btree (profileid)
> Foreign-key constraints:
>    "units_profileid_fkey" FOREIGN KEY (profileid) REFERENCES profiles(id) ON
> UPDATE CASCADE ON DELETE RESTRICT
>
>                                 Table "public.users"
>  Column  |         Type          |                     Modifiers
> ----------+-----------------------+----------------------------------------------------
>  id       | integer               | not null default
> nextval('users_id_seq'::regclass)
> (...)
> Indexes:
>    "users_pkey" PRIMARY KEY, btree (id)
>    "users_login_ukey" UNIQUE, btree (login)
>
>
> Table users has a handdull of rows, table units has around 40000. 43% of
> message.refid is NULL.
>
> The delete finished during the weekend (DELETE 184368). Nothing in the logs
> except the duration time (103113291.307 ms). I took a db dump before the
> delete finished, in order to be able to reproduce the issue (a 30min test
> shows me it is still slow).

I would try EXPLAIN ANALYZE DELETE ... with a query that is modified
so as to delete only a handful of rows.  That will report the amount
of time spent in triggers vs. the main query, which will help you
assess whether your conclusion that the foreign keys are OK is
correct.

...Robert

Re: slow query : very simple delete, 100% cpu, nearly no disk activity

От
Merlin Moncure
Дата:
On Mon, Sep 7, 2009 at 5:05 AM, Vincent de Phily
<vincent.dephily@mobile-devices.fr> wrote:
> On Monday 07 September 2009 03:25:23 Tom Lane wrote:
>> Vincent de Phily <vincent.dephily@mobile-devices.fr> writes:
>> > I've been running this simple delete since yesterday afternoon :
>> >> db=# explain delete from message where datetime < '2009-03-03';
>> >> Seq Scan on message  (cost=0.00..34131.95 rows=133158 width=6)
>> >> Filter: (datetime < '2009-03-03 00:00:00'::timestamp without time zone)
>> >
>> > There is no index on that column, so a seqscan is fine. But it really
>> > shouldn't take > 15 hours to delete :
>>
>> 99% of the time, the reason a delete takes way longer than it seems like
>> it should is trigger firing time.  In particular, foreign key triggers
>> where you don't have an index on the referencing column.  Are there
>> any foreign keys linking to this table?
>
> Yes, but they look fine to me (?). Only one FK references the table; it's an
> internal reference :
>
>                                     Table "public.message"
>  Column   |            Type             |                      Modifiers
> -----------+-----------------------------+------------------------------------------------------
>  id        | integer                     | not null default
> nextval('message_id_seq'::regclass)
>  unitid    | integer                     | not null
>  userid    | integer                     |
>  refid     | integer                     |

> Indexes:
>    "message_pkey" PRIMARY KEY, btree (id)
>    "message_unitid_fromto_status_idx" btree (unitid, fromto, status)
>    "message_userid_idx" btree (userid)
> Foreign-key constraints:
>    "message_refid_fkey" FOREIGN KEY (refid) REFERENCES message(id) ON UPDATE
> CASCADE ON DELETE CASCADE
>    "message_unitid_fkey" FOREIGN KEY (unitid) REFERENCES units(id) ON UPDATE
> CASCADE ON DELETE CASCADE
>    "message_userid_fkey" FOREIGN KEY (userid) REFERENCES users(id) ON UPDATE
> CASCADE ON DELETE CASCADE

where is the index on refid?

merlin

Re: slow query : very simple delete, 100% cpu, nearly no disk activity

От
Merlin Moncure
Дата:
On Mon, Sep 21, 2009 at 10:50 AM, Vincent de Phily
<vincent.dephily@mobile-devices.fr> wrote:
> On Friday 11 September 2009 23:55:09 Merlin Moncure wrote:
>> On Mon, Sep 7, 2009 at 5:05 AM, Vincent de Phily
>> <vincent.dephily@mobile-devices.fr> wrote:
>> >                                     Table "public.message"
>> >  Column   |            Type             |                      Modifiers
>> > -----------+-----------------------------+-------------------------------
>> >----------------------- id        | integer                     | not null
>> > default
>> > nextval('message_id_seq'::regclass)
>> >  unitid    | integer                     | not null
>> >  userid    | integer                     |
>> >  refid     | integer                     |
>> >
>> > Indexes:
>> >    "message_pkey" PRIMARY KEY, btree (id)
>> >    "message_unitid_fromto_status_idx" btree (unitid, fromto, status)
>> >    "message_userid_idx" btree (userid)
>> > Foreign-key constraints:
>> >    "message_refid_fkey" FOREIGN KEY (refid) REFERENCES message(id) ON
>> > UPDATE CASCADE ON DELETE CASCADE
>> >    "message_unitid_fkey" FOREIGN KEY (unitid) REFERENCES units(id) ON
>> > UPDATE CASCADE ON DELETE CASCADE
>> >    "message_userid_fkey" FOREIGN KEY (userid) REFERENCES users(id) ON
>> > UPDATE CASCADE ON DELETE CASCADE
>>
>> where is the index on refid?
>
> It's
> "message_pkey" PRIMARY KEY, btree (id)
> because
> (refid) REFERENCES message(id)

You are thinking about this backwards.  Every time you delete a
message, the table has to be scanned for any messages that reference
the message being deleted because of the refid constraint (in order to
see if any deletions must be cascaded).   PostgreSQL creates a backing
index for primary keys automatically but not foreign keys...so you
likely need to create an index on refid.

merlin

Re: slow query : very simple delete, 100% cpu, nearly no disk activity

От
Vincent de Phily
Дата:
On Friday 11 September 2009 23:30:37 Robert Haas wrote:
> On Mon, Sep 7, 2009 at 5:05 AM, Vincent de Phily
> <vincent.dephily@mobile-devices.fr> wrote:
> > On Monday 07 September 2009 03:25:23 Tom Lane wrote:
> >>
> >> 99% of the time, the reason a delete takes way longer than it seems like
> >> it should is trigger firing time.  In particular, foreign key triggers
> >> where you don't have an index on the referencing column.  Are there
> >> any foreign keys linking to this table?
> >
> > Yes, but they look fine to me (?). Only one FK references the table; it's
> > an internal reference :
> >
(...)
> I would try EXPLAIN ANALYZE DELETE ... with a query that is modified
> so as to delete only a handful of rows.  That will report the amount
> of time spent in triggers vs. the main query, which will help you
> assess whether your conclusion that the foreign keys are OK is
> correct.

Good idea. I'll try that in a little while and report the result.

--
Vincent de Phily
Mobile Devices
+33 (0) 666 301 306
+33 (0) 142 119 325

Warning
This message (and any associated files) is intended only for the use of its
intended recipient and may contain information that is confidential, subject
to copyright or constitutes a trade secret. If you are not the intended
recipient you are hereby notified that any dissemination, copying or
distribution of this message, or files associated with this message, is
strictly prohibited. If you have received this message in error, please
notify us immediately by replying to the message and deleting it from your
computer. Any views or opinions presented are solely those of the author
vincent.dephily@mobile-devices.fr and do not necessarily represent those of
the
company. Although the company has taken reasonable precautions to ensure no
viruses are present in this email, the company cannot accept responsibility
for any loss or damage arising from the use of this email or attachments.

Re: slow query : very simple delete, 100% cpu, nearly no disk activity

От
Vincent de Phily
Дата:
On Friday 11 September 2009 23:55:09 Merlin Moncure wrote:
> On Mon, Sep 7, 2009 at 5:05 AM, Vincent de Phily
> <vincent.dephily@mobile-devices.fr> wrote:
> >                                     Table "public.message"
> >  Column   |            Type             |                      Modifiers
> > -----------+-----------------------------+-------------------------------
> >----------------------- id        | integer                     | not null
> > default
> > nextval('message_id_seq'::regclass)
> >  unitid    | integer                     | not null
> >  userid    | integer                     |
> >  refid     | integer                     |
> >
> > Indexes:
> >    "message_pkey" PRIMARY KEY, btree (id)
> >    "message_unitid_fromto_status_idx" btree (unitid, fromto, status)
> >    "message_userid_idx" btree (userid)
> > Foreign-key constraints:
> >    "message_refid_fkey" FOREIGN KEY (refid) REFERENCES message(id) ON
> > UPDATE CASCADE ON DELETE CASCADE
> >    "message_unitid_fkey" FOREIGN KEY (unitid) REFERENCES units(id) ON
> > UPDATE CASCADE ON DELETE CASCADE
> >    "message_userid_fkey" FOREIGN KEY (userid) REFERENCES users(id) ON
> > UPDATE CASCADE ON DELETE CASCADE
>
> where is the index on refid?

It's
"message_pkey" PRIMARY KEY, btree (id)
because
(refid) REFERENCES message(id)


--
Vincent de Phily
Mobile Devices
+33 (0) 666 301 306
+33 (0) 142 119 325

Warning
This message (and any associated files) is intended only for the use of its
intended recipient and may contain information that is confidential, subject
to copyright or constitutes a trade secret. If you are not the intended
recipient you are hereby notified that any dissemination, copying or
distribution of this message, or files associated with this message, is
strictly prohibited. If you have received this message in error, please
notify us immediately by replying to the message and deleting it from your
computer. Any views or opinions presented are solely those of the author
vincent.dephily@mobile-devices.fr and do not necessarily represent those of
the
company. Although the company has taken reasonable precautions to ensure no
viruses are present in this email, the company cannot accept responsibility
for any loss or damage arising from the use of this email or attachments.

Re: slow query : very simple delete, 100% cpu, nearly no disk activity

От
Vincent de Phily
Дата:
On Monday 21 September 2009 17:00:36 Merlin Moncure wrote:
> On Mon, Sep 21, 2009 at 10:50 AM, Vincent de Phily
>
> <vincent.dephily@mobile-devices.fr> wrote:
> > On Friday 11 September 2009 23:55:09 Merlin Moncure wrote:
> >> On Mon, Sep 7, 2009 at 5:05 AM, Vincent de Phily
> >>
> >> <vincent.dephily@mobile-devices.fr> wrote:
> >> >                                     Table "public.message"
> >> >  Column   |            Type             |                    
> >> >  Modifiers
> >> > -----------+-----------------------------+----------------------------
> >> >--- ----------------------- id        | integer                     |
> >> > not null default
> >> > nextval('message_id_seq'::regclass)
> >> >  unitid    | integer                     | not null
> >> >  userid    | integer                     |
> >> >  refid     | integer                     |
> >> >
> >> > Indexes:
> >> >    "message_pkey" PRIMARY KEY, btree (id)
> >> >    "message_unitid_fromto_status_idx" btree (unitid, fromto, status)
> >> >    "message_userid_idx" btree (userid)
> >> > Foreign-key constraints:
> >> >    "message_refid_fkey" FOREIGN KEY (refid) REFERENCES message(id) ON
> >> > UPDATE CASCADE ON DELETE CASCADE
> >> >    "message_unitid_fkey" FOREIGN KEY (unitid) REFERENCES units(id) ON
> >> > UPDATE CASCADE ON DELETE CASCADE
> >> >    "message_userid_fkey" FOREIGN KEY (userid) REFERENCES users(id) ON
> >> > UPDATE CASCADE ON DELETE CASCADE
> >>
> >> where is the index on refid?
> >
> > It's
> > "message_pkey" PRIMARY KEY, btree (id)
> > because
> > (refid) REFERENCES message(id)
>
> You are thinking about this backwards.  Every time you delete a
> message, the table has to be scanned for any messages that reference
> the message being deleted because of the refid constraint (in order to
> see if any deletions must be cascaded).   PostgreSQL creates a backing
> index for primary keys automatically but not foreign keys...so you
> likely need to create an index on refid.

D'Oh ! Sounds obvious now that you mention it, and it's a very good
explanation of the delete's slowness.

I'll test this tonight or tomorrow.


--
Vincent de Phily
Mobile Devices
+33 (0) 666 301 306
+33 (0) 142 119 325

Warning
This message (and any associated files) is intended only for the use of its
intended recipient and may contain information that is confidential, subject
to copyright or constitutes a trade secret. If you are not the intended
recipient you are hereby notified that any dissemination, copying or
distribution of this message, or files associated with this message, is
strictly prohibited. If you have received this message in error, please
notify us immediately by replying to the message and deleting it from your
computer. Any views or opinions presented are solely those of the author
vincent.dephily@mobile-devices.fr and do not necessarily represent those of
the
company. Although the company has taken reasonable precautions to ensure no
viruses are present in this email, the company cannot accept responsibility
for any loss or damage arising from the use of this email or attachments.