Обсуждение: Slow deletes

Поиск
Список
Период
Сортировка

Slow deletes

От
Edmund Dengler
Дата:
Can anyone explain why these deletes are extremely slow?

====================================

=> select version();
                               version
---------------------------------------------------------------------
 PostgreSQL 7.2.1 on i386-unknown-openbsd3.0, compiled by GCC 2.95.3
(1 row)

====================================

=> \d syslog_event
                                        Table "syslog_event"
     Column      |           Type           |                       Modifiers
-----------------+--------------------------+-------------------------------------------------------
 event_id        | bigint                   | not null default nextval('syslog_event_id_seq'::text)
 signature_name  | text                     |
 facility        | character(10)            |
 priority        | character(10)            |
 timestamp       | timestamp with time zone |
 host_id         | bigint                   | not null
 raw_message     | text                     |
 clean_message   | text                     |
 marked_message  | text                     |
 remote_event_id | bigint                   | not null
Indexes: syslog_event_event_id_idx,
         syslog_event_idx_tmp,
         syslog_event_timestamp_idx
Primary key: syslog_event_pkey
Unique keys: syslog_event_host_id_key
Triggers: RI_ConstraintTrigger_13220921,
          RI_ConstraintTrigger_13220965,
          RI_ConstraintTrigger_13220967,
          syslog_event_after_insert_trg

====================================

=> \d syslog_event_pkey
Index "syslog_event_pkey"
  Column  |  Type
----------+--------
 event_id | bigint
unique btree (primary key)

====================================

=> explain delete from syslog_event where event_id = 1001;
NOTICE:  QUERY PLAN:

Seq Scan on syslog_event  (cost=0.00..342277.67 rows=1 width=6)

EXPLAIN

====================================

There are over 5,000,000 rows in the table.  The triggers are only for
inserts or for existence constraints.  There are other tables, but none
have existence constraints to this table.

I don't quite understand why, when there exists an (unique) index,
this would use a sequential scan.

(Note: I am actually trying to delete many rows, but this cost is
for a single row).

Regards,
Ed



Re: Slow deletes

От
Tom Lane
Дата:
Edmund Dengler <edmundd@eSentire.com> writes:
> Can anyone explain why these deletes are extremely slow?

> => explain delete from syslog_event where event_id = 1001;
> NOTICE:  QUERY PLAN:

> Seq Scan on syslog_event  (cost=0.00..342277.67 rows=1 width=6)

> There are over 5,000,000 rows in the table.

Seqscan on a 5M-row table will take a little while...

Your problem is that it's not using an indexscan, and the reason
for that is that '1001' is taken as an integer not a bigint.  The
system is not smart about optimizing cross-datatype comparisons
into indexscans.  You could write

delete from syslog_event where event_id = 1001::int8;

(or use CAST if you want to be pedantically standards-compliant).
Alternatively, consider whether event_id really needs to be bigint.
There's a clear notational advantage in plain integer.

Yes, it'd be nice if "bigintcol = 1001" acted more reasonably,
and someday we'll make it happen ... but doing so without breaking
the system's type-extensibility features is not trivial.

            regards, tom lane

Re: Slow deletes

От
Edmund Dengler
Дата:
Thanks! That seems to have been the issue!

=> explain delete from syslog_event where event_id = 1000::int8;
NOTICE:  QUERY PLAN:

Index Scan using syslog_event_event_id_idx on syslog_event
(cost=0.00..3.02 rows=1 width=6)

EXPLAIN

Deleting a single row now takes only about 5 seconds.

Regards,
Ed

On Mon, 12 Aug 2002, Tom Lane wrote:

> Edmund Dengler <edmundd@eSentire.com> writes:
> > Can anyone explain why these deletes are extremely slow?
>
> > => explain delete from syslog_event where event_id = 1001;
> > NOTICE:  QUERY PLAN:
>
> > Seq Scan on syslog_event  (cost=0.00..342277.67 rows=1 width=6)
>
> > There are over 5,000,000 rows in the table.
>
> Seqscan on a 5M-row table will take a little while...
>
> Your problem is that it's not using an indexscan, and the reason
> for that is that '1001' is taken as an integer not a bigint.  The
> system is not smart about optimizing cross-datatype comparisons
> into indexscans.  You could write
>
> delete from syslog_event where event_id = 1001::int8;
>
> (or use CAST if you want to be pedantically standards-compliant).
> Alternatively, consider whether event_id really needs to be bigint.
> There's a clear notational advantage in plain integer.
>
> Yes, it'd be nice if "bigintcol = 1001" acted more reasonably,
> and someday we'll make it happen ... but doing so without breaking
> the system's type-extensibility features is not trivial.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>


Re: Slow deletes

От
Tom Lane
Дата:
Edmund Dengler <edmundd@eSentire.com> writes:
> Thanks! That seems to have been the issue!
> => explain delete from syslog_event where event_id = 1000::int8;
> NOTICE:  QUERY PLAN:

> Index Scan using syslog_event_event_id_idx on syslog_event
> (cost=0.00..3.02 rows=1 width=6)

That's better.

> Deleting a single row now takes only about 5 seconds.

Hm, still seems like a huge amount of time to delete a single row.
I suspect query-optimization problems inside your triggers.
Can't say much without more details though.

            regards, tom lane

Re: Slow deletes

От
Edmund Dengler
Дата:
Anyway to conveniently dump the triggers (3 are constraint check, my own
personal one is suppose to be on inserts only)?

Regards,
Ed

P.S. Trying to delete a couple of thousand rows, and while the explain
is indeed faster (and I believe it is getting to the actual delete part
faster), it is still taking quite a long time.

On Mon, 12 Aug 2002, Tom Lane wrote:

> Edmund Dengler <edmundd@eSentire.com> writes:
> > Thanks! That seems to have been the issue!
> > => explain delete from syslog_event where event_id = 1000::int8;
> > NOTICE:  QUERY PLAN:
>
> > Index Scan using syslog_event_event_id_idx on syslog_event
> > (cost=0.00..3.02 rows=1 width=6)
>
> That's better.
>
> > Deleting a single row now takes only about 5 seconds.
>
> Hm, still seems like a huge amount of time to delete a single row.
> I suspect query-optimization problems inside your triggers.
> Can't say much without more details though.
>
>             regards, tom lane
>


Re: Slow deletes

От
Tom Lane
Дата:
Edmund Dengler <edmundd@eSentire.com> writes:
> Anyway to conveniently dump the triggers (3 are constraint check, my own
> personal one is suppose to be on inserts only)?

The RI on-delete trigger must be the issue then.  Lack of indexes, or
linking to a foreign key of a different column type are the trouble
causes I've seen there ...

            regards, tom lane

Re: Slow deletes

От
Edmund Dengler
Дата:
Aha, I did have a dependency I missed. I did a pull from a dump:

CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "syslog_event"
FROM "syslog_event_message_event" NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE
"RI_FKey_cascade_del" ('<unnamed>', 'syslog_event_message_event',
'syslog_event', 'UNSPECIFIED', 'event_id', 'event_id');

Both columns are the same type (int8).

=> \d syslog_event_message_event
Table "syslog_event_message_event"
   Column   |  Type  | Modifiers
------------+--------+-----------
 message_id | bigint | not null
 event_id   | bigint | not null
Primary key: syslog_event_message_event_pkey
Triggers: RI_ConstraintTrigger_13220957

The table contains no rows (previously deleted, vacuumed, and analyzed).

By the by, it seems from the various timings of this (and knowing how long
it takes to dump/rebuild the database in general), I could have solved my
specific problem by dumping the database, deleting the required rows using
sed/grep/vi, and rebuilt the database.  While this may be a bit of
an exaggeration, it is not far off!  Somehow, this just seems wrong.  I
must be doing something strange to have such slow deletes, and I can't
seem to find it.

Regards,
Ed

On Mon, 12 Aug 2002, Tom Lane wrote:

> Edmund Dengler <edmundd@eSentire.com> writes:
> > Anyway to conveniently dump the triggers (3 are constraint check, my own
> > personal one is suppose to be on inserts only)?
>
> The RI on-delete trigger must be the issue then.  Lack of indexes, or
> linking to a foreign key of a different column type are the trouble
> causes I've seen there ...
>
>             regards, tom lane
>


Re: Slow deletes

От
Martijn van Oosterhout
Дата:
On Mon, Aug 12, 2002 at 10:34:23PM -0400, Tom Lane wrote:
> Edmund Dengler <edmundd@eSentire.com> writes:
> > Can anyone explain why these deletes are extremely slow?
>
> > => explain delete from syslog_event where event_id = 1001;
> > NOTICE:  QUERY PLAN:
>
> > Seq Scan on syslog_event  (cost=0.00..342277.67 rows=1 width=6)
>
> > There are over 5,000,000 rows in the table.
>
> Seqscan on a 5M-row table will take a little while...
>
> Your problem is that it's not using an indexscan, and the reason
> for that is that '1001' is taken as an integer not a bigint.  The
> system is not smart about optimizing cross-datatype comparisons
> into indexscans.  You could write
>
> delete from syslog_event where event_id = 1001::int8;
>
> (or use CAST if you want to be pedantically standards-compliant).
> Alternatively, consider whether event_id really needs to be bigint.
> There's a clear notational advantage in plain integer.
>
> Yes, it'd be nice if "bigintcol = 1001" acted more reasonably,
> and someday we'll make it happen ... but doing so without breaking
> the system's type-extensibility features is not trivial.

Actually, an easier way to do it is by saying:

delete from syslog_event where event_id = '1001'

which works fine, since the quoted value is listed as 'unknown' rather than
'int4'

I considered a patch to make the parser treat all numbers as unknown, but
that would break any code that relies on numbers-are-int4 (think function
and operator type resolution).

Sometimes you can't win.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Re: Slow deletes

От
Tom Lane
Дата:
Edmund Dengler <edmundd@eSentire.com> writes:
> Aha, I did have a dependency I missed. I did a pull from a dump:
> CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "syslog_event"
> FROM "syslog_event_message_event" NOT DEFERRABLE INITIALLY IMMEDIATE
> FOR EACH ROW EXECUTE PROCEDURE
> "RI_FKey_cascade_del" ('<unnamed>', 'syslog_event_message_event',
> 'syslog_event', 'UNSPECIFIED', 'event_id', 'event_id');

> Both columns are the same type (int8).

> => \d syslog_event_message_event
> Table "syslog_event_message_event"
>    Column   |  Type  | Modifiers
> ------------+--------+-----------
>  message_id | bigint | not null
>  event_id   | bigint | not null
> Primary key: syslog_event_message_event_pkey
> Triggers: RI_ConstraintTrigger_13220957

> The table contains no rows (previously deleted, vacuumed, and analyzed).

I can't tell from this which column is the primary key?  (Fortunately
someone improved the \d output so that that will be apparent in 7.3...)

Generally you want to be sure that indexes are available on both the
referenced and referencing columns of a foreign-key relationship...

            regards, tom lane

Re: Slow deletes

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> I considered a patch to make the parser treat all numbers as unknown, but
> that would break any code that relies on numbers-are-int4 (think function
> and operator type resolution).

Right ... if there were a simple solution we'd have done it already.

The line of thought that's been discussed in the past is to add an
UNKNOWNNUMERIC placeholder type, with similar behavior to UNKNOWN except
that it's used for something that was a numeric-looking literal rather
than a string-looking literal.  I don't recall that anyone worked out
a complete proposal --- check the archives if you want to work on this.

            regards, tom lane