Обсуждение: Query speed problems

Поиск
Список
Период
Сортировка

Query speed problems

От
Victor Danilchenko
Дата:
    Hi,

    In the process of developing an API for web/perl/postrgres
interactions, I have come up against a peculiar problem; a rather simple
query, run on two relatively small tables, takes as much as 0.4 seconds
on my development system (it's a P2 266, which in this case is a good
thing, as it exposes speed issues). I tried accomplishging the same
thing via subqueries and joins, and both methods give me similarly bad
result (join query is a little slower, but only a little).

    The queries I have tested are as follows:

SELECT DISTINCT maker.* FROM maker,model WHERE maker.id=model.maker
SELECT DISTINCT maker.* FROM maker join model ON maker.id=model.maker

    The point of the queries is to extract only the maker rows which
are referenced from the model table. I would happily use another way to
achieve the same end, should anyone suggest it.

    "maker" has only 137 rows, "model" only 1233 rows. I test the
performance in perl, by taking time right before and after query
execution. Executing the queries takes anywhere between .3 and .5
seconds, depending on some other factors (removing the 'distinct'
keyword from the 1st query shaves about .1 second off of the execution
time for example).

    These execution times seem ridiculous. Any idea what the culprit
may be? I hope it's not the text fields, 'cuz those fields are
important.

    Both tables are quite simple:

# \d maker
              Table "public.maker"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 id         | character varying(4)  | not null
 fullname   | character varying(20) |
 contact    | character varying(20) |
 phone      | character varying(15) |
 service_no | character varying(20) |
 lastuser   | character varying(30) |
 comments   | text                  |
Indexes: maker_pkey primary key btree (id)
Triggers: RI_ConstraintTrigger_18881,
          RI_ConstraintTrigger_18882

# \d model
                                Table "public.model"
    Column     |         Type          |                  Modifiers
---------------+-----------------------+---------------------------------------------
 id            | integer               | not null default nextval('model_ids'::text)
 name          | character varying(20) | not null
 maker         | character varying(4)  |
 type_hardware | character varying(4)  |
 fullname      | character varying(40) |
 spec          | character varying(50) |
 lastuser      | character varying(30) |
 comments      | text                  |
 size_cap      | character varying(10) |
Indexes: model_pkey primary key btree (id),
         unique_model unique btree (name, maker, type_hardware)
Check constraints: "nonempty_fullname" (fullname > ''::character varying)
Foreign Key constraints: valid_maker FOREIGN KEY (maker) REFERENCES \
                           maker(id) ON UPDATE NO ACTION ON DELETE NO ACTION,
                         valid_type FOREIGN KEY (type_hardware)
REFERENCES type_hardware(id) ON UPDATE NO ACTION ON DELETE NO ACTION

--
|  Victor  Danilchenko  | Any sufficiently advanced       |
| danilche@cs.umass.edu | technology is indistinguishable |
|   CSCF   |   5-4231   | from a Perl script.             |


Re: Query speed problems

От
Victor Danilchenko
Дата:
    Sorry, I forgot to specify software versions.

    I am running RHL 8.0 (Linux kernel 2.4.18), and postgres 7.3.

--
|  Victor  Danilchenko  +------------------------------------+
| danilche@cs.umass.edu | I don't have to outrun the bear -- |
|   CSCF   |   5-4231   |     I just have to outrun YOU!     |


Re: Query speed problems

От
Stephan Szabo
Дата:
On Thu, 17 Apr 2003, Victor Danilchenko wrote:

>     The queries I have tested are as follows:
>
> SELECT DISTINCT maker.* FROM maker,model WHERE maker.id=model.maker
> SELECT DISTINCT maker.* FROM maker join model ON maker.id=model.maker
>
>     The point of the queries is to extract only the maker rows which
> are referenced from the model table. I would happily use another way to
> achieve the same end, should anyone suggest it.

What does explain analyze show for the query?

>     "maker" has only 137 rows, "model" only 1233 rows. I test the
> performance in perl, by taking time right before and after query
> execution. Executing the queries takes anywhere between .3 and .5
> seconds, depending on some other factors (removing the 'distinct'
> keyword from the 1st query shaves about .1 second off of the execution
> time for example).

>     Column     |         Type          |                  Modifiers
> ---------------+-----------------------+---------------------------------------------
>  id            | integer               | not null default nextval('model_ids'::text)
>  name          | character varying(20) | not null
>  maker         | character varying(4)  |
>  type_hardware | character varying(4)  |
>  fullname      | character varying(40) |
>  spec          | character varying(50) |
>  lastuser      | character varying(30) |
>  comments      | text                  |
>  size_cap      | character varying(10) |
> Indexes: model_pkey primary key btree (id),
>          unique_model unique btree (name, maker, type_hardware)
> Check constraints: "nonempty_fullname" (fullname > ''::character varying)
> Foreign Key constraints: valid_maker FOREIGN KEY (maker) REFERENCES \
>                            maker(id) ON UPDATE NO ACTION ON DELETE NO ACTION,
>                          valid_type FOREIGN KEY (type_hardware)
> REFERENCES type_hardware(id) ON UPDATE NO ACTION ON DELETE NO ACTION

Hmm, it doesn't look to me like model.maker=<value> type queries are
indexable with this set of things.  An index on model(maker) might help.


Re: Query speed problems

От
"Peter Darley"
Дата:
Victor,
    I'm not sure, but I think an exists might be faster for you.  It wouldn't
have to deal with the Cartesian product of the tables.

SELECT DISTINCT maker.* FROM maker WHERE exists (SELECT 1 FROM model WHERE
model.maker=maker.id);

Thanks,
Peter Darley

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Victor
Danilchenko
Sent: Thursday, April 17, 2003 12:17 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Query speed problems


    Hi,

    In the process of developing an API for web/perl/postrgres
interactions, I have come up against a peculiar problem; a rather simple
query, run on two relatively small tables, takes as much as 0.4 seconds
on my development system (it's a P2 266, which in this case is a good
thing, as it exposes speed issues). I tried accomplishging the same
thing via subqueries and joins, and both methods give me similarly bad
result (join query is a little slower, but only a little).

    The queries I have tested are as follows:

SELECT DISTINCT maker.* FROM maker,model WHERE maker.id=model.maker
SELECT DISTINCT maker.* FROM maker join model ON maker.id=model.maker

    The point of the queries is to extract only the maker rows which
are referenced from the model table. I would happily use another way to
achieve the same end, should anyone suggest it.

    "maker" has only 137 rows, "model" only 1233 rows. I test the
performance in perl, by taking time right before and after query
execution. Executing the queries takes anywhere between .3 and .5
seconds, depending on some other factors (removing the 'distinct'
keyword from the 1st query shaves about .1 second off of the execution
time for example).

    These execution times seem ridiculous. Any idea what the culprit
may be? I hope it's not the text fields, 'cuz those fields are
important.

    Both tables are quite simple:

# \d maker
              Table "public.maker"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 id         | character varying(4)  | not null
 fullname   | character varying(20) |
 contact    | character varying(20) |
 phone      | character varying(15) |
 service_no | character varying(20) |
 lastuser   | character varying(30) |
 comments   | text                  |
Indexes: maker_pkey primary key btree (id)
Triggers: RI_ConstraintTrigger_18881,
          RI_ConstraintTrigger_18882

# \d model
                                Table "public.model"
    Column     |         Type          |                  Modifiers
---------------+-----------------------+------------------------------------
---------
 id            | integer               | not null default
nextval('model_ids'::text)
 name          | character varying(20) | not null
 maker         | character varying(4)  |
 type_hardware | character varying(4)  |
 fullname      | character varying(40) |
 spec          | character varying(50) |
 lastuser      | character varying(30) |
 comments      | text                  |
 size_cap      | character varying(10) |
Indexes: model_pkey primary key btree (id),
         unique_model unique btree (name, maker, type_hardware)
Check constraints: "nonempty_fullname" (fullname > ''::character varying)
Foreign Key constraints: valid_maker FOREIGN KEY (maker) REFERENCES \
                           maker(id) ON UPDATE NO ACTION ON DELETE NO
ACTION,
                         valid_type FOREIGN KEY (type_hardware)
REFERENCES type_hardware(id) ON UPDATE NO ACTION ON DELETE NO ACTION

--
|  Victor  Danilchenko  | Any sufficiently advanced       |
| danilche@cs.umass.edu | technology is indistinguishable |
|   CSCF   |   5-4231   | from a Perl script.             |


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)


Re: Query speed problems

От
Victor Danilchenko
Дата:
On Thu, 17 Apr 2003, Peter Darley wrote:

>Victor,
>    I'm not sure, but I think an exists might be faster for you.  It wouldn't
>have to deal with the Cartesian product of the tables.
>
>SELECT DISTINCT maker.* FROM maker WHERE exists (SELECT 1 FROM model WHERE
>model.maker=maker.id);

    That was indeed significantly faster. *very* significantly
faster.

    As you may guess, I am an SQL newbie, and working my way through
the language. I figured there would be a faster way to do what I was
doing, but sunqueries or joins was the only way I could figure out.

    Again, thanks for the helpful reply, and for your promptness. I
still want to figure out why the subquery version was taking so damned
long, but it's nice to have a working fast solution.

>-----Original Message-----
>From: pgsql-performance-owner@postgresql.org
>[mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Victor
>Danilchenko
>Sent: Thursday, April 17, 2003 12:17 PM
>To: pgsql-performance@postgresql.org
>Subject: [PERFORM] Query speed problems
>
>
>    Hi,
>
>    In the process of developing an API for web/perl/postrgres
>interactions, I have come up against a peculiar problem; a rather simple
>query, run on two relatively small tables, takes as much as 0.4 seconds
>on my development system (it's a P2 266, which in this case is a good
>thing, as it exposes speed issues). I tried accomplishging the same
>thing via subqueries and joins, and both methods give me similarly bad
>result (join query is a little slower, but only a little).
>
>    The queries I have tested are as follows:
>
>SELECT DISTINCT maker.* FROM maker,model WHERE maker.id=model.maker
>SELECT DISTINCT maker.* FROM maker join model ON maker.id=model.maker
>
>    The point of the queries is to extract only the maker rows which
>are referenced from the model table. I would happily use another way to
>achieve the same end, should anyone suggest it.
>
>    "maker" has only 137 rows, "model" only 1233 rows. I test the
>performance in perl, by taking time right before and after query
>execution. Executing the queries takes anywhere between .3 and .5
>seconds, depending on some other factors (removing the 'distinct'
>keyword from the 1st query shaves about .1 second off of the execution
>time for example).
>
>    These execution times seem ridiculous. Any idea what the culprit
>may be? I hope it's not the text fields, 'cuz those fields are
>important.
>
>    Both tables are quite simple:
>
># \d maker
>              Table "public.maker"
>   Column   |         Type          | Modifiers
>------------+-----------------------+-----------
> id         | character varying(4)  | not null
> fullname   | character varying(20) |
> contact    | character varying(20) |
> phone      | character varying(15) |
> service_no | character varying(20) |
> lastuser   | character varying(30) |
> comments   | text                  |
>Indexes: maker_pkey primary key btree (id)
>Triggers: RI_ConstraintTrigger_18881,
>          RI_ConstraintTrigger_18882
>
># \d model
>                                Table "public.model"
>    Column     |         Type          |                  Modifiers
>---------------+-----------------------+------------------------------------
>---------
> id            | integer               | not null default
>nextval('model_ids'::text)
> name          | character varying(20) | not null
> maker         | character varying(4)  |
> type_hardware | character varying(4)  |
> fullname      | character varying(40) |
> spec          | character varying(50) |
> lastuser      | character varying(30) |
> comments      | text                  |
> size_cap      | character varying(10) |
>Indexes: model_pkey primary key btree (id),
>         unique_model unique btree (name, maker, type_hardware)
>Check constraints: "nonempty_fullname" (fullname > ''::character varying)
>Foreign Key constraints: valid_maker FOREIGN KEY (maker) REFERENCES \
>                           maker(id) ON UPDATE NO ACTION ON DELETE NO
>ACTION,
>                         valid_type FOREIGN KEY (type_hardware)
>REFERENCES type_hardware(id) ON UPDATE NO ACTION ON DELETE NO ACTION
>
>--
>|  Victor  Danilchenko  | Any sufficiently advanced       |
>| danilche@cs.umass.edu | technology is indistinguishable |
>|   CSCF   |   5-4231   | from a Perl script.             |
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>

--
|  Victor  Danilchenko  | Curiosity was framed;     |
| danilche@cs.umass.edu | Ignorance killed the cat. |
|   CSCF   |   5-4231   |              -- Anonymous |


Re: Query speed problems

От
Victor Danilchenko
Дата:
On Thu, 17 Apr 2003, Stephan Szabo wrote:

>
>On Thu, 17 Apr 2003, Victor Danilchenko wrote:
>
>>     The queries I have tested are as follows:
>>
>> SELECT DISTINCT maker.* FROM maker,model WHERE maker.id=model.maker
>> SELECT DISTINCT maker.* FROM maker join model ON maker.id=model.maker
>>
>>     The point of the queries is to extract only the maker rows which
>> are referenced from the model table. I would happily use another way to
>> achieve the same end, should anyone suggest it.
>
>What does explain analyze show for the query?

# explain analyze SELECT DISTINCT * FROM maker WHERE id=model.maker;
NOTICE:  Adding missing FROM-clause entry for table "model"
                                                                QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=230.58..255.24 rows=123 width=171) (actual time=238.20..293.21 rows=128 loops=1)
   ->  Sort  (cost=230.58..233.66 rows=1233 width=171) (actual time=238.19..241.07 rows=1233 loops=1)
         Sort Key: maker.id, maker.fullname, maker.contact, maker.phone, maker.service_no, maker.lastuser,
maker.comments
         ->  Merge Join  (cost=0.00..167.28 rows=1233 width=171) (actual time=0.27..81.49 rows=1233 loops=1)
               Merge Cond: ("outer".id = "inner".maker)
               ->  Index Scan using maker_pkey on maker (cost=0.00..52.00 rows=1000 width=164) (actual time=0.11..4.29
rows=137loops=1) 
               ->  Index Scan using makers on model  (cost=0.00..94.28 rows=1233 width=7) (actual time=0.04..27.34
rows=1233loops=1) 
 Total runtime: 295.30 msec
(8 rows)

    Following a suggestion sent in private mail, I have created an
index for model.maker column:

# create index model_maker on model(maker);

    but that doesn't seem to have made an appreciable difference in
performance -- it's only about .05 seconds more than the above number if
I drop the index.

    Many thanks for your help.

>>     "maker" has only 137 rows, "model" only 1233 rows. I test the
>> performance in perl, by taking time right before and after query
>> execution. Executing the queries takes anywhere between .3 and .5
>> seconds, depending on some other factors (removing the 'distinct'
>> keyword from the 1st query shaves about .1 second off of the execution
>> time for example).
>
>>     Column     |         Type          |                  Modifiers
>> ---------------+-----------------------+---------------------------------------------
>>  id            | integer               | not null default nextval('model_ids'::text)
>>  name          | character varying(20) | not null
>>  maker         | character varying(4)  |
>>  type_hardware | character varying(4)  |
>>  fullname      | character varying(40) |
>>  spec          | character varying(50) |
>>  lastuser      | character varying(30) |
>>  comments      | text                  |
>>  size_cap      | character varying(10) |
>> Indexes: model_pkey primary key btree (id),
>>          unique_model unique btree (name, maker, type_hardware)
>> Check constraints: "nonempty_fullname" (fullname > ''::character varying)
>> Foreign Key constraints: valid_maker FOREIGN KEY (maker) REFERENCES \
>>                            maker(id) ON UPDATE NO ACTION ON DELETE NO ACTION,
>>                          valid_type FOREIGN KEY (type_hardware)
>> REFERENCES type_hardware(id) ON UPDATE NO ACTION ON DELETE NO ACTION
>
>Hmm, it doesn't look to me like model.maker=<value> type queries are
>indexable with this set of things.  An index on model(maker) might help.
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 4: Don't 'kill -9' the postmaster
>

--
|  Victor  Danilchenko  | Curiosity was framed;     |
| danilche@cs.umass.edu | Ignorance killed the cat. |
|   CSCF   |   5-4231   |              -- Anonymous |


Re: Query speed problems

От
Stephan Szabo
Дата:
On Thu, 17 Apr 2003, Victor Danilchenko wrote:

>  Unique (cost=230.58..255.24 rows=123 width=171) (actual
> time=238.20..293.21 rows=128 loops=1)
>    -> Sort (cost=230.58..233.66 rows=1233 width=171) (actual
> time=238.19..241.07 rows=1233 loops=1)
>          Sort Key: maker.id, maker.fullname, maker.contact,
> maker.phone, maker.service_no, maker.lastuser, maker.comments
>          -> Merge Join (cost=0.00..167.28 rows=1233 width=171) (actual
> time=0.27..81.49 rows=1233 loops=1)
>                Merge Cond: ("outer".id = "inner".maker)
>                -> Index Scan using maker_pkey on maker
> (cost=0.00..52.00 rows=1000 width=164) (actual time=0.11..4.29
> rows=137 loops=1)
>                -> Index Scan using makers on model (cost=0.00..94.28
> rows=1233 width=7) (actual time=0.04..27.34 rows=1233 loops=1)
>  Total runtime: 295.30 msec
> (8 rows)

Hmm, well, for this version, it looks like most of the time is probably
going into the sort.  I wonder if raising sort_mem would help this version
of the query (try a set sort_mem=8192; before running the query).  This
isn't likely to get the time below like 160 msec though.

>     Following a suggestion sent in private mail, I have created an
> index for model.maker column:
>
> # create index model_maker on model(maker);
>
>     but that doesn't seem to have made an appreciable difference in
> performance -- it's only about .05 seconds more than the above number if
> I drop the index.
Yeah, it looks like it's already using an index, but I didn't see that
index in the list of indexes on the table in the original mail, wierd.


Re: Query speed problems

От
"Nikolaus Dilger"
Дата:
Victor,

What is the issue?  You get sub second response time.
Why waste your time trying to make it faster?
If you have a query that runs serveral minutes or hours
then its worthwhile tuning.  Or if your query gets
executed several thausend times a day.

Regards,
Nikolaus

On Thu, 17 Apr 2003 15:17:01 -0400 (EDT), Victor
Danilchenko wrote:

>
>     Hi,
>
>     In the process of developing an API for
> web/perl/postrgres
> interactions, I have come up against a peculiar
> problem; a rather simple
> query, run on two relatively small tables, takes as
> much as 0.4 seconds
> on my development system (it's a P2 266, which in this
> case is a good
> thing, as it exposes speed issues). I tried
> accomplishging the same
> thing via subqueries and joins, and both methods give
> me similarly bad
> result (join query is a little slower, but only a
> little).
>
>     The queries I have tested are as follows:
>
> SELECT DISTINCT maker.* FROM maker,model WHERE
> maker.id=model.maker
> SELECT DISTINCT maker.* FROM maker join model ON
> maker.id=model.maker
>
>     The point of the queries is to extract only the maker
> rows which
> are referenced from the model table. I would happily
> use another way to
> achieve the same end, should anyone suggest it.
>
>     "maker" has only 137 rows, "model" only 1233 rows. I
> test the
> performance in perl, by taking time right before and
> after query
> execution. Executing the queries takes anywhere
between
> .3 and .5
> seconds, depending on some other factors (removing the
> 'distinct'
> keyword from the 1st query shaves about .1 second off
> of the execution
> time for example).
>
>     These execution times seem ridiculous. Any idea what
> the culprit
> may be? I hope it's not the text fields, 'cuz those
> fields are
> important.
>
>     Both tables are quite simple:
>
> # \d maker
>               Table "public.maker"
>    Column   |         Type          | Modifiers
> ------------+-----------------------+-----------
>  id         | character varying(4)  | not null
>  fullname   | character varying(20) |
>  contact    | character varying(20) |
>  phone      | character varying(15) |
>  service_no | character varying(20) |
>  lastuser   | character varying(30) |
>  comments   | text                  |
> Indexes: maker_pkey primary key btree (id)
> Triggers: RI_ConstraintTrigger_18881,
>           RI_ConstraintTrigger_18882
>
> # \d model
>                                 Table "public.model"
>     Column     |         Type          |

>   Modifiers
>
---------------+-----------------------+---------------------------------------------
>  id            | integer               | not null
> default nextval('model_ids'::text)
>  name          | character varying(20) | not null
>  maker         | character varying(4)  |
>  type_hardware | character varying(4)  |
>  fullname      | character varying(40) |
>  spec          | character varying(50) |
>  lastuser      | character varying(30) |
>  comments      | text                  |
>  size_cap      | character varying(10) |
> Indexes: model_pkey primary key btree (id),
>          unique_model unique btree (name, maker,
> type_hardware)
> Check constraints: "nonempty_fullname" (fullname >
> ''::character varying)
> Foreign Key constraints: valid_maker FOREIGN KEY
> (maker) REFERENCES \
>                            maker(id) ON UPDATE NO
> ACTION ON DELETE NO ACTION,
>                          valid_type FOREIGN KEY
> (type_hardware)
> REFERENCES type_hardware(id) ON UPDATE NO ACTION ON
> DELETE NO ACTION
>
> --
> |  Victor  Danilchenko  | Any sufficiently advanced

>   |
> | danilche@cs.umass.edu | technology is
> indistinguishable |
> |   CSCF   |   5-4231   | from a Perl script.

>   |
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 2: you can get off all lists at once with the
> unregister command
>     (send "unregister YourEmailAddressHere" to
> majordomo@postgresql.org)


Re: Query speed problems

От
Victor Danilchenko
Дата:
On Thu, 17 Apr 2003, Nikolaus Dilger wrote:

>Victor,
>
>What is the issue?  You get sub second response time.

    The issue is that the query is a part of *user interface*, as I
wrote in my original message; and there is a small number of such
queries (about 3) that run per each user action. A second-long wait in
*UI* is unacceptable -- people tend to find even third-of-a-second wait
to be annoying. UI interactions should be so fast as to appear nearly
instant.

>Why waste your time trying to make it faster?

    Well, there's also the learning aspect of it -- this is my first
major SQL project, and I am trying to understand as much as I can about
under-the-surface stuff. Thanks to Peter Darley, I already have a fast
solution -- now I simply want to understand more about the performance
issues inherent in reverse-lookup subqueries.

>If you have a query that runs serveral minutes or hours
>then its worthwhile tuning.  Or if your query gets
>executed several thausend times a day.

--
|  Victor  Danilchenko  | Of course my password is the same as    |
| danilche@cs.umass.edu | my pet's name. My macaw's name was      |
|   CSCF   |   5-4231   | Q47pY!3, but I change it every 90 days. |