Обсуждение: Some question

Поиск
Список
Период
Сортировка

Some question

От
Ľubomír Varga
Дата:
Hi, stright to my "problem":

explain
SELECT * FROM t_route
    WHERE t_route.route_type_fk = 1
    limit 4;

"Limit  (cost=0.00..0.88 rows=4 width=2640)"
"  ->  Seq Scan on t_route  (cost=0.00..118115.25 rows=538301 width=2640)"
"        Filter: (route_type_fk = 1)"



If I try to select constant 1 from table with two rows, it will be something
like this:

explain
SELECT * FROM t_route
    WHERE t_route.route_type_fk = (SELECT id FROM t_route_type WHERE type = 2)
    limit 4;

"Limit  (cost=1.02..1.91 rows=4 width=2640)"
"  InitPlan"
"    ->  Seq Scan on t_route_type  (cost=0.00..1.02 rows=1 width=8)"
"          Filter: ("type" = 2)"
"  ->  Seq Scan on t_route  (cost=0.00..118115.25 rows=535090 width=2640)"
"        Filter: (route_type_fk = $0)"



First query is done in about milicesonds. Second is longer than 60 seconds.
t_route is bigger table (~10M rows).

I think that it seq scans whole table. Is it bug? If no, how can I achieve
that second select will not take time to end of world...

Have a nice day and thanks for any reply.

--
Odborník na všetko je zlý odborník. Ja sa snažím byť výnimkou potvrdzujúcou
pravidlo.

Re: Some question

От
Scott Marlowe
Дата:
2010/3/31 Ľubomír Varga <luvar@plaintext.sk>:
> Hi, stright to my "problem":
> If I try to select constant 1 from table with two rows, it will be something
> like this:
>
> explain
> SELECT * FROM t_route
>        WHERE t_route.route_type_fk = (SELECT id FROM t_route_type WHERE type = 2)
>        limit 4;
>
> "Limit  (cost=1.02..1.91 rows=4 width=2640)"
> "  InitPlan"
> "    ->  Seq Scan on t_route_type  (cost=0.00..1.02 rows=1 width=8)"
> "          Filter: ("type" = 2)"
> "  ->  Seq Scan on t_route  (cost=0.00..118115.25 rows=535090 width=2640)"
> "        Filter: (route_type_fk = $0)"
>

Looking at this it looks like you're using prepared queries, which
can't make as good of a decision as regular queries because the values
are opaque to the planner.

Can you provide us with the output of explain analyze of that query?

Re: Some question

От
"Kevin Grittner"
Дата:
*ubomír Varga<luvar@plaintext.sk> wrote:

> Hi, stright to my "problem":

Please show the exact problem query and the results of running it
with EXPLAIN ANALYZE, along with the other information suggested
here:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

-Kevin

Re: Some question

От
Yeb Havinga
Дата:
Scott Marlowe wrote:
> 2010/3/31 Ľubomír Varga <luvar@plaintext.sk>:
>
>> Hi, stright to my "problem":
>> If I try to select constant 1 from table with two rows, it will be something
>> like this:
>>
>> explain
>> SELECT * FROM t_route
>>        WHERE t_route.route_type_fk = (SELECT id FROM t_route_type WHERE type = 2)
>>        limit 4;
>>
>> "Limit  (cost=1.02..1.91 rows=4 width=2640)"
>> "  InitPlan"
>> "    ->  Seq Scan on t_route_type  (cost=0.00..1.02 rows=1 width=8)"
>> "          Filter: ("type" = 2)"
>> "  ->  Seq Scan on t_route  (cost=0.00..118115.25 rows=535090 width=2640)"
>> "        Filter: (route_type_fk = $0)"
>>
>>
>
> Looking at this it looks like you're using prepared queries, which
> can't make as good of a decision as regular queries because the values
> are opaque to the planner.
>
> Can you provide us with the output of explain analyze of that query?
>
ISTM that the initplan 'outputs' id as $0, so it is not a prepared
query. Maybe EXPLAIN VERBOSE ANALYZE of the query reveals that better.
But both plans show seqscans of the large table, so it is surprising
that the performance is different, if the filter expression uses the
same values. Are you sure the output SELECT id FROM t_route_type WHERE
type = 2 is equal to 1?

regards,
Yeb Havinga


Re: Some question

От
"Kevin Grittner"
Дата:
Ľubomír Varga wrote:

> SELECT * FROM t_route
>   WHERE t_route.route_type_fk = 1
>   limit 4;

This one scanned the t_route table until it found four rows that
matched.  It apparently didn't need to look at very many rows to find
the four matches, so it was fast.

> SELECT * FROM t_route
>   WHERE t_route.route_type_fk =
>     (SELECT id FROM t_route_type WHERE type = 2)
>   limit 4;

This one came up with an id for a route type that didn't have any
matches in the t_route table, so it had to scan the entire t_route
table.  (Based on your next query, the subquery probably returned
NULL, so there might be room for some optimization here.)  If you had
chosen a route type with at least four matches near the  start of the
route table, this query would have completed quickly.

> SELECT * FROM t_route, t_route_type
>   WHERE t_route.route_type_fk = t_route_type.id
>     AND type = 2
>   limit 4;

Since it didn't find any t_route_type row which matched, it knew
there couldn't be any output from the JOIN, so it skipped the scan of
the t_route table entirely.

-Kevin



Re: Some question

От
Ľubomír Varga
Дата:
Hi, here are they:

----------------------------------------------------
select * from t_route_type;

ID;description;type
1;"stojim";0
2;"idem";1



----------------------------------------------------
explain analyze
SELECT * FROM t_route
    WHERE t_route.route_type_fk = 1
    limit 4;

"Limit  (cost=0.00..0.88 rows=4 width=2640) (actual time=23.352..23.360 rows=4
loops=1)"
"  ->  Seq Scan on t_route  (cost=0.00..120497.00 rows=549155 width=2640)
(actual time=23.350..23.354 rows=4 loops=1)"
"        Filter: (route_type_fk = 1)"
"Total runtime: 23.404 ms"



----------------------------------------------------
explain analyze
SELECT * FROM t_route
    WHERE t_route.route_type_fk = (SELECT id FROM t_route_type WHERE type = 2)
    limit 4;

"Limit  (cost=1.02..1.91 rows=4 width=2640) (actual
time=267243.019..267243.019 rows=0 loops=1)"
"  InitPlan"
"    ->  Seq Scan on t_route_type  (cost=0.00..1.02 rows=1 width=8) (actual
time=0.006..0.006 rows=0 loops=1)"
"          Filter: ("type" = 2)"
"  ->  Seq Scan on t_route  (cost=0.00..120498.12 rows=545885 width=2640)
(actual time=267243.017..267243.017 rows=0 loops=1)"
"        Filter: (route_type_fk = $0)"
"Total runtime: 267243.089 ms"



----------------------------------------------------
explain analyze
SELECT * FROM t_route, t_route_type
    WHERE t_route.route_type_fk = t_route_type.id AND
        type = 2
    limit 4;

"Limit  (cost=0.00..0.96 rows=4 width=2661) (actual time=0.013..0.013 rows=0
loops=1)"
"  ->  Nested Loop  (cost=0.00..131415.62 rows=545880 width=2661) (actual
time=0.012..0.012 rows=0 loops=1)"
"        Join Filter: (t_route.route_type_fk = t_route_type.id)"
"        ->  Seq Scan on t_route_type  (cost=0.00..1.02 rows=1 width=21)
(actual time=0.011..0.011 rows=0 loops=1)"
"              Filter: ("type" = 2)"
"        ->  Seq Scan on t_route  (cost=0.00..117767.60 rows=1091760
width=2640) (never executed)"
"Total runtime: 0.054 ms"



So I found solution. It is third select, where is used join instead of inner
select to get ID for some constant from t_route_type.

t_route is table with routes taken by some car. It have same strings as
columns and one geometry column with line of travelled path. Type of route is
in t_route_type and it could be "travelling" and "standing" type. In my
select I want to select some routes which are type "travelling" (type = 1, id
= 2). It is only sample select.

Please explain me why second query had taken so long to finish.

Have a nice day.


On Wednesday 07 April 2010 00:11:48 Kevin Grittner wrote:
> *ubomír Varga<luvar@plaintext.sk> wrote:
> > Hi, stright to my "problem":
>
> Please show the exact problem query and the results of running it
> with EXPLAIN ANALYZE, along with the other information suggested
> here:
>
> http://wiki.postgresql.org/wiki/SlowQueryQuestions
>
> -Kevin

--
Odborník na všetko je zlý odborník. Ja sa snažím byť výnimkou potvrdzujúcou
pravidlo.