Обсуждение: Two queries are better than one?

Поиск
Список
Период
Сортировка

Two queries are better than one?

От
Karim Nassar
Дата:
I ran into a situation today maintaining someone else's code where the
sum time running 2 queries seems to be faster than 1. The original code
was split into two queries. I thought about joining them, but
considering the intelligence of my predecessor, I wanted to test it.

The question is, which technique is really faster? Is there some hidden
setup cost I don't see with explain analyze?

Postgres 7.4.7, Redhat AES 3

Each query individually:

test=> explain analyze
test-> select * from order  WHERE ord_batch='343B' AND ord_id='12-645';
                                                         QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
 Index Scan using order_pkey on order  (cost=0.00..6.02 rows=1 width=486) (actual time=0.063..0.066 rows=1 loops=1)
   Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
 Total runtime: 0.172 ms
(3 rows)


test=> explain analyze
test->     select cli_name from client where cli_code='1837';
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Index Scan using client_pkey on client  (cost=0.00..5.98 rows=2 width=39) (actual time=0.043..0.047 rows=1 loops=1)
   Index Cond: (cli_code = '1837'::bpchar)
 Total runtime: 0.112 ms
(3 rows)

Joined:

test=> explain analyze
test->    SELECT cli_name,order.*
test->               FROM order
test->               JOIN client ON (ord_client = cli_code)
test->              WHERE ord_batch='343B' AND ord_id='12-645';
                                                            QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..12.00 rows=2 width=525) (actual time=0.120..0.128 rows=1 loops=1)
   ->  Index Scan using order_pkey on order  (cost=0.00..6.02 rows=1 width=486) (actual time=0.064..0.066 rows=1
loops=1)
         Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
   ->  Index Scan using client_pkey on client  (cost=0.00..5.98 rows=1 width=51) (actual time=0.023..0.026 rows=1
loops=1)
         Index Cond: ("outer".ord_client = client.cli_code)
 Total runtime: 0.328 ms
(6 rows)


--
Karim Nassar <karim.nassar@acm.org>


Re: Two queries are better than one?

От
Michael Fuhr
Дата:
On Thu, Jul 28, 2005 at 04:04:25PM -0700, Karim Nassar wrote:
> I ran into a situation today maintaining someone else's code where the
> sum time running 2 queries seems to be faster than 1. The original code
> was split into two queries. I thought about joining them, but
> considering the intelligence of my predecessor, I wanted to test it.
>
> The question is, which technique is really faster? Is there some hidden
> setup cost I don't see with explain analyze?

To see which technique will be faster in your application, time the
application code.  The queries you show are taking fractions of a
millisecond; the communications overhead of executing two queries
might make that technique significantly slower than just the server
execution time that EXPLAIN ANALYZE shows.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Two queries are better than one?

От
John A Meinel
Дата:
Karim Nassar wrote:
> I ran into a situation today maintaining someone else's code where the
> sum time running 2 queries seems to be faster than 1. The original code
> was split into two queries. I thought about joining them, but
> considering the intelligence of my predecessor, I wanted to test it.
>
> The question is, which technique is really faster? Is there some hidden
> setup cost I don't see with explain analyze?

Yes, the time it takes your user code to parse the result, and create
the new query. :)

It does seem like you are taking an extra 0.1ms for the combined query,
but that means you don't have another round trip to the database. So
that would mean one less context switch, and you don't need to know what
the cli_code is before you can get the cli_name.

I would guess the overhead is the time for postgres to parse out the
text, place another index query, and then combine the rows. It seems
like this shouldn't take 0.1ms, but then again, that isn't very long.

Also, did you run it *lots* of times to make sure that this isn't just
noise?

John
=:->

>
> Postgres 7.4.7, Redhat AES 3
>
> Each query individually:
>
> test=> explain analyze
> test-> select * from order  WHERE ord_batch='343B' AND ord_id='12-645';
>                                                          QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------
>  Index Scan using order_pkey on order  (cost=0.00..6.02 rows=1 width=486) (actual time=0.063..0.066 rows=1 loops=1)
>    Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
>  Total runtime: 0.172 ms
> (3 rows)
>
>
> test=> explain analyze
> test->     select cli_name from client where cli_code='1837';
>                                                      QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------
>  Index Scan using client_pkey on client  (cost=0.00..5.98 rows=2 width=39) (actual time=0.043..0.047 rows=1 loops=1)
>    Index Cond: (cli_code = '1837'::bpchar)
>  Total runtime: 0.112 ms
> (3 rows)
>
> Joined:
>
> test=> explain analyze
> test->    SELECT cli_name,order.*
> test->               FROM order
> test->               JOIN client ON (ord_client = cli_code)
> test->              WHERE ord_batch='343B' AND ord_id='12-645';
>                                                             QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..12.00 rows=2 width=525) (actual time=0.120..0.128 rows=1 loops=1)
>    ->  Index Scan using order_pkey on order  (cost=0.00..6.02 rows=1 width=486) (actual time=0.064..0.066 rows=1
loops=1)
>          Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
>    ->  Index Scan using client_pkey on client  (cost=0.00..5.98 rows=1 width=51) (actual time=0.023..0.026 rows=1
loops=1)
>          Index Cond: ("outer".ord_client = client.cli_code)
>  Total runtime: 0.328 ms
> (6 rows)
>
>


Вложения

Re: Two queries are better than one?

От
Karim Nassar
Дата:
On Thu, 2005-07-28 at 21:01 -0500, John A Meinel wrote:
>
> Also, did you run it *lots* of times to make sure that this isn't just
> noise?

If a dozen is lots, yes. :-)

It was very consistent as I repeatedly ran it.

--
Karim Nassar <karim.nassar@acm.org>


Re: Two queries are better than one?

От
Gnanavel S
Дата:


On 7/29/05, Karim Nassar <karim.nassar@acm.org> wrote:
I ran into a situation today maintaining someone else's code where the
sum time running 2 queries seems to be faster than 1. The original code
was split into two queries. I thought about joining them, but
considering the intelligence of my predecessor, I wanted to test it.

The question is, which technique is really faster? Is there some hidden
setup cost I don't see with explain analyze?

Postgres 7.4.7, Redhat AES 3

Each query individually:

test=> explain analyze
test-> select * from order  WHERE ord_batch='343B' AND ord_id='12-645';
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using order_pkey on order  (cost=0.00..6.02 rows=1 width=486) (actual time= 0.063..0.066 rows=1 loops=1)
   Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
Total runtime: 0.172 ms
(3 rows)


test=> explain analyze
test->     select cli_name from client where cli_code='1837';
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Scan using client_pkey on client  (cost=0.00..5.98 rows=2 width=39) (actual time= 0.043..0.047 rows=1 loops=1)
   Index Cond: (cli_code = '1837'::bpchar)
Total runtime: 0.112 ms
(3 rows)

Joined:

test=> explain analyze
test->    SELECT cli_name,order.*
test->               FROM order
test->               JOIN client ON (ord_client = cli_code)
test->              WHERE ord_batch='343B' AND ord_id='12-645';

where is the cli_code condition in the above query?
 

                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=0.00..12.00 rows=2 width=525) (actual time=0.120..0.128 rows=1 loops=1)
   ->  Index Scan using order_pkey on order  (cost=0.00..6.02 rows=1 width=486) (actual time=0.064..0.066 rows=1 loops=1)
         Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
   ->  Index Scan using client_pkey on client  (cost=0.00..5.98 rows=1 width=51) (actual time=0.023..0.026 rows=1 loops=1)
         Index Cond: ("outer".ord_client = client.cli_code)
Total runtime: 0.328 ms
(6 rows)


--
Karim Nassar <karim.nassar@acm.org >


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend



--
with regards,
S.Gnanavel
Satyam Computer Services Ltd.

Re: Two queries are better than one?

От
Karim Nassar
Дата:
On Fri, 2005-07-29 at 09:41 +0530, Gnanavel S wrote:

>
>         Joined:
>
>         test=> explain analyze
>         test->    SELECT cli_name,order.*
>         test->               FROM order
>         test->               JOIN client ON (ord_client = cli_code)
>         test->              WHERE ord_batch='343B' AND
>         ord_id='12-645';
>
> where is the cli_code condition in the above query?

I don't understand the question. ord_client is the client code, and
cli_code is the client code, for their respective tables. batch/id is
unique, so there is only one record from order, and only one client to
associate.

Clearer?

--
Karim Nassar <karim.nassar@acm.org>


Re: Two queries are better than one?

От
Gnanavel S
Дата:


On 7/29/05, Karim Nassar <karim.nassar@acm.org> wrote:
On Fri, 2005-07-29 at 09:41 +0530, Gnanavel S wrote:

>
>         Joined:
>
>         test=> explain analyze
>         test->    SELECT cli_name,order.*
>         test->               FROM order
>         test->               JOIN client ON (ord_client = cli_code)
>         test->              WHERE ord_batch='343B' AND
>         ord_id='12-645';
>
> where is the cli_code condition in the above query?

I don't understand the question. ord_client is the client code, and
cli_code is the client code, for their respective tables. batch/id is
unique, so there is only one record from order, and only one client to
associate.

Clearer?

ok.

 Reason might be comparing with a literal value (previous case) is cheaper than comparing with column(as it has to be evaluated).  But with the previous case getting and assigning the cli_code in the application and executing in db will be time consuming as it includes IPC cost.

--
Karim Nassar <karim.nassar@acm.org >




--
with regards,
S.Gnanavel
Satyam Computer Services Ltd.