Обсуждение: master/detail: master rows without details

Поиск
Список
Период
Сортировка

master/detail: master rows without details

От
Thomas Guettler
Дата:
Hi,

I have a two tables: master and detail.
I search all master rows without a detail row.

master: 10000 rows
detail: 29800 rows

Although all three solution have the same result, The execution time
differs very much.

My naive  first solution was quite slow. Why is it so slow?
I guess (select d.master_id from detail as d) gets executed for every
master-row. But why? Shouldn't
it be possible to calculate it once and then reuse it?

Has someone a better statement?

############################################ NOT IN

SELECT "master"."id" FROM "master" WHERE master.id not in (select
d.master_id from detail as d);
                                                                 QUERY
PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on master  (cost=782.68..2661482.52 rows=5132 width=16)
(actual time=2520.509..340387.326 rows=43 loops=1)
   Filter: (NOT (subplan))
   SubPlan
     ->  Materialize  (cost=782.68..1226.57 rows=29789 width=4) (actual
time=0.005..16.696 rows=9482 loops=10269)
           ->  Seq Scan on detail d  (cost=0.00..606.89 rows=29789
width=4) (actual time=0.009..52.536 rows=29793 loops=1)
 Total runtime: 340387.898 ms
(6 Zeilen)

############################################ JOIN

SELECT "master"."id" FROM "master" LEFT OUTER JOIN "detail" ON
("master"."id" = "detail"."master_id") WHERE "detail"."id" IS NULL
;

QUERY
PLAN


---------------------------------------------------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=0.00..3061.08 rows=14894 width=16) (actual
time=107.521..153.840 rows=43 loops=1)
   Merge Cond: (master.id = detail.master_id)
   Filter: (detail.id IS NULL)
   ->  Index Scan using master_pkey on master  (cost=0.00..486.50
rows=10265 width=16) (actual time=0.024..20.519 rows=10269 loops=1)
   ->  Index Scan using detail_master_id on detail  (cost=0.00..2176.55
rows=29789 width=8) (actual time=0.014..59.256 rows=29793 loops=1)
 Total runtime: 153.974 ms
(6 Zeilen)

############################################ NOT EXISTS

SELECT "master"."id" FROM master WHERE NOT EXISTS (SELECT 1 FROM detail
AS d WHERE d.master_id=master.id);

 Seq Scan on master  (cost=0.00..27278.09 rows=5132 width=16) (actual
time=0.327..61.911 rows=43 loops=1)
   Filter: (NOT (subplan))
   SubPlan
     ->  Index Scan using detail_master_id on detail d
(cost=0.00..50.16 rows=19 width=0) (actual time=0.004..0.004 rows=1
loops=10269)
           Index Cond: (master_id = $0)
 Total runtime: 62.028 ms
(6 Zeilen)

--
Thomas Guettler, http://www.thomas-guettler.de/
E-Mail: guettli (*) thomas-guettler + de


Re: master/detail: master rows without details

От
Peter Eisentraut
Дата:
Thomas Guettler wrote:
> My naive  first solution was quite slow. Why is it so slow?
> I guess (select d.master_id from detail as d) gets executed for every
> master-row. But why? Shouldn't
> it be possible to calculate it once and then reuse it?

Please show exact schema dumps and your PostgreSQL version.


Re: master/detail: master rows without details

От
"Merlin Moncure"
Дата:
On Tue, Dec 9, 2008 at 6:56 AM, Thomas Guettler <hv@tbz-pariv.de> wrote:
> Hi,
>
> I have a two tables: master and detail.
> I search all master rows without a detail row.
>
> master: 10000 rows
> detail: 29800 rows
>
> Although all three solution have the same result, The execution time
> differs very much.

post your postgresql version.
btw, 8.4 is going to be much smarter about converting these types of
plans.  in the meantime, just use joins wherever possible.

merlin

Re: master/detail: master rows without details

От
Thomas Guettler
Дата:
Peter Eisentraut <peter_e@gmx.net>Peter Eisentraut schrieb:
> Thomas Guettler wrote:
>> My naive  first solution was quite slow. Why is it so slow?
>> I guess (select d.master_id from detail as d) gets executed for every
>> master-row. But why? Shouldn't
>> it be possible to calculate it once and then reuse it?
>
> Please show exact schema dumps and your PostgreSQL version.
>
>
Version:
PostgreSQL 8.2.6 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.2.1 (SUSE Linux)

schema dump of both tables:

historytransaction is the master und historystatement is the detail table.

--
-- PostgreSQL database dump
--

SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;

SET search_path = public, pg_catalog;

SET default_tablespace = '';

SET default_with_oids = false;

CREATE TABLE foo_historytransaction (
    id integer NOT NULL,
    date timestamp with time zone NOT NULL,
    changedby_id integer NOT NULL
);


ALTER TABLE public.foo_historytransaction OWNER TO user;

CREATE SEQUENCE foo_historytransaction_id_seq
    INCREMENT BY 1
    NO MAXVALUE
    NO MINVALUE
    CACHE 1;


ALTER TABLE public.foo_historytransaction_id_seq OWNER TO user;
ALTER SEQUENCE foo_historytransaction_id_seq OWNED BY
foo_historytransaction.id;
ALTER TABLE foo_historytransaction ALTER COLUMN id SET DEFAULT
nextval('foo_historytransaction_id_seq'::regclass);
ALTER TABLE ONLY foo_historytransaction
    ADD CONSTRAINT foo_historytransaction_pkey PRIMARY KEY (id);
CREATE INDEX foo_historytransaction_changedby_id ON
foo_historytransaction USING btree (changedby_id);
ALTER TABLE ONLY foo_historytransaction
    ADD CONSTRAINT foo_historytransaction_changedby_id_fkey FOREIGN KEY
(changedby_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED;
CREATE TABLE foo_historystatement (
    id integer NOT NULL,
    transaction_id integer NOT NULL,
    beleg_id integer NOT NULL,
    operation character varying(7) NOT NULL,
    tablename character varying(40),
    message character varying(1024)
);


ALTER TABLE public.foo_historystatement OWNER TO user;

CREATE SEQUENCE foo_historystatement_id_seq
    INCREMENT BY 1
    NO MAXVALUE
    NO MINVALUE
    CACHE 1;


ALTER TABLE public.foo_historystatement_id_seq OWNER TO user;

ALTER SEQUENCE foo_historystatement_id_seq OWNED BY foo_historystatement.id;

ALTER TABLE foo_historystatement ALTER COLUMN id SET DEFAULT
nextval('foo_historystatement_id_seq'::regclass);

ALTER TABLE ONLY foo_historystatement
    ADD CONSTRAINT foo_historystatement_pkey PRIMARY KEY (id);

CREATE INDEX foo_historystatement_beleg_id ON foo_historystatement USING
btree (beleg_id);
CREATE INDEX foo_historystatement_transaction_id ON foo_historystatement
USING btree (transaction_id);

ALTER TABLE ONLY foo_historystatement
    ADD CONSTRAINT foo_historystatement_transaction_id_fkey FOREIGN KEY
(transaction_id) REFERENCES foo_historytransaction(id) DEFERRABLE
INITIALLY DEFERRED;






--
Thomas Guettler, http://www.thomas-guettler.de/
E-Mail: guettli (*) thomas-guettler + de