Обсуждение: Problem with JOINS

Поиск
Список
Период
Сортировка

Problem with JOINS

От
Charlie Clark
Дата:
Dear list,

I've recently noticed that I've got a problem with query. It seems that due 
to some of the conditions I'm setting an implicit OUTER JOIN is occuring 
which is overriding another condition.

I'm enclosing two queries. The first returns a list of people in a 
particular cities, the second should essentially be similar but with 
additional criteria to be me. In practice, however, it ignores the city 
criteria as the accompanying EXPLAIN details. I assume I've got something 
just slightly wrong but I'm fairly new to EXPLAIN.

Here is Query 1

SELECT 
gender.value as anrede_value,
person.name as person_name,
person.vorname as person_vorname,
person.zusatz as person_zusatz,
person.birthdate as person_birthdate,
address.strasse as address_strasse,
address.hausnummer as address_hausnummer,
address.tel as address_tel,
address.tel_vor as address_tel_vor,
address.fax as address_fax,
address.fax_vor as address_fax_vor,
address.mobil as address_mobil,
address.mobil_vor as address_mobil_vor,
address.plz as address_plz,
address.ort as address_ort,
address.e_mail as address_e_mail,
address.www as address_homepage,
address.wheelchair as address_wheelchair,
therapist.id_person
,users.roles as service
FROM person
INNER JOIN therapist on
(person.id_person = therapist.id_person)
INNER JOIN address on
(person.id_person = address.id_person)
INNER JOIN gender_list as gender on
(person.id_gender = gender.id)
INNER JOIN users on
(users.id_person = person.id_person)
WHERE true
AND
person.id_status = 2
AND
person.id_authorise = 2
AND
ltrim(lower(address.ort)) like lower('Neuss%')
ORDER by person.name
LIMIT 100
Limit  (cost=41.37..41.38 rows=1 width=187)  ->  Sort  (cost=41.37..41.38 rows=1 width=187)        Sort Key:
person.name       ->  Nested Loop  (cost=19.68..41.36 rows=1 width=187)              Join Filter: ("inner".id_person =
"outer".id_person)             ->  Nested Loop  (cost=19.68..35.66 rows=1 width=172)                    Join Filter:
("outer".id_gender= "inner".id)                    ->  Hash Join  (cost=19.68..34.61 rows=1 width=160)
       Hash Cond: ("outer".id_person = 
 
"inner".id_person)                          ->  Hash Join  (cost=4.91..19.13 rows=141 
width=40)                                Hash Cond: ("outer".id_person = 
"inner".id_person)                                ->  Seq Scan on person  (cost=0.00..11.37 
rows=145 width=36)                                      Filter: ((id_status = 2) AND 
(id_authorise = 2))                                ->  Hash  (cost=4.53..4.53 rows=153 
width=4)                                      ->  Seq Scan on therapist  
(cost=0.00..4.53 rows=153 width=4)                          ->  Hash  (cost=14.77..14.77 rows=1 width=120)
                 ->  Seq Scan on address  (cost=0.00..14.77 
 
rows=1 width=120)                                      Filter: (ltrim(lower((ort)::text)) 
~~ 'neuss%'::text)                    ->  Seq Scan on gender_list gender  (cost=0.00..1.02 
rows=2 width=12)              ->  Index Scan using users_pkey on users  (cost=0.00..5.69 
rows=1 width=15)                    Index Cond: (users.id_person = "outer".id_person)  

Here is Query 2
SELECT 
gender.value as anrede_value,
person.name as person_name,
person.vorname as person_vorname,
person.zusatz as person_zusatz,
person.birthdate as person_birthdate,
address.strasse as address_strasse,
address.hausnummer as address_hausnummer,
address.tel as address_tel,
address.tel_vor as address_tel_vor,
address.fax as address_fax,
address.fax_vor as address_fax_vor,
address.mobil as address_mobil,
address.mobil_vor as address_mobil_vor,
address.plz as address_plz,
address.ort as address_ort,
address.e_mail as address_e_mail,
address.www as address_homepage,
address.wheelchair as address_wheelchair,
therapist.id_person
,users.roles as service
FROM person
INNER JOIN therapist on
(person.id_person = therapist.id_person)
INNER JOIN address on
(person.id_person = address.id_person)
INNER JOIN gender_list as gender on
(person.id_gender = gender.id)
INNER JOIN users on
(users.id_person = person.id_person)
INNER JOIN bill ON
(bill.id_person = person.id_person)
INNER JOIN bill_status_list AS bs ON
(bs.id = bill.id_status)
WHERE true
AND
person.id_status = 2
AND
person.id_authorise = 2
AND
ltrim(lower(address.ort)) like lower('Neuss%')
AND
bs.value = 'bezahlt' OR bs.value = 'erlassen'
AND
users.roles like '%Premium'
ORDER by person.name
LIMIT 100Limit  (cost=70.10..70.11 rows=1 width=214)  ->  Sort  (cost=70.10..70.11 rows=1 width=214)        Sort Key:
person.name       ->  Hash Join  (cost=58.91..70.09 rows=1 width=214)              Hash Cond: ("outer".id_status =
"inner".id)             Join Filter: ((("inner".value = 'erlassen'::character 
 
varying) OR ("outer".id_status = 2)) AND (("inner".value = 
'erlassen'::character varying) OR ("outer".id_authorise = 2)) AND 
(("inner".value = 'erlassen'::character varying) OR 
(ltrim(lower(("outer".ort)::text)) ~~ 'neuss%'::text)) AND (("outer".roles 
~~ '%Premium'::text) OR ("inner".value = 'bezahlt'::character varying)))              ->  Hash Join  (cost=57.80..68.97
rows=1width=199)                    Hash Cond: ("outer".id_person = "inner".id_person)                    ->  Seq Scan
onbill  (cost=0.00..10.44 rows=144 
 
width=8)                    ->  Hash  (cost=57.80..57.80 rows=1 width=191)                          ->  Hash Join
(cost=49.85..57.80rows=1 
 
width=191)                                Hash Cond: ("outer".id_person = 
"inner".id_person)                                Join Filter: ((("inner".roles ~~ 
'%Premium'::text) OR ("outer".id_status = 2)) AND (("inner".roles ~~ 
'%Premium'::text) OR ("outer".id_authorise = 2)) AND (("inner".roles ~~ 
'%Premium'::text) OR (ltrim(lower(("outer".ort)::text)) ~~ 'neuss%'::text)))                                ->  Merge
Join (cost=41.84..44.05 
 
rows=153 width=176)                                      Merge Cond: ("outer".id_gender = 
"inner".id)                                      ->  Sort  (cost=40.81..41.19 
rows=153 width=164)                                            Sort Key: person.id_gender
            ->  Hash Join  
 
(cost=18.97..35.25 rows=153 width=164)                                                  Hash Cond: 
("outer".id_person = "inner".id_person)                                                  ->  Seq Scan on address  
(cost=0.00..13.58 rows=158 width=120)                                                  ->  Hash  
(cost=18.59..18.59 rows=153 width=44)                                                        ->  Hash Join  
(cost=4.91..18.59 rows=153 width=44)                                                              Hash Cond: 
("outer".id_person = "inner".id_person)                                                              ->  Seq Scan 
on person  (cost=0.00..10.58 rows=158 width=40)                                                              ->  Hash

(cost=4.53..4.53 rows=153 width=4)                                                                    ->  
Seq Scan on therapist  (cost=0.00..4.53 rows=153 width=4)                                      ->  Sort
(cost=1.03..1.03rows=2 
 
width=12)                                            Sort Key: gender.id                                            ->
SeqScan on gender_list 
 
gender  (cost=0.00..1.02 rows=2 width=12)                                ->  Hash  (cost=7.61..7.61 rows=161 
width=15)                                      ->  Seq Scan on users  
(cost=0.00..7.61 rows=161 width=15)              ->  Hash  (cost=1.10..1.10 rows=2 width=15)                    ->  Seq
Scanon bill_status_list bs  (cost=0.00..1.10 
 
rows=2 width=15)                          Filter: ((value = 'erlassen'::character varying) 
OR (value = 'bezahlt'::character varying))                         

What I notice is that in the second query the following filter is missing.
->  Hash  (cost=14.77..14.77 rows=1 width=120)                                ->  Seq Scan on address
(cost=0.00..14.77
 
rows=1 width=120)                                      Filter: (ltrim(lower((ort)::text)) 
~~ 'neuss%'::text)

I'm going to try and break this down and work through it myself but would 
be very grateful for any pointers.

Thanks

Charlie Clark


Re: Problem with JOINS

От
Stephan Szabo
Дата:
On Fri, 21 May 2004, Charlie Clark wrote:

>  SELECT
> gender.value as anrede_value,
> person.name as person_name,
> person.vorname as person_vorname,
> person.zusatz as person_zusatz,
> person.birthdate as person_birthdate,
> address.strasse as address_strasse,
> address.hausnummer as address_hausnummer,
> address.tel as address_tel,
> address.tel_vor as address_tel_vor,
> address.fax as address_fax,
> address.fax_vor as address_fax_vor,
> address.mobil as address_mobil,
> address.mobil_vor as address_mobil_vor,
> address.plz as address_plz,
> address.ort as address_ort,
> address.e_mail as address_e_mail,
> address.www as address_homepage,
> address.wheelchair as address_wheelchair,
> therapist.id_person
> ,users.roles as service
> FROM person
> INNER JOIN therapist on
> (person.id_person = therapist.id_person)
> INNER JOIN address on
> (person.id_person = address.id_person)
> INNER JOIN gender_list as gender on
> (person.id_gender = gender.id)
> INNER JOIN users on
> (users.id_person = person.id_person)
> INNER JOIN bill ON
> (bill.id_person = person.id_person)
> INNER JOIN bill_status_list AS bs ON
> (bs.id = bill.id_status)
> WHERE true
> AND
> person.id_status = 2
> AND
> person.id_authorise = 2
> AND
> ltrim(lower(address.ort)) like lower('Neuss%')
> AND
> bs.value = 'bezahlt' OR bs.value = 'erlassen'
> AND
> users.roles like '%Premium'
> ORDER by person.name
> LIMIT 100
>
>  Limit  (cost=70.10..70.11 rows=1 width=214)
>    ->  Sort  (cost=70.10..70.11 rows=1 width=214)
>          Sort Key: person.name
>          ->  Hash Join  (cost=58.91..70.09 rows=1 width=214)
>                Hash Cond: ("outer".id_status = "inner".id)
>                Join Filter: ((("inner".value = 'erlassen'::character
> varying) OR ("outer".id_status = 2)) AND (("inner".value =
> 'erlassen'::character varying) OR ("outer".id_authorise = 2)) AND
> (("inner".value = 'erlassen'::character varying) OR
> (ltrim(lower(("outer".ort)::text)) ~~ 'neuss%'::text)) AND (("outer".roles
> ~~ '%Premium'::text) OR ("inner".value = 'bezahlt'::character varying)))
>                ->  Hash Join  (cost=57.80..68.97 rows=1 width=199)
>                      Hash Cond: ("outer".id_person = "inner".id_person)
>                      ->  Seq Scan on bill  (cost=0.00..10.44 rows=144
> width=8)
>                      ->  Hash  (cost=57.80..57.80 rows=1 width=191)
>                            ->  Hash Join  (cost=49.85..57.80 rows=1
> width=191)
>                                  Hash Cond: ("outer".id_person =
> "inner".id_person)
>                                  Join Filter: ((("inner".roles ~~
> '%Premium'::text) OR ("outer".id_status = 2)) AND (("inner".roles ~~
> '%Premium'::text) OR ("outer".id_authorise = 2)) AND (("inner".roles ~~
> '%Premium'::text) OR (ltrim(lower(("outer".ort)::text)) ~~ 'neuss%'::text)))
>                                  ->  Merge Join  (cost=41.84..44.05
> rows=153 width=176)
>                                        Merge Cond: ("outer".id_gender =
> "inner".id)
>                                        ->  Sort  (cost=40.81..41.19
> rows=153 width=164)
>                                              Sort Key: person.id_gender
>                                              ->  Hash Join
> (cost=18.97..35.25 rows=153 width=164)
>                                                    Hash Cond:
> ("outer".id_person = "inner".id_person)
>                                                    ->  Seq Scan on address
> (cost=0.00..13.58 rows=158 width=120)
>                                                    ->  Hash
> (cost=18.59..18.59 rows=153 width=44)
>                                                          ->  Hash Join
> (cost=4.91..18.59 rows=153 width=44)
>                                                                Hash Cond:
> ("outer".id_person = "inner".id_person)
>                                                                ->  Seq Scan
> on person  (cost=0.00..10.58 rows=158 width=40)
>                                                                ->  Hash
> (cost=4.53..4.53 rows=153 width=4)
>                                                                      ->
> Seq Scan on therapist  (cost=0.00..4.53 rows=153 width=4)
>                                        ->  Sort  (cost=1.03..1.03 rows=2
> width=12)
>                                              Sort Key: gender.id
>                                              ->  Seq Scan on gender_list
> gender  (cost=0.00..1.02 rows=2 width=12)
>                                  ->  Hash  (cost=7.61..7.61 rows=161
> width=15)
>                                        ->  Seq Scan on users
> (cost=0.00..7.61 rows=161 width=15)
>                ->  Hash  (cost=1.10..1.10 rows=2 width=15)
>                      ->  Seq Scan on bill_status_list bs  (cost=0.00..1.10
> rows=2 width=15)
>                            Filter: ((value = 'erlassen'::character varying)
> OR (value = 'bezahlt'::character varying))
>
> What I notice is that in the second query the following filter is missing.
> ->  Hash  (cost=14.77..14.77 rows=1 width=120)
>                                  ->  Seq Scan on address  (cost=0.00..14.77
> rows=1 width=120)
>                                        Filter: (ltrim(lower((ort)::text))
> ~~ 'neuss%'::text)

It's not missing, it's merely moved.  It's become part of the join filter
for the top hash join.

>                Join Filter: ((("inner".value = 'erlassen'::character
> varying) OR ("outer".id_status = 2)) AND (("inner".value =
> 'erlassen'::character varying) OR ("outer".id_authorise = 2)) AND
> (("inner".value = 'erlassen'::character varying) OR
> (ltrim(lower(("outer".ort)::text)) ~~ 'neuss%'::text)) AND (("outer".roles
> ~~ '%Premium'::text) OR ("inner".value = 'bezahlt'::character varying)))

This probably means that it thinks that the condition won't push down.

I think perhaps
> bs.value = 'bezahlt' OR bs.value = 'erlassen'
is meant to be (bs.value = 'bezahlt' OR bs.value='erlassen')


Re: Problem with JOINS

От
Tom Lane
Дата:
Charlie Clark <charlie@begeistert.org> writes:
> WHERE true
> AND
> person.id_status = 2
> AND
> person.id_authorise = 2
> AND
> ltrim(lower(address.ort)) like lower('Neuss%')
> AND
> bs.value = 'bezahlt' OR bs.value = 'erlassen'
> AND
> users.roles like '%Premium'

AND binds more tightly than OR --- I suspect you wanted some
parentheses.

AND
(bs.value = 'bezahlt' OR bs.value = 'erlassen')
AND

Or you could express the same thing using IN:

AND
bs.value IN ('bezahlt', 'erlassen')
AND
        regards, tom lane