Обсуждение: Re: v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION
Re: v7.2.3 versus v7.3 -> huge performance penalty for JOIN with UNION
От
"ir. F.T.M. van Vugt bc."
Дата:
(Should probably be in [SQL] by now....)
I've changed my table declarations to agree on the datatypes and only one
simular problem with an update-query doesn't seem to be solved.
(see plan below)
* the concatenation in the lbar select can't be avoided, it's just the way the
data is => this does result in a resulting type 'text', AFAIK
* the aux_address.old_id is also of type 'text'
Still, the planner does a nested loop here against large costs... ;(
Any hints on this (last) one....?
TIA,
Frank.
trial=# explain update address set region_id = lbar.region_id from
(select debtor_id || '-' || address_seqnr as id, region_id from
list_base_regions) as lbar, aux_address aa
where lbar.id = aa.old_id and address.id = aa.id;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Merge Join (cost=1.07..65.50 rows=3 width=253)
Merge Cond: ("outer".id = "inner".id)
-> Nested Loop (cost=0.00..643707.03 rows=3980 width=28)
Join Filter: (((("inner".debtor_id)::text || '-'::text) ||
("inner".address_seqnr)::text) = "outer".old_id)
-> Index Scan using aux_address_idx2 on aux_address aa
(cost=0.00..81.88 rows=3989 width=16)
-> Seq Scan on list_base_regions (cost=0.00..71.80 rows=3980
width=12)
-> Sort (cost=1.07..1.08 rows=3 width=225)
Sort Key: address.id
-> Seq Scan on address (cost=0.00..1.05 rows=3 width=225)
Filter: ((id = 1) IS NOT TRUE)
(10 rows)
"ir. F.T.M. van Vugt bc." <ftm.van.vugt@foxi.nl> writes:
> Any hints on this (last) one....?
> -> Nested Loop (cost=0.00..643707.03 rows=3980 width=28)
> Join Filter: (((("inner".debtor_id)::text || '-'::text) ||
> ("inner".address_seqnr)::text) = "outer".old_id)
Looks to me like debtor_id and address_seqnr are not text type, but are
being compared to things that are text. Hard to tell exactly what's
going on though --- I suppose this query is getting rewritten by a rule?
regards, tom lane
> > Any hints on this (last) one....?
> > -> Nested Loop (cost=0.00..643707.03 rows=3980 width=28)
> > Join Filter: (((("inner".debtor_id)::text || '-'::text) ||
> > ("inner".address_seqnr)::text) = "outer".old_id)
>
> Looks to me like debtor_id and address_seqnr are not text type, but are
> being compared to things that are text.
They were coerced, yes, but changing those original types helps only so much:
* lbar.debtor_id is of type text
* lbar.address_seqnr is of type text
* aa.old_id is of type text
trial=# explain update address set region_id = lbar.region_id from
(select debtor_id || '-' || address_seqnr as f_id, region_id from
list_base_regions) as lbar, aux_address aa
where lbar.f_id = aa.old_id and address.id = aa.id;
Since the left side of the join clause is composed out of three concatenated
text-parts resulting in one single piece of type text, I'd expect the planner
to avoid the nested loop. Still:
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Merge Join (cost=1.07..16.07 rows=1 width=309)
Merge Cond: ("outer".id = "inner".id)
-> Nested Loop (cost=0.00..149669.38 rows=1000 width=84)
Join Filter: ((("inner".debitor_id || '-'::text) ||
"inner".address_seqnr) = "outer".old_id)
-> Index Scan using aux_address_idx2 on aux_address aa
(cost=0.00..81.88 rows=3989 width=16)
-> Seq Scan on list_base_regions (cost=0.00..20.00 rows=1000
width=68)
-> Sort (cost=1.07..1.08 rows=3 width=225)
Sort Key: address.id
-> Seq Scan on address (cost=0.00..1.05 rows=3 width=225)
Filter: ((id = 1) IS NOT TRUE)
(10 rows)
> Hard to tell exactly what's going on though
Does this help?
NB: it seems the data types part of the manual doesn't enlighten me on this
subject, any suggestions where to find more input?
Regards,
Frank.
"Frank van Vugt" <ftm.van.vugt@foxi.nl> writes:
> Since the left side of the join clause is composed out of three concatenated
> text-parts resulting in one single piece of type text, I'd expect the
> planner to avoid the nested loop.
Probably not, since the first thing it does is to flatten the
sub-select, leaving it with a concatenation expression in the
WHERE-clause. (I was too sleepy last night to realize that you
were comparing a concatenation to old_id, rather than making two
separate comparisons :-()
We really need to fix the planner to be able to do merge/hash on
"arbitrary expression = arbitrary expression", not only "Var = Var".
IIRC, this is doable in principle, but there are a few routines that
would need to be improved.
regards, tom lane
> Probably not, since the first thing it does is to flatten the > sub-select, leaving it with a concatenation expression in the > WHERE-clause. Ah, I see. So, I'll just split this thingy into two seperate queries, starting with creating a temp table containing the straight subselect results. > We really need to fix the planner to be able to do merge/hash on > "arbitrary expression = arbitrary expression", not only "Var = Var". I can get around it, so I'm not complaining ;-) Tom, thanks a *lot* for the prompt responses !! Best, Frank.