Discussion: [PERFORM] anti-join with small table via text/varchar cannot estimate rows correctly
From
Stefan Andreatta
Date:
Hello,
I have encountered a strange problem when doing an anti-join with a very small table via a varchar or text field as opposed to an integer field. Postgres version is 9.5.3
I did some experiments to extract the problem in a simple form. First generate two tables with a series of numbers - once as integers, once as text. The first table has 10,000 rows, the second just one:
=# select generate_series(1, 10000) as id, generate_series(1,10000)::text as text
     into table tmp_san_1;
SELECT 10000
=# select generate_series(1, 1) as id, generate_series(1,1)::text as text
     into table tmp_san_2;
SELECT 1
=# analyze tmp_san_1;
ANALYZE
=# analyze tmp_san_2;
ANALYZE

=# \d tmp_san_*
   Table "public.tmp_san_1"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 text   | text    |

   Table "public.tmp_san_2"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 text   | text    |

Now I do an anti-join between the two tables via the id field (integer). The number of resulting rows is estimated correctly as 9,999:
=# explain analyze
     select tmp_san_1.id
     from tmp_san_1
     left join tmp_san_2 on tmp_san_1.id = tmp_san_2.id
     where tmp_san_2.id is null;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Hash Anti Join  (cost=1.02..281.26 rows=9999 width=4) (actual time=0.019..2.743 rows=9999 loops=1)
   Hash Cond: (tmp_san_1.id = tmp_san_2.id)
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=10000 width=4) (actual time=0.007..1.023 rows=10000 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
 Planning time: 0.138 ms
 Execution time: 3.218 ms
(8 rows)

The same anti-join using the text fields, however, estimates just 1 resulting row, while of course there are still 9,999 of them:
=# explain analyze
     select tmp_san_1.id
     from tmp_san_1
     left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
     where tmp_san_2.id is null;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=1.02..192.53 rows=1 width=4) (actual time=0.020..3.091 rows=9999 loops=1)
   Hash Cond: (tmp_san_1.text = tmp_san_2.text)
   Filter: (tmp_san_2.id IS NULL)
   Rows Removed by Filter: 1
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=10000 width=8) (actual time=0.008..0.983 rows=10000 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=6) (actual time=0.004..0.004 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=6) (actual time=0.002..0.002 rows=1 loops=1)
 Planning time: 0.173 ms
 Execution time: 3.546 ms
(10 rows)

I cannot explain that behavior, much less think of a fix or workaround. Unfortunately my real-world case has to use varchar for the join.
Thanks for any help,
Stefan
Re: [PERFORM] anti-join with small table via text/varchar cannot estimate rows correctly
From
"David G. Johnston"
Date:
> explain analyze
> select tmp_san_1.id
> from tmp_san_1
> left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
> where tmp_san_2.id is null;
Does it help if you check for "tmp_san_2.text is null"?
David J.
On Wed, Mar 1, 2017 at 2:12 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
> explain analyze select tmp_san_1.id from tmp_san_1 left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text where tmp_san_2.id is null;
>
> Does it help if you check for "tmp_san_2.text is null"?
Yes. And if you swap it so that the left join is on the integer while the IS NULL check is on the text, that also gets poorly estimated. Also, if you make both columns of both tables integers, the same thing happens: you get bad estimates when the join condition refers to one column and the WHERE clause refers to the other. I don't know why the estimate is poor, but it is not related to the types of the columns, rather to their identities.
Cheers,
Jeff
Re: [PERFORM] anti-join with small table via text/varchar cannot estimate rows correctly
From
"David G. Johnston"
Date:
> On Wed, Mar 1, 2017 at 2:12 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
>> explain analyze select tmp_san_1.id from tmp_san_1 left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text where tmp_san_2.id is null;
>>
>> Does it help if you check for "tmp_san_2.text is null"?
>
> Yes. And if you swap it so that the left join is on the integer while the IS NULL check is on the text, that also gets poorly estimated. Also, if you make both columns of both tables integers, the same thing happens: you get bad estimates when the join condition refers to one column and the WHERE clause refers to the other. I don't know why the estimate is poor, but it is not related to the types of the columns, rather to their identities.
I suspect it has to do with the lack of a NOT NULL constraint on either column causing the planner to disregard the potential to implement the LEFT JOIN using anti-join semantics - or, also possible, the form itself is invalid regardless of the presence or absence of constraints. IIUC, while true anti-join syntax doesn't exist, the canonical form for one uses NOT EXISTS - which would force the author to use only the correct column pair.
David J.
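[Editor's note: the canonical NOT EXISTS form mentioned above can be sketched as follows. This runs against an in-memory SQLite database rather than PostgreSQL, so it only demonstrates which rows come back, not the planner's estimates; the table names and contents follow Stefan's example.]

```python
import sqlite3

# Rebuild Stefan's two tables in SQLite (a stand-in for PostgreSQL here:
# it shows the returned rows, not the planner's row estimates).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tmp_san_1 (id INTEGER, text TEXT)")
cur.execute("CREATE TABLE tmp_san_2 (id INTEGER, text TEXT)")
cur.executemany("INSERT INTO tmp_san_1 VALUES (?, ?)",
                [(i, str(i)) for i in range(1, 10001)])
cur.execute("INSERT INTO tmp_san_2 VALUES (1, '1')")

# The canonical anti-join: NOT EXISTS correlates on the join column
# itself, so there is no "wrong" column to test for NULL.
anti = cur.execute("""
    SELECT t1.id
    FROM tmp_san_1 AS t1
    WHERE NOT EXISTS (SELECT 1 FROM tmp_san_2 AS t2
                      WHERE t2.text = t1.text)
""").fetchall()
print(len(anti))  # 9999: only id = 1 is filtered out
```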
Re: [PERFORM] anti-join with small table via text/varchar cannot estimate rows correctly
From
Tom Lane
Date:
Stefan Andreatta <s.andreatta@synedra.com> writes:
> The same anti-join using the text fields, however estimates just 1
> resulting row, while there are still of course 9,999 of them:
> =# explain analyze
>      select tmp_san_1.id
>      from tmp_san_1
>      left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
>      where tmp_san_2.id is null;

That is not an anti-join.  To make it one, you have to constrain the RHS
join column to be IS NULL, not some random other column.  Note the join
type isn't getting shown as Anti:

> Hash Left Join  (cost=1.02..192.53 rows=1 width=4) (actual time=0.020..3.091 rows=9999 loops=1)

As written, the query could return some rows that weren't actually
antijoin rows, ie tmp_san_1.text *did* have a match in tmp_san_2,
but that row chanced to have a null value of id.

Possibly the planner could be smarter about estimating for this case,
but it doesn't look much like a typical use-case to me.

			regards, tom lane
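[Editor's note: Tom's point - that the original query is not a true anti-join because a matched row could still have a NULL id - can be made concrete with a small sketch. This uses an in-memory SQLite database as a stand-in for PostgreSQL, so it demonstrates only the returned rows, not the estimates; the tmp_san_2 row with a NULL id is added for illustration.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tmp_san_1 (id INTEGER, text TEXT)")
cur.execute("CREATE TABLE tmp_san_2 (id INTEGER, text TEXT)")
cur.executemany("INSERT INTO tmp_san_1 VALUES (?, ?)",
                [(i, str(i)) for i in range(1, 10001)])
# The row Tom describes: its text matches tmp_san_1, but its id is NULL.
cur.execute("INSERT INTO tmp_san_2 VALUES (NULL, '1')")

# Filtering on the *other* column (id) lets the matched row through:
rows_id_filter = cur.execute("""
    SELECT t1.id FROM tmp_san_1 t1
    LEFT JOIN tmp_san_2 t2 ON t1.text = t2.text
    WHERE t2.id IS NULL
""").fetchall()

# A true anti-join filters on the join column itself:
rows_anti = cur.execute("""
    SELECT t1.id FROM tmp_san_1 t1
    LEFT JOIN tmp_san_2 t2 ON t1.text = t2.text
    WHERE t2.text IS NULL
""").fetchall()

print(len(rows_id_filter), len(rows_anti))  # 10000 9999
```

The first query returns id = 1 even though its text has a match, which is exactly why the planner cannot treat it as an anti-join.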
Re: [PERFORM] anti-join with small table via text/varchar cannot estimate rows correctly
From
Stefan Andreatta
Date:
On 02.03.2017 02:06, Tom Lane wrote:
> Stefan Andreatta <s.andreatta@synedra.com> writes:
>> The same anti-join using the text fields, however estimates just 1
>> resulting row, while there are still of course 9,999 of them:
>> =# explain analyze
>>      select tmp_san_1.id
>>      from tmp_san_1
>>      left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
>>      where tmp_san_2.id is null;
> That is not an anti-join.  To make it one, you have to constrain the RHS
> join column to be IS NULL, not some random other column.  Note the join
> type isn't getting shown as Anti:
>
>> Hash Left Join  (cost=1.02..192.53 rows=1 width=4) (actual time=0.020..3.091 rows=9999 loops=1)
> As written, the query could return some rows that weren't actually
> antijoin rows, ie tmp_san_1.text *did* have a match in tmp_san_2,
> but that row chanced to have a null value of id.
>
> Possibly the planner could be smarter about estimating for this case,
> but it doesn't look much like a typical use-case to me.
>
> 			regards, tom lane

Thanks a lot! Right, my problem had nothing to do with the type of the join field, but with the selection of the proper field for the NULL condition.
So even a join on the id field is badly estimated if the check is on the text field:

=# EXPLAIN ANALYZE
     SELECT tmp_san_1.id
     FROM tmp_san_1
     LEFT JOIN tmp_san_2 ON tmp_san_1.id = tmp_san_2.id
     WHERE (tmp_san_2.text IS NULL);
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=1.02..192.53 rows=1 width=4) (actual time=0.019..2.939 rows=9999 loops=1)
   Hash Cond: (tmp_san_1.id = tmp_san_2.id)
   Filter: (tmp_san_2.text IS NULL)
   Rows Removed by Filter: 1
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=10000 width=4) (actual time=0.007..1.003 rows=10000 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=6) (actual time=0.004..0.004 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=6) (actual time=0.001..0.002 rows=1 loops=1)
 Planning time: 0.062 ms
 Execution time: 3.381 ms
(10 rows)

... but if the join and the check refer to the same field, everything is fine:

=# EXPLAIN ANALYZE
     SELECT tmp_san_1.id
     FROM tmp_san_1
     LEFT JOIN tmp_san_2 ON tmp_san_1.id = tmp_san_2.id
     WHERE (tmp_san_2.id IS NULL);
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Hash Anti Join  (cost=1.02..281.26 rows=9999 width=4) (actual time=0.018..2.672 rows=9999 loops=1)
   Hash Cond: (tmp_san_1.id = tmp_san_2.id)
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=10000 width=4) (actual time=0.007..0.962 rows=10000 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
 Planning time: 0.051 ms
 Execution time: 3.164 ms
(8 rows)

It gets more interesting again if the text field really could be NULL and I wanted to include those rows.
If I just include "OR tmp_san_2.text IS NULL", estimates are off again:

=# EXPLAIN ANALYZE
     SELECT tmp_san_1.id
     FROM tmp_san_1
     LEFT JOIN tmp_san_2 ON tmp_san_1.id = tmp_san_2.id
     WHERE (tmp_san_2.id IS NULL OR tmp_san_2.text IS NULL);
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=1.02..192.53 rows=1 width=4) (actual time=0.019..2.984 rows=9999 loops=1)
   Hash Cond: (tmp_san_1.id = tmp_san_2.id)
   Filter: ((tmp_san_2.id IS NULL) OR (tmp_san_2.text IS NULL))
   Rows Removed by Filter: 1
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=10000 width=4) (actual time=0.008..1.024 rows=10000 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=6) (actual time=0.004..0.004 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=6) (actual time=0.001..0.002 rows=1 loops=1)
 Planning time: 0.088 ms
 Execution time: 3.508 ms
(10 rows)

Instead, it seems, I have to move this condition (inverted) into the join clause for the planner to make correct estimates again:

=# EXPLAIN ANALYZE
     SELECT tmp_san_1.id
     FROM tmp_san_1
     LEFT JOIN tmp_san_2 ON tmp_san_1.id = tmp_san_2.id
                        AND tmp_san_2.text IS NOT NULL
     WHERE (tmp_san_2.id IS NULL);
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Hash Anti Join  (cost=1.02..281.26 rows=9999 width=4) (actual time=0.017..2.761 rows=9999 loops=1)
   Hash Cond: (tmp_san_1.id = tmp_san_2.id)
   ->  Seq Scan on tmp_san_1  (cost=0.00..154.00 rows=10000 width=4) (actual time=0.007..1.052 rows=10000 loops=1)
   ->  Hash  (cost=1.01..1.01 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tmp_san_2  (cost=0.00..1.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
               Filter: (text IS NOT NULL)
 Planning time: 0.058 ms
 Execution time: 3.232 ms
(9 rows)

So, yes, the planner could infer a bit
more here - after all, if only few rows are present to start with, only few rows can meet any condition. But that may well be an unusual case. It's just easy to be puzzled by these things once you are used to the PostgreSQL planner being very smart in most cases ;-)

Thanks again,
Stefan
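[Editor's note: the rewrite Stefan arrives at above - moving the inverted condition into the join clause - can be checked for result equivalence with a small sketch. Again this runs against in-memory SQLite, so it verifies only the returned rows, not the estimates; a second tmp_san_2 row with a NULL text is added for illustration, so that the extra condition actually matters.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tmp_san_1 (id INTEGER, text TEXT)")
cur.execute("CREATE TABLE tmp_san_2 (id INTEGER, text TEXT)")
cur.executemany("INSERT INTO tmp_san_1 VALUES (?, ?)",
                [(i, str(i)) for i in range(1, 10001)])
# One match with a real text, one match whose text is NULL:
cur.executemany("INSERT INTO tmp_san_2 VALUES (?, ?)", [(1, "1"), (2, None)])

# Variant 1: OR-condition in the WHERE clause (badly estimated in the thread).
where_or = cur.execute("""
    SELECT t1.id FROM tmp_san_1 t1
    LEFT JOIN tmp_san_2 t2 ON t1.id = t2.id
    WHERE t2.id IS NULL OR t2.text IS NULL
""").fetchall()

# Variant 2: inverted condition moved into the join clause
# (recognized as an anti-join in the thread).
join_clause = cur.execute("""
    SELECT t1.id FROM tmp_san_1 t1
    LEFT JOIN tmp_san_2 t2 ON t1.id = t2.id AND t2.text IS NOT NULL
    WHERE t2.id IS NULL
""").fetchall()

print(len(where_or), len(join_clause))  # 9999 9999: only id = 1 is excluded
```

Both variants return the same row set; only the second shape lets the planner see a true anti-join and estimate it correctly.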