Bad query plan when the wrong data type is used
От | Laszlo Nagy |
---|---|
Тема | Bad query plan when the wrong data type is used |
Дата | |
Msg-id | 4D514FFF.9060308@shopzeus.com обсуждение исходный текст |
Ответы |
Re: Bad query plan when the wrong data type is used
(Josh Berkus <josh@agliodbs.com>)
|
Список | pgsql-performance |
This query:
select p.id,p.producer_id,visa.variation_item_id, vi.qtyavail
from variation_item_sellingsite_asin visa
inner join product p on p.id = visa.product_id
inner join variation_item vi on vi.id = visa.variation_item_id
where visa.id =4
runs in 43 msec. The "visa.id" column has int4 datatype. The query plan uses an index condition:
"Nested Loop (cost=0.00..26.19 rows=1 width=28)"
" -> Nested Loop (cost=0.00..17.75 rows=1 width=24)"
" -> Index Scan using variation_item_sellingsite_asin_pkey on variation_item_sellingsite_asin visa (cost=0.00..8.58 rows=1 width=16)"
" Index Cond: (id = 4)"
" -> Index Scan using pk_product_id on product p (cost=0.00..9.16 rows=1 width=16)"
" Index Cond: (p.id = visa.product_id)"
" -> Index Scan using pk_variation_item_id on variation_item vi (cost=0.00..8.43 rows=1 width=12)"
" Index Cond: (vi.id = visa.variation_item_id)"
This query:
select p.id,p.producer_id,visa.variation_item_id, vi.qtyavail
from variation_item_sellingsite_asin visa
inner join product p on p.id = visa.product_id
inner join variation_item vi on vi.id = visa.variation_item_id
where visa.id =4.0
Runs for 1144 msec! Query plan uses seq scan + filter:
"Nested Loop (cost=33957.27..226162.68 rows=14374 width=28)"
" -> Hash Join (cost=33957.27..106190.76 rows=14374 width=20)"
" Hash Cond: (visa.variation_item_id = vi.id)"
" -> Seq Scan on variation_item_sellingsite_asin visa (cost=0.00..71928.04 rows=14374 width=16)"
" Filter: ((id)::numeric = 4.0)"
" -> Hash (cost=22026.01..22026.01 rows=954501 width=12)"
" -> Seq Scan on variation_item vi (cost=0.00..22026.01 rows=954501 width=12)"
" -> Index Scan using pk_product_id on product p (cost=0.00..8.33 rows=1 width=16)"
" Index Cond: (p.id = visa.product_id)"
Which is silly. I think that PostgreSQL converts the int side to a float, and then compares them.
It would be better to do this, for each item in the loop:
(I spent an hour figuring out what is wrong with my program. In some cases it was slow, in other cases it was really fast, and I never got an error message.)
What do you think?
Laszlo
select p.id,p.producer_id,visa.variation_item_id, vi.qtyavail
from variation_item_sellingsite_asin visa
inner join product p on p.id = visa.product_id
inner join variation_item vi on vi.id = visa.variation_item_id
where visa.id =4
runs in 43 msec. The "visa.id" column has int4 datatype. The query plan uses an index condition:
"Nested Loop (cost=0.00..26.19 rows=1 width=28)"
" -> Nested Loop (cost=0.00..17.75 rows=1 width=24)"
" -> Index Scan using variation_item_sellingsite_asin_pkey on variation_item_sellingsite_asin visa (cost=0.00..8.58 rows=1 width=16)"
" Index Cond: (id = 4)"
" -> Index Scan using pk_product_id on product p (cost=0.00..9.16 rows=1 width=16)"
" Index Cond: (p.id = visa.product_id)"
" -> Index Scan using pk_variation_item_id on variation_item vi (cost=0.00..8.43 rows=1 width=12)"
" Index Cond: (vi.id = visa.variation_item_id)"
This query:
select p.id,p.producer_id,visa.variation_item_id, vi.qtyavail
from variation_item_sellingsite_asin visa
inner join product p on p.id = visa.product_id
inner join variation_item vi on vi.id = visa.variation_item_id
where visa.id =4.0
Runs for 1144 msec! Query plan uses seq scan + filter:
"Nested Loop (cost=33957.27..226162.68 rows=14374 width=28)"
" -> Hash Join (cost=33957.27..106190.76 rows=14374 width=20)"
" Hash Cond: (visa.variation_item_id = vi.id)"
" -> Seq Scan on variation_item_sellingsite_asin visa (cost=0.00..71928.04 rows=14374 width=16)"
" Filter: ((id)::numeric = 4.0)"
" -> Hash (cost=22026.01..22026.01 rows=954501 width=12)"
" -> Seq Scan on variation_item vi (cost=0.00..22026.01 rows=954501 width=12)"
" -> Index Scan using pk_product_id on product p (cost=0.00..8.33 rows=1 width=16)"
" Index Cond: (p.id = visa.product_id)"
Which is silly. I think that PostgreSQL converts the int side to a float, and then compares them.
It would be better to do this, for each item in the loop:
- evaluate the right side (which is float)
- tell if it is an integer or not
- if not an integer, then discard the row immediately
- otherwise use its integer value for the index scan
(I spent an hour figuring out what is wrong with my program. In some cases it was slow, in other cases it was really fast, and I never got an error message.)
What do you think?
Laszlo
В списке pgsql-performance по дате отправления: