Effect of caching hash bucket size while costing

Поиск

Список

Период

Сортировка

От	Srinivas Karthik V
Тема	Effect of caching hash bucket size while costing
Дата	8 декабря 2016 г. 08:53:51
Msg-id	CAEfuzeSWU-V8WqnQ-uyAv85dgqvjCxoTOAoF46LEE+etTz2+rw@mail.gmail.com обсуждение исходный текст
Ответы	Re: [HACKERS] Effect of caching hash bucket size while costing
Список	pgsql-hackers

Дерево обсуждения

Dear PostgreSQL Hackers,

I am working in PostgreSQL 9.4.* optimizer module. In costsize.c file and final_cost_hashjoin() function, the innerbucketsize is either:

a) calculated using a cached copy

b) calculated afresh using statistics captured by the following code snippet:
thisbucketsize = estimate_hash_bucketsize(root, get_leftop(restrictinfo->clause),virtualbuckets);

For the query I used, if I disable the caching for calculating the innerbucketsize, I get a different plan with cost change of around 1000 units.

1) Can you please let me know if innerbucketsize*innerpathrows captures the maximum bucket size?
2) why is it not calculated afresh all the time?

For reference, below is the query I am using:

explain select i_item_id, avg(cs_quantity) , avg(cs_list_price) , avg(cs_coupon_amt) , avg(cs_sales_price) from catalog_sales, customer_demographics, date_dim, item, promotion where cs_sold_date_sk = d_date_sk and cs_item_sk = i_item_sk and cs_bill_cdemo_sk = cd_demo_sk and cs_promo_sk = p_promo_sk and cd_gender = 'F' and cd_marital_status = 'U' and cd_education_status = 'Unknown' and (p_channel_email = 'N' or p_channel_event = 'N') and d_year = 2002 and i_current_price <= 100 group by i_item_id order by i_item_id

and the hashclause which was tried was (item.i_item_sk = catalog_sales.cs_item_sk).

Thanks,

Srinivas Karthik

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Effect of caching hash bucket size while costing