Re: plan time of MASSIVE partitioning ...

From Boszormenyi Zoltan
Subject Re: plan time of MASSIVE partitioning ...
Msg-id 4CC95E9A.5000605@cybertec.at
In response to Re: plan time of MASSIVE partitioning ...  (Boszormenyi Zoltan <zb@cybertec.at>)
Responses Re: plan time of MASSIVE partitioning ...  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Boszormenyi Zoltan wrote:
> Boszormenyi Zoltan wrote:
>> Boszormenyi Zoltan wrote:
>>> Heikki Linnakangas wrote:
>>>> On 26.10.2010 18:34, Boszormenyi Zoltan wrote:
>>>>> Thank you very much for pointing me to dynahash, here is the
>>>>> next version that finally seems to work.
>>>>>
>>>>> Two patches are attached. The first is the absolute minimum for
>>>>> making it work; it still has the Tree type for canon_pathkeys,
>>>>> and eq_classes got the same treatment as join_rel_list/join_rel_hash
>>>>> has in the current sources: if the list grows larger than 32, a hash
>>>>> table is created. It seems to be enough to do this for
>>>>>       get_eclass_for_sort_expr()
>>>>> only; the other users of eq_classes aren't bothered by this change.
>>>>
>>>> That's better, but can't you use dynahash for canon_pathkeys as well?
>>>
>>> Here's a purely dynahash solution. It's somewhat slower than
>>> the tree version, 0.45 vs 0.41 seconds in the cached case for the
>>> previously posted test case.
>>
>> And now in context diff, sorry for my affection towards unified diffs. :-)
>
> A slightly better version: there is no need for the heavy hash_any;
> hash_uint32 on the lower 32 bits of pk_eclass is enough. The profiling
> runtime is now 0.42 seconds vs the previous 0.41 seconds for the tree version.
>
> Best regards,
> Zoltán Böszörményi
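
For readers following the quoted discussion, here is a minimal standalone sketch of the two ideas mentioned there: keep a plain list until it grows past 32 entries and only then build a hash table, and hash only the lower 32 bits of a pointer-valued key. This is not the attached patch and does not use dynahash. Entry, EntrySet, set_add, set_find, hash_ptr and NBUCKETS are hypothetical names, and the integer mixer is the MurmurHash3 finalizer standing in for hash_uint32.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HASH_THRESHOLD 32   /* list length after which the hash table is built */
#define NBUCKETS 1024       /* hypothetical fixed table size (power of two) */

/* Hypothetical stand-in for a list member looked up by pointer identity. */
typedef struct Entry
{
    const void   *key;          /* the pointer itself is the lookup key */
    struct Entry *list_next;    /* insertion-order list, always maintained */
    struct Entry *bucket_next;  /* hash chain, used once the table exists */
} Entry;

typedef struct
{
    Entry  *head;
    Entry  *tail;
    size_t  count;
    Entry **buckets;            /* NULL until count exceeds HASH_THRESHOLD */
} EntrySet;

/* Mix the lower 32 bits of the pointer (assumption: MurmurHash3 finalizer,
 * used here only as a stand-in for PostgreSQL's hash_uint32). */
static uint32_t hash_ptr(const void *p)
{
    uint32_t h = (uint32_t) (uintptr_t) p;

    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

static void bucket_insert(EntrySet *s, Entry *e)
{
    uint32_t b = hash_ptr(e->key) & (NBUCKETS - 1);

    e->bucket_next = s->buckets[b];
    s->buckets[b] = e;
}

/* Append an entry; build the hash table lazily once the list gets long. */
static Entry *set_add(EntrySet *s, const void *key)
{
    Entry *e = calloc(1, sizeof(Entry));

    assert(e != NULL);
    e->key = key;
    if (s->tail != NULL)
        s->tail->list_next = e;
    else
        s->head = e;
    s->tail = e;
    s->count++;

    if (s->buckets != NULL)
        bucket_insert(s, e);
    else if (s->count > HASH_THRESHOLD)
    {
        /* First time over the threshold: index every existing entry. */
        s->buckets = calloc(NBUCKETS, sizeof(Entry *));
        assert(s->buckets != NULL);
        for (Entry *it = s->head; it != NULL; it = it->list_next)
            bucket_insert(s, it);
    }
    return e;
}

/* Use the hash table when it exists, otherwise scan the list. */
static Entry *set_find(const EntrySet *s, const void *key)
{
    if (s->buckets != NULL)
    {
        Entry *e = s->buckets[hash_ptr(key) & (NBUCKETS - 1)];

        while (e != NULL && e->key != key)
            e = e->bucket_next;
        return e;
    }
    for (Entry *e = s->head; e != NULL; e = e->list_next)
        if (e->key == key)
            return e;
    return NULL;
}
```

The list is kept even after the table is built, so callers that iterate over all entries are unaffected, which mirrors the remark that the other users of eq_classes are not bothered by the change.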

Btw, the top entries in the current gprof output are:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 19.05      0.08     0.08      482     0.17     0.29  add_child_rel_equivalences
 11.90      0.13     0.05  1133447     0.00     0.00  bms_is_subset
  9.52      0.17     0.04   331162     0.00     0.00  hash_search_with_hash_value
  7.14      0.20     0.03   548971     0.00     0.00  AllocSetAlloc
  4.76      0.22     0.02     2858     0.01     0.01  get_tabstat_entry
  4.76      0.24     0.02     1136     0.02     0.02  tzload


This means add_child_rel_equivalences() still takes
too much time. The previously posted test case calls this
function 482 times, i.e. for almost every 10th entry
added to eq_classes. The elog() I put into this function says
that at the last call list_length(eq_classes) == 4754.
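
As a rough back-of-the-envelope check of these numbers, the sketch below computes the entries added per call and a modelled total of list entries visited. The model is an assumption, not something measured here: it supposes that eq_classes grows linearly up to its final length over the 482 calls and that every call walks the whole list once. The function names are hypothetical.

```c
#include <assert.h>

/* Average number of list entries added between two calls. */
static double entries_per_call(int calls, int final_len)
{
    assert(calls > 0);
    return (double) final_len / calls;
}

/* Assumed model: at call k the list holds final_len * k / calls entries,
 * and the call scans all of them. Returns the total entries visited. */
static double estimated_entries_scanned(int calls, int final_len)
{
    double total = 0.0;

    assert(calls > 0);
    for (int k = 1; k <= calls; k++)
        total += (double) final_len * k / calls;
    return total;
}
```

With calls = 482 and final_len = 4754 this gives about 9.9 entries per call and, under the model, 4754 * 483 / 2 = 1148091 entries visited in total. That is the same order of magnitude as the 1133447 bms_is_subset calls in the profile, which is only suggestive, since the profile does not say who calls bms_is_subset.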

Best regards,
Zoltán Böszörményi

-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de    http://www.postgresql.at/


