Re: Parallel CREATE INDEX for GIN indexes

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: Parallel CREATE INDEX for GIN indexes
Дата
Msg-id 87o777jrcc.fsf@163.com
обсуждение исходный текст
Ответ на Parallel CREATE INDEX for GIN indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список pgsql-hackers
Andy Fan <zhihuifan1213@163.com> writes:

I just realize all my replies is replied to sender only recently,
probably because I upgraded the email cient and the short-cut changed
sliently, resent the lastest one only.... 

>>> Suppose RBTree's output is:
>>> 
>>> batch-1 at RBTree:
>>> 1  [tid1, tid8, tid100]
>>> 2  [tid1, tid9, tid800]
>>> ...
>>> 78 [tid23, tid99, tid800]
>>> 
>>> batch-2 at RBTree
>>> 1  [tid1001, tid1203, tid1991]
>>> ...
>>> ...
>>> 97 [tid1023, tid1099, tid1800]
>>> 
>>> Since all the tuples in each batch (1, 2, .. 78) are sorted already, we
>>> can just flush them into tuplesort as a 'run' *without any sorts*,
>>> however within this way, it is possible to produce more 'runs' than what
>>> you did in your patch.
>>> 
>>
>> Oh! Now I think I understand what you were proposing - you're saying
>> that when dumping the RBTree to the tuplesort, we could tell the
>> tuplesort this batch of tuples is already sorted, and tuplesort might
>> skip some of the work when doing the sort later.
>>
>> I guess that's true, and maybe it'd be useful elsewhere, I still think
>> this could be left as a future improvement. Allowing it seems far from
>> trivial, and it's not quite clear if it'd be a win (it might interfere
>> with the existing sort code in unexpected ways).
>
> Yes, and I agree that can be done later and I'm thinking Matthias's
> proposal is more promising now. 
>
>>> new way: the No. of batch depends on size of RBTree's batch size.
>>> existing way: the No. of batch depends on size of work_mem in tuplesort.
>>> Usually the new way would cause more no. of runs which is harmful for
>>> mergeruns.  so I can't say it is an improve of not and not include it in
>>> my previous patch. 
>>> 
>>> however case 1 sounds a good canidiates for this method.
>>> 
>>> Tuples from state->bs_worker_state after the perform_sort and ctid
>>> merge: 
>>> 
>>> 1 [tid1, tid8, tid100, tid1001, tid1203, tid1991]
>>> 2 [tid1, tid9, tid800]
>>> 78 [tid23, tid99, tid800]
>>> 97 [tid1023, tid1099, tid1800]
>>> 
>>> then when we move tuples to bs_sort_state, a). we don't need to sort at
>>> all. b). we can merge all of them into 1 run which is good for mergerun
>>> on leader as well. That's the thing I did in the previous patch.
>>> 
>>
>> I'm sorry, I don't understand what you're proposing. Could you maybe
>> elaborate in more detail?
>
> After we called "tuplesort_performsort(state->bs_worker_sort);" in
> _gin_process_worker_data, all the tuples in bs_worker_sort are sorted
> already, and in the same function _gin_process_worker_data, we have
> code:
>
> while ((tup = tuplesort_getgintuple(worker_sort, &tuplen, true)) != NULL)
> {
>
>   ....(1)
>   
>   tuplesort_putgintuple(state->bs_sortstate, ntup, ntuplen);
>
> }
>
> and later we called 'tuplesort_performsort(state->bs_sortstate);'.  Even
> we have some CTID merges activity in  '....(1)', the tuples are still
> ordered, so the sort (in both tuplesort_putgintuple and
> 'tuplesort_performsort) are not necessary, what's more, in the each of
> 'flush-memory-to-disk' in tuplesort, it create a 'sorted-run', and in
> this case, acutally we only need 1 run only since all the input tuples
> in the worker is sorted. The reduction of 'sort-runs' in worker will be
> helpful to leader's final mergeruns.  the 'sorted-run' benefit doesn't
> exist for the case-1 (RBTree -> worker_state). 
>
> If Matthias's proposal is adopted, my optimization will not be useful
> anymore and Matthias's porposal looks like a more natural and effecient
> way.

-- 
Best Regards
Andy Fan




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Built-in CTYPE provider
Следующее
От: jian he
Дата:
Сообщение: Re: Doc Rework: Section 9.16.13 SQL/JSON Query Functions