Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
От | Ronan Dunklau |
---|---|
Тема | Re: [PATCH] Use optimized single-datum tuplesort in ExecSort |
Дата | |
Msg-id | 4385594.q5DUe7zQD7@aivenronan обсуждение исходный текст |
Ответ на | Re: [PATCH] Use optimized single-datum tuplesort in ExecSort (Dilip Kumar <dilipbalaut@gmail.com>) |
Ответы |
Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
|
Список | pgsql-hackers |
Thank you for the review, I will address those shortly, but will answer some questions in the meantime. > > First, the changes are lacking any explanatory comments. Probably we > > should follow how nodeAgg does this and add both comments to the > > ExecSort function header as well as specific comments above the "if" > > around the new tuplesort_begin_datum explaining the specific > > conditions that are required for the optimization to be useful and > > safe. Done, since I lifted the restrictions following your questions, there isn't much left to comment. (see below) > > > > That leads to a question I had: I don't follow why bounded mode (when > > using byval) needs to be excluded. Comments should be added if there's > > a good reason (as noted above), but maybe it's a case we can handle > > safely? I had test failures when trying to move the Datum around when performing a bounded sort, but did not look into it at first. Now I've looked into it, and the switch to a heapsort when using bounded mode just unconditionnaly tried to free a tuple that was never there to begin with. So if the SortTuple does not contain an actual tuple, but only a single datum, do not do that. I've updated the patch to fix this and enable the optimization in the case of bounded sort. > > > > A second question: at first glance it's intuitively the case we might > > not be able to handle byref values. But nodeAgg doesn't seem to have > > that restriction. What's the difference here? > > I think tuplesort_begin_datum, doesn't have any such limitation, it > can handle any type of Datum so I think we don't need to consider the > only attbyval, we can consider any type of attribute for this > optimization. I've restricted the optimization to byval types because of the following comment in nodeAgg.c: /* * Note: if input type is pass-by-ref, the datums returned by the sort are * freshly palloc'd in the per-query context, so we must be careful to * pfree them when they are no longer needed. */ As I was not sure how to handle that, I prefered the safety of not enabling it. Since you both told me it should be safe, I've lifted that restriction too. > A few small code observations: > - In my view the addition of unlikely() in ExecSort is unlikely to be > of benefit because it's a single call for the entire node's execution > (not in the tuple loop). Done. > - It seems clearer to change the "if (!node->is_single_val)" to flip > the true/false cases so we don't need the negation. Agreed, done. > - I assume there are tests that likely already cover this case, but > it'd be worth verifying that. Yes many test cases cover that, but maybe it would be better to explictly check for it on some cases: do you think we could add a debug message that can be checked for ? > Finally, I believe the same optimization likely ought to be added to > nodeIncrementalSort. It's less likely the tests there are sufficient > for both this and the original case, but we'll see. I will look into it, but isn't incrementalsort used to sort tuples on several keys, when they are already sorted on the first ? In that case, I doubt we would ever have a single-valued tuple here, except if there is a projection to strip the tuple from extraneous attributes. -- Ronan Dunklau
Вложения
В списке pgsql-hackers по дате отправления: