Re: [PATCH] Incremental sort (was: PoC: Partial sort)
От | Tomas Vondra |
---|---|
Тема | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Дата | |
Msg-id | 20190709121125.5bpaxgtd5wl7aghy@development обсуждение исходный текст |
Ответ на | Re: [PATCH] Incremental sort (was: PoC: Partial sort) (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Ответы |
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
|
Список | pgsql-hackers |
On Tue, Jul 09, 2019 at 03:37:03AM +0200, Tomas Vondra wrote: > > ... > >Notice that cost of the second plan is almost double the first one. That >means 0004 does not even generate the first plan, i.e. there are cases >where we don't try to add the explicit sort before passing the path to >generate_gather_paths(). > >And I think I know why is that - while gather_grouping_paths() tries to >add explicit sort below the gather merge, there are other places that >call generate_gather_paths() that don't do that. In this case it's >probably apply_scanjoin_target_to_paths() which simply builds > > parallel (seq|index) scan + gather merge > >and that's it. The problem is likely the same - the code does not know >which pathkeys are "interesting" at that point. We probably need to >teach planner to do this. > I've looked into this again, and yes - that's the reason. I've added generate_useful_gather_paths() which is a wrapper on top of generate_gather_paths(). It essentially does what 0001 patch did directly in generate_gather_paths() so it's more like create_grouping_paths(). And that does the trick - we now generate the cheaper paths, and I don't see any crashes in regression tests etc. I still suspect we already have code doing similar checks whether pathkeys might be useful somewhere. I've looked into pathkeys.c and elsewhere but no luck. Attached is a slightly modified patch series: 1) 0001 considers incremental sort paths in various places (adds the new generate_useful_gather_paths and modifies places calling create_sort_path) 2) 0002 and 0003 are fixes I mentioned before 3) 0004 adds a new GUC force_incremental_sort that (when set to 'on') tries to nudge the optimizer into using incremental sort by essentially making it free (i.e. using startup/total costs of the subpath). I've found this useful when trying to force incremental sorts into plans where it may not be the best strategy. I won't have time to hack on this over the next ~2 weeks, but I'll try to respond to questions when possible. > >FWIW tweaking all the create_sort_path() places to also consider adding >incremental sort is a bit tedious and invasive, and it almost doubles >the amount of repetitive code. It's OK for experiment like this, but we >should try handling this in a nicer way (move to a separate function >that does both, or something like that). > This definitely needs more work. We need to refactor it in some way, e.g. have a function that would consider both explicit sort (on the cheapest path) and incremental sort (on all paths), and call it from all those places. Haven't tried it, though. There's also a couple more places where we do create_sort_path() and don't consider incremental sort yet - window functions, distinct etc. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
В списке pgsql-hackers по дате отправления: