Re: [PATCH] Incremental sort (was: PoC: Partial sort)
От | James Coleman |
---|---|
Тема | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Дата | |
Msg-id | CAAaqYe9K1LRRMaVpMwPiv3dwZu9i_0JD7tpLfgNHqUJMkEnU7Q@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Incremental sort (was: PoC: Partial sort) (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Ответы |
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
(Tomas Vondra <tomas.vondra@2ndquadrant.com>)
|
Список | pgsql-hackers |
On Fri, Sep 13, 2019 at 10:51 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > > On Thu, Sep 12, 2019 at 12:49:29PM -0300, Alvaro Herrera wrote: > >On 2019-Jul-30, Tomas Vondra wrote: > >> I've decided to do a couple of experiments, trying to make my mind about > >> which modified places matter to diffrent queries. But instead of trying > >> to reverse engineer the queries, I've taken a different approach - I've > >> compiled a list of queries that I think are sensible and relevant, and > >> then planned them with incremental sort enabled in different places. > > > >[...] > > > >> The list of queries (synthetic, but hopefully sufficiently realistic) > >> and a couple of scripts to collect the plans is in this repository: > >> > >> https://github.com/tvondra/incremental-sort-tests-2 > >> > >> There's also a spreadsheet with a summary of results, with a visual > >> representation of which GUCs affect which queries. > > > >OK, so we have that now. I suppose this spreadsheet now tells us which > >places are useful and which aren't, at least for the queries that you've > >tested. Dowe that mean that we want to get the patch to consider adding > >paths only the places that your spreadsheet says are useful? I'm not > >sure what the next steps are for this patch. > > > > Yes. I think the spreadsheet call help us with answering two things: > > 1) places actually affecting the plan (all but three do) > > 2) redundant places (there are some cases where two GUCs produce the > same plan in the end) To expand on this further, (1) should probably help us to be able to write test cases. Additionally, one big thing we still need that's somewhat external to the patch is a good way to benchmark/a set of queries that we believe are representative enough to be good benchmarks. I'd really appreciate some input from you all on that particular question; I feel like it's in some sense the biggest barrier to getting the patch merged, but also the part where long experience in the community/exposure to other use cases will probably be quite valuable. > Of course, this does assume the query set makes sense and is somewhat > realistic, but I've tried to construct queries where that is true. We > may extend it over time, of course. > > I think we've agreed to add incremental sort paths different places in > separate patches, to make review easier. So this may be a useful way to > decide which places to address first. I'd probably do it in this order: > > - create_ordered_paths > - create_ordered_paths (parallel part) > - add_paths_to_grouping_rel > - ... not sure ... > > but that's just a proposal. It'd give us most of the benefits, I think, > and we could also focus on the rest of the patch. Certainly the first two seem like pretty obvious most necessary base cases. I think supporting group bys also seems like a pretty standard case, so at first glance I'd say this seems like a reasonable course to me. I'm going to start breaking up the patches in this thread into a series in support of that. Since I've started a new thread with the add_partial_path change, I'll include that patch here as part of this series also. Do you think it's worth moving the tuplesort changes into a standalone patch in the series also? Attached is a rebased v31 now broken into the following: - 001-consider-startup-cost-in-add-partial-path_v1.patch: From the other thread (Tomas's patch unmodified) - 002-incremental-sort_v31.patch: Updated base incremental sort patch Besides rebasing, I've changed the enable_incrementalsort GUC to prevent generating paths entirely rather than being cost-based, since incremental sort is never absolutely necessary in the way regular sort is. I'm hoping to add 003 soon with the initial parallel parts, but I'm about out of time right now and wanted to get something out, so sending this without that. Side question: for the patch tester do I have to attach each part of the series each time even if nothing's changed in several of them? And does the vN number at the end need to stay the same for all of them? My attachments to this email don't follow that... Also, since this email changes patch naming, so I need to do anything to clear out the old ones? (I suppose if not, then that would imply an answer to the first question also.) James
Вложения
В списке pgsql-hackers по дате отправления: