Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators
From | James Hunter |
---|---|
Subject | Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators |
Date | |
Msg-id | CAJVSvF5n3_uEGW5GZSRehDuTfz7XVDohbn7tVJ+2ZnweQEVFrQ@mail.gmail.com |
In response to | Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators (James Hunter <james.hunter.pg@gmail.com>) |
Responses | Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators |
List | pgsql-hackers |
Attaching a new revision, which substantially reworks the previous one.

For the previous revision, I ran into problems (exposed by CI tests) when trying to get my "subPlan" list to work, because it meant we had two pointers into a single SubPlan, which breaks both serialization and copyObject(). This led to a new approach. The former Patch 1 is no longer needed, because that "subPlan" logic never worked anyway.

Now I store the workmem info in Lists, first on the PlannerGlobal, then transferred to the PlannedStmt. Every [Sub]Plan that needs working memory now gets a "workmem_id" index into these Lists. Since it's just an index, it survives serialization and copyObject(), so the workmem info can now be round-tripped successfully.

It also makes it easier (and faster) for an extension to adjust workmem limits for an entire query, since all of the query's workmem info is available directly from the PlannedStmt, without requiring us to traverse the Plan + Expr trees. (My example hook/extension shrank by a couple hundred LoC since the previous revision, because now it can just loop over a List instead of needing to walk a Plan tree.)

So, now we have:

- Patch 1: adds a workmem limit to the PlannerGlobal, inside createplan.c, and stores the corresponding workmem_id on the Plan or SubPlan. The List is copied from the PlannerGlobal to the PlannedStmt, as normal. We trivially set the workmem limit inside ExecAssignWorkMem(), called from InitPlan. This patch is a no-op, since it just copies existing GUC values to the workmem limit, and then applies that limit inside ExecInitNode().

- Patch 2: copies the planner's workmem estimate to the PlannerGlobal / PlannedStmt, to allow an extension to set the workmem limit intelligently (without needing to traverse to the Plan or SubPlan). This patch is a no-op, since it just records an estimate on the PlannerGlobal / PlannedStmt, but doesn't do anything with it (yet).

- Patch 3: displays the workmem info we set in Patches 1 and 2, via a new EXPLAIN (work_mem on) option. Also adds a unit test.

- Patch 4: adds a hook and extension that show how to override the default workmem limits, to implement a query_work_mem GUC.

I think this version is pretty close to a finished design proposal:

* top-level list(s) of workmem info;
* Plans and SubPlans that need workmem "registering" themselves during createplan.c;
* exec nodes reading their workmem limits from the PlannedStmt, via plan->workmem_id (or variants, in cases where a [Sub]Plan has multiple data structures of *different* sizes);
* InitPlan() calls a function or hook to fill in the actual workmem limits;
* workmem info copied / serialized to parallel query (PQ) workers, and stored in the Plan cache (but the limit is always overwritten inside InitPlan()); and
* hook / extension reads the workmem info and sets a sensible limit, based on its own heuristic.

Patch 4 shows that we can pretty easily (400 lines, including comments) propagate a per-query workmem limit to individual [Sub]Plans' data structures, in a reasonable way.

Compared to the previous revision, this patch set:

- eliminates the Plan traversal in execWorkMem.c and workmem.c;
- removes the "SubPlan" logic from setrefs.c, leaving setrefs unchanged; and
- sets the estimate and reserves a slot for the limit, inside createplan.c.

So, now, the logic to assign workmem limits is just a for-loop in execWorkMem.c; and it's just 2 for-loops + 1 sort in the workmem extension.

Questions, comments?

Thanks,
James
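The workmem_id indirection described in this message can be illustrated with a small standalone C sketch. It is not code from the patch set: the WorkMemInfo and PlanNode types, the fixed-size arrays standing in for the Lists on the PlannedStmt, the 1-based numbering, and the proportional-share heuristic in assign_query_limits() are all assumptions made for illustration. The point it tries to show is that a node holding only an integer index can be copied freely, and that limits can be assigned by looping over the lists with no Plan traversal.

```c
#include <assert.h>

/* Hypothetical stand-in for the workmem Lists kept on the PlannedStmt. */
#define MAX_WORKMEM_SLOTS 16

typedef struct WorkMemInfo
{
    int count;                          /* number of registered slots */
    int estimate_kb[MAX_WORKMEM_SLOTS]; /* planner's estimate per slot */
    int limit_kb[MAX_WORKMEM_SLOTS];    /* limit, filled in at executor startup */
} WorkMemInfo;

/* Hypothetical plan node: it stores an index, never a pointer. */
typedef struct PlanNode
{
    int workmem_id;     /* assumed 1-based; 0 would mean "needs no workmem" */
} PlanNode;

/* A node that needs working memory registers itself and gets its index. */
int
register_workmem(WorkMemInfo *info, int estimate_kb)
{
    assert(info->count < MAX_WORKMEM_SLOTS);
    info->estimate_kb[info->count] = estimate_kb;
    info->limit_kb[info->count] = 0;    /* slot reserved, limit not yet set */
    info->count++;
    return info->count;
}

/* Default behavior: copy one GUC-like value into every slot. */
void
assign_default_limits(WorkMemInfo *info, int work_mem_kb)
{
    for (int i = 0; i < info->count; i++)
        info->limit_kb[i] = work_mem_kb;
}

/*
 * Extension-style override (assumed heuristic, not the one in Patch 4):
 * split a per-query budget in proportion to the estimates, with a floor.
 */
void
assign_query_limits(WorkMemInfo *info, int query_work_mem_kb, int min_kb)
{
    long total = 0;

    for (int i = 0; i < info->count; i++)
        total += info->estimate_kb[i];

    for (int i = 0; i < info->count; i++)
    {
        long share = 0;

        if (total > 0)
            share = (long) query_work_mem_kb * info->estimate_kb[i] / total;
        info->limit_kb[i] = (share < min_kb) ? min_kb : (int) share;
    }
}

/* An executor node reads its limit through the index. */
int
node_limit_kb(const WorkMemInfo *info, const PlanNode *node)
{
    assert(node->workmem_id >= 1 && node->workmem_id <= info->count);
    return info->limit_kb[node->workmem_id - 1];
}
```

With estimates of 3000 kB and 1000 kB registered and an 8000 kB query budget, this sketch assigns limits of 6000 kB and 2000 kB. A struct copy of a PlanNode reads the same limit as the original, because both carry the same index.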