Expose custom planning data in EXPLAIN
От | Andrei Lepikhov |
---|---|
Тема | Expose custom planning data in EXPLAIN |
Дата | |
Msg-id | e7cb9db0-afb6-4ba5-b86f-307683133387@gmail.com обсуждение исходный текст |
Список | pgsql-hackers |
Hi, Background and motivation ------------------------- This feature is inspired by the pg_overexplain, new EXPLAIN-related hooks and recent change of the Memoize node representation (4bc62b86849). Based on user reports from the production system, we typically receive no more than an 'explain verbose analyse' output to identify the issue's origin. It is obvious to me that the detailisation of the EXPLAIN format will never be enough. Even more, each new parameter, adding more information, also complicates life for some people, flooding the screen with unnecessary (for them) data. With extensible EXPLAIN options and per-node/summary hooks, we can add as many details as needed using modules. However, there is one limitation: we can only explore Plan and PlanState data. If something isn't transferred from the planning stage to the plan, we won't have a way to expose this data. For example, it is sometimes unclear why the optimiser chose IncrementalSort or [did not choose] HashJoin, as we don't know the ngroups estimation used at this specific point of the plan. Design Overview --------------- It appears that the only two changes required to enable the feature are a hook and a field in the Plan node. In this patch, I have chosen to add the hook to the copy_generic_path_info routine to limit its usage for tracking purposes only. Also, I extended its interface with the PlannerInfo pointer, which may be helpful in many cases. The new extlist field in the Plan structure should contain (by convention) extensible nodes only to let modules correctly pick their data. Also, it simplifies the logic of the node serialisation. An additional motivation for choosing Extensible Node is its lack of core usage, which makes it seem unpolished and requires copying a significant amount of code to use. This patch highlights this imperfection. Tests ----- To demonstrate its real-life application, I added an example to pg_overexplain. Here, a ngroups value is computed, stored in the Plan node, and exposed in explain. Also, it is a test for the ExtensionNode machinery. Downsides ----------- 1. Growth of the plan node 2. Read/write extensible node - what if reading worker (or backend ?) doesn't have the module installed? 3. The point for the hook call. The first issue is quite limited because the only version of the plan exists, in contrast to the multiple paths. The second issue is a little more complicated. However, I believe the issue could be resolved by allowing extensions to determine the logic for serialising their ExtensibleNode. The selection of the point for the hook appears to be quite strict. It does not permit any extensions to alter the final plan or disrupt its consistency, except for some cost data. However, it does allow for tracking the decisions made during the planning phase. See the patch attached. -- regards, Andrei Lepikhov
Вложения
В списке pgsql-hackers по дате отправления: