Re: pg_plan_advice

Поиск
Список
Период
Сортировка
От Haibo Yan
Тема Re: pg_plan_advice
Дата
Msg-id CABXr29E5ASPg8=4h2LLYFyEd8QuXXBOHJi4ujHigTt3XV4J4eQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: pg_plan_advice  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: pg_plan_advice
Список pgsql-hackers
>> 1. GEQO interaction (patch 4):
>> Since GEQO relies on randomized search, is there a risk that the optimizer may fail to explore the specific join order or path that is being enforced by the advice mask? In that case, could this lead to failures such as inability to construct the required join relation or excessive planning time if the desired path is not sampled?

> The interaction of this feature with GEQO definitely needs more study.
> If you have some time to work on this, I think testing and reporting
> results would be quite useful. However, I don't think we should ever
> get planner failure, and I'm doubtful about excessive planning time as
> well. The effect of plan advice is to disable some paths just as if
> enable_<whatever> were set to false, so if you provide very specific
> advice while planning with GEQO, I think you might just end up with a
> disabled path that doesn't account for the advice. However, this
> should be checked, and I haven't gotten there yet. I'll add an XXX to
> the README to make sure this doesn't get forgotten.

I conducted extensive tests today using randomized advice strings to challenge pg_plan_advice under GEQO pressure. The results strongly support your hypothesis: for standard Left-Deep trees (which GEQO natively supports), the interaction is stable and efficient.

I executed a stress test involving 100,000 iterations (100 random join structures x 1000 random seeds). The planning time remained low, and no planning failures occurred for valid topology advice.

Observation on Bushy Plans: I did identify one anomaly regarding "Bushy Plans" (e.g., ((t1 t2) (t3 t4))). Since PostgreSQL's GEQO implementation is strictly Left-Deep and cannot generate Bushy trees, if a user manually forges a Bushy Plan advice:

It does not cause a planner crash (e.g., "failed to construct join relation").

Instead, the planner seems to silently ignore the structural constraint of the advice and falls back to a path GEQO can actually find.

I believe this behavior is acceptable because pg_plan_advice is intended to stabilize plans that the optimizer can generate. Since GEQO cannot generate Bushy plans, users should not be supplying them.

Script
-----------------------------------------------------------------------------------
/* * GEQO Stress Test for pg_plan_advice
 * -----------------------------------
 * Methodology:
 * 1. Generates 100 random "Left-Deep" join topologies (t1 joining t2..t100 in random orders).
 * 2. This simulates valid advice that GEQO is capable of producing.
 * 3. For each topology, runs 1000 iterations with random GEQO seeds.
 * 4. Measures success rate and planning time overhead.
 */
DO $$
DECLARE
    v_jo       TEXT;
    v_jo_rest  TEXT;
    v_nl       TEXT;
    v_scan     TEXT;
    v_ng       TEXT;
    v_adv      TEXT;
    v_sql      TEXT;
    v_seed     FLOAT;
    v_ok       INT := 0;
    v_err      INT := 0;
    v_msg      TEXT;
    k          INT;
    i          INT;
    j          INT;
    v_ts1      timestamp;
    v_ts2      timestamp;
    v_cur_ms   numeric;
    v_total_ms numeric := 0;
    v_max_ms   numeric := 0;
BEGIN
    -- Pre-generate static parts of the advice to save time
    SELECT string_agg('t'||n, ' ' ORDER BY n) INTO v_nl   FROM generate_series(2,100) n;
    SELECT string_agg('t'||n, ' ' ORDER BY n) INTO v_scan FROM generate_series(1,100) n;
    SELECT string_agg('t'||n, ' ' ORDER BY n) INTO v_ng   FROM generate_series(1,100) n;

    -- Construct the SQL: t1 JOIN t2 JOIN t3 ... JOIN t100
    v_sql := 'EXPLAIN (COSTS OFF) SELECT count(*) FROM t1';
    FOR j IN 2..100 LOOP
        v_sql := v_sql || ' JOIN t' || j || ' ON t1.id=t' || j || '.id';
    END LOOP;

    -- Configure GEQO for stress testing (force it ON, low effort/pool)
    PERFORM set_config('geqo', 'on', false);
    PERFORM set_config('geqo_threshold', '12', false);
    PERFORM set_config('geqo_effort', '1', false);
    PERFORM set_config('geqo_pool_size', '0', false);

    RAISE NOTICE 'Starting Stress Test: 100 Outer Loops (Random Plans) x 1000 Inner Loops (Random Seeds)...';

    -- Outer Loop: Generate 100 different valid Advice structures
    FOR k IN 1..100 LOOP
        -- Randomize the join order of t2..t100 to simulate different Left-Deep trees
        SELECT string_agg('t'||n, ' ' ORDER BY random()) INTO v_jo_rest FROM generate_series(2,100) n;
        v_jo := 't1 ' || v_jo_rest;

        v_adv := 'JOIN_ORDER(' || v_jo || ') ' ||
                 'NESTED_LOOP_PLAIN(' || v_nl || ') ' ||
                 'SEQ_SCAN(' || v_scan || ') ' ||
                 'NO_GATHER(' || v_ng || ')';

        PERFORM set_config('pg_plan_advice.advice', v_adv, false);

        -- Inner Loop: Test the specific advice against 1000 random GEQO seeds
        FOR i IN 1..1000 LOOP
            v_seed := random();
            PERFORM set_config('geqo_seed', v_seed::text, false);

            BEGIN
                v_ts1 := clock_timestamp();
               
                EXECUTE v_sql;
               
                v_ts2 := clock_timestamp();

                v_cur_ms := EXTRACT(EPOCH FROM (v_ts2 - v_ts1)) * 1000;
                v_total_ms := v_total_ms + v_cur_ms;
               
                IF v_cur_ms > v_max_ms THEN
                    v_max_ms := v_cur_ms;
                END IF;

                v_ok := v_ok + 1;

            EXCEPTION WHEN OTHERS THEN
                GET STACKED DIAGNOSTICS v_msg = MESSAGE_TEXT;
                v_err := v_err + 1;
                RAISE WARNING 'Outer % / Inner % Crashed! Seed: %, Err: %', k, i, v_seed, v_msg;
            END;
        END LOOP;
       
        RAISE NOTICE 'Batch %/100 completed.', k;
    END LOOP;

    RAISE NOTICE '---------------------------';
    RAISE NOTICE 'Total Scenarios: 100,000';
    RAISE NOTICE 'Success:         %', v_ok;
    RAISE NOTICE 'Failed:          %', v_err;
    RAISE NOTICE 'Total Time:      % ms', round(v_total_ms, 2);
    RAISE NOTICE 'Avg Time:        % ms', round(v_total_ms / (v_ok + v_err + 0.0001), 2);
    RAISE NOTICE 'Max Time:        % ms', round(v_max_ms, 2);
    RAISE NOTICE '---------------------------';

    IF v_err > 0 THEN
        RAISE NOTICE 'CONCLUSION: Conflict found.';
    ELSE
        RAISE NOTICE 'CONCLUSION: No errors found.';
    END IF;
END $$;
-----------------------------------------------------------------------------------


On Tue, Jan 6, 2026 at 11:50 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Dec 29, 2025 at 6:34 PM Haibo Yan <tristan.yim@gmail.com> wrote:
> 1. GEQO interaction (patch 4):
> Since GEQO relies on randomized search, is there a risk that the optimizer may fail to explore the specific join order or path that is being enforced by the advice mask? In that case, could this lead to failures such as inability to construct the required join relation or excessive planning time if the desired path is not sampled?

The interaction of this feature with GEQO definitely needs more study.
If you have some time to work on this, I think testing and reporting
results would be quite useful. However, I don't think we should ever
get planner failure, and I'm doubtful about excessive planning time as
well. The effect of plan advice is to disable some paths just as if
enable_<whatever> were set to false, so if you provide very specific
advice while planning with GEQO, I think you might just end up with a
disabled path that doesn't account for the advice. However, this
should be checked, and I haven't gotten there yet. I'll add an XXX to
the README to make sure this doesn't get forgotten.

> 2. Parallel query serialization (patches 1–3):
> Several new fields (subrtinfos, elidedNodes, child_append_relid_sets) are added to PlannedStmt, but I did not see corresponding changes in outfuncs.c / readfuncs.c. Without serialization support, parallel workers executing subplans or Append nodes may not receive this metadata. Is this handled elsewhere, or is it something still pending?

I believe that gen_node_support.pl should take care of this
automatically unless the node type is flagged as
pg_node_attr(custom_read_write).

> 3. Alias handling when generating advice (patch 5):
> In pgpa_output_relation_name, the advice string is generated using get_rel_name(relid), which resolves to the underlying table name rather than the RTE alias. In self-join cases this could be ambiguous (e.g., my_table vs my_table). Would it be more appropriate to use the RTE alias when available?

No. That function is only used for indexes.

> 4. Minor typo (patch 4):
> In src/include/nodes/relation.h, parititonwise appears to be a typo and should likely be partitionwise.

Will fix, thanks.

--
Robert Haas
EDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления: