Re: A performance issue with Memoize

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: A performance issue with Memoize
Дата
Msg-id CAApHDvoTyLG35Rm26F7-MFmX3kv-uAs9FD4XF7fqy-B0W6o_UQ@mail.gmail.com
обсуждение исходный текст
Ответ на A performance issue with Memoize  (Richard Guo <guofenglinux@gmail.com>)
Ответы Re: A performance issue with Memoize  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: A performance issue with Memoize  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Fri, 20 Oct 2023 at 23:40, Richard Guo <guofenglinux@gmail.com> wrote:
> The Memoize runtime stats 'Hits: 0  Misses: 10000  Evictions: 9999'
> seems suspicious to me, so I've looked into it a little bit, and found
> that the MemoizeState's keyparamids and its outerPlan's chgParam are
> always different, and that makes us have to purge the entire cache each
> time we rescan the memoize node.
>
> But why are they always different?  Well, for the query above, we have
> two NestLoopParam nodes, one (with paramno 1) is created when we replace
> outer-relation Vars in the scan qual 't1.two = s.two', the other one
> (with paramno 0) is added from the subquery's subplan_params, which was
> created when we replaced uplevel vars with Param nodes for the subquery.
> That is to say, the chgParam would be {0, 1}.
>
> When it comes to replace outer-relation Vars in the memoize keys, the
> two 't1.two' Vars are both replaced with the NestLoopParam with paramno
> 1, because it is the first NLP we see in root->curOuterParams that is
> equal to the Vars in memoize keys.  That is to say, the memoize node's
> keyparamids is {1}.

I see the function calls were put this way around in 5ebaaa494
(Implement SQL-standard LATERAL subqueries.), per:

@ -1640,6 +1641,7 @@ create_subqueryscan_plan(PlannerInfo *root, Path
*best_path,
    {
        scan_clauses = (List *)
            replace_nestloop_params(root, (Node *) scan_clauses);
+       identify_nestloop_extparams(root, best_path->parent->subplan);
    }

(identify_nestloop_extparams was later renamed to
process_subquery_nestloop_params in 46c508fbc.)

I think fixing it your way makes sense.  I don't really see any reason
why we should have two. However...

Another way it *could* be fixed would be to get rid of pull_paramids()
and change create_memoize_plan() to set keyparamids to all the param
IDs that match are equal() to each param_exprs.  That way nodeMemoize
wouldn't purge the cache as we'd know the changing param is accounted
for in the cache.  For the record, I don't think that's better, but it
scares me a bit less as I don't know what other repercussions there
are of applying your patch to get rid of the duplicate
NestLoopParam.paramval.

I'd feel better about doing it your way if Tom could comment on if
there was a reason he put the function calls that way around in
5ebaaa494.

I also feel like we might be getting a bit close to the minor version
releases to be adjusting this stuff in back branches.

David



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [PATCH] Add native windows on arm64 support
Следующее
От: Peter Smith
Дата:
Сообщение: Re: Documentation to upgrade logical replication cluster