Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
От | David Rowley |
---|---|
Тема | Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment |
Дата | |
Msg-id | CAApHDvpq2gOt3gnzkd4Ud=wrT7YRGrF3zb29jgWzDsYEhETrOA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment (Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>) |
Ответы |
Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment |
Список | pgsql-hackers |
On Thu, 20 Mar 2025 at 21:48, Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> wrote: > -> Memoize (cost=0.30..0.41 rows=1 width=4) > Cache Key: t2.thousand > Cache Mode: logical > Cache Estimated Entries: 655 > Cache Estimated NDistinct: 721 > -> Index Only Scan using tenk1_unique1 on tenk1 t1 (cost=0.29..0.40 rows=1 width=4) > Index Cond: (unique1 = t2.thousand) > (11 rows) > > Additionally, since this information would only be shown in EXPLAIN when costs are enabled, it should not cause any performanceregression in normal execution. However, reviewers should be especially careful when verifying test outputs, asthis change could affect plan details in regression tests. > > Any thoughts? > + ExplainIndentText(es); > + appendStringInfo(es->str, "Cache Estimated Entries: %d\n", ((Memoize *) plan)->est_entries); > + ExplainIndentText(es); > + appendStringInfo(es->str, "Cache Estimated NDistinct: %0.0f\n", ((Memoize *) plan)->ndistinct); est_entries is a uint32, so %u is the correct format character for that type. I don't think you need to prefix all these properties with "Cache" just because the other two properties have that prefix. I also don't think the names you've chosen really reflect the meaning. How about something like: "Estimated Distinct Lookup Keys: 721 Estimated Capacity: 655", in that order. I think maybe having that as one line for format=text is better than 2 lines. The EXPLAIN output is already often taking up more lines in v18 than in v17, would be good to not make that even worse unnecessarily. I see the existing code there could use ExplainPropertyText rather than have a special case for text and non-text formats. That's likely my fault. If we're touching this code, then we should probably tidy that up. Do you want to create a precursor fixup patch for that? + double ndistinct; /* Estimated number of distinct memoization keys, + * used for cache size evaluation. Kept for EXPLAIN */ Maybe this field in MemoizePath needs a better name. How about "est_unique_keys"? and also do the rename in struct Memoize. I'm also slightly concerned about making struct Memoize bigger. I had issues with a performance regression [1] for 908a96861 when increasing the WindowAgg struct size last year and the only way I found to make it go away was to shuffle the fields around so that the struct size didn't increase. I think we'll need to see a benchmark of a query that hits Memoize quite hard with a small cache size to see if the performance decreases as a result of adding the ndistinct field. It's unfortunate that we'll not have the luxury of squeezing this double into padding if we do see a slowdown. David [1] https://postgr.es/m/flat/CAHoyFK9n-QCXKTUWT_xxtXninSMEv%2BgbJN66-y6prM3f4WkEHw%40mail.gmail.com
В списке pgsql-hackers по дате отправления: