Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
От | Ilia Evdokimov |
---|---|
Тема | Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment |
Дата | |
Msg-id | fbf77241-e4c5-4f70-ac71-8bf8cff89f39@tantorlabs.com обсуждение исходный текст |
Ответ на | Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment (David Rowley <dgrowleyml@gmail.com>) |
Список | pgsql-hackers |
On 20.03.2025 13:37, David Rowley wrote:
est_entries is a uint32, so %u is the correct format character for that type. I don't think you need to prefix all these properties with "Cache" just because the other two properties have that prefix. I also don't think the names you've chosen really reflect the meaning. How about something like: "Estimated Distinct Lookup Keys: 721 Estimated Capacity: 655", in that order. I think maybe having that as one line for format=text is better than 2 lines. The EXPLAIN output is already often taking up more lines in v18 than in v17, would be good to not make that even worse unnecessarily.
LGTM
I see the existing code there could use ExplainPropertyText rather than have a special case for text and non-text formats. That's likely my fault. If we're touching this code, then we should probably tidy that up. Do you want to create a precursor fixup patch for that?
I fully agree with this suggestion. Then I'll begin with this on another new thread.
+ double ndistinct; /* Estimated number of distinct memoization keys, + * used for cache size evaluation. Kept for EXPLAIN */ Maybe this field in MemoizePath needs a better name. How about "est_unique_keys"? and also do the rename in struct Memoize.
LGTM
I'm also slightly concerned about making struct Memoize bigger. I had issues with a performance regression [1] for 908a96861 when increasing the WindowAgg struct size last year and the only way I found to make it go away was to shuffle the fields around so that the struct size didn't increase. I think we'll need to see a benchmark of a query that hits Memoize quite hard with a small cache size to see if the performance decreases as a result of adding the ndistinct field. It's unfortunate that we'll not have the luxury of squeezing this double into padding if we do see a slowdown.
I tried rearranging the fields in the structure with 'ndistinct' field, but unfortunately, sizeof(Memoize) did not remain the same. Therefore, benchmarking Memoize is definitely necessary. Thanks for the information!
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
В списке pgsql-hackers по дате отправления: