Re: Safer hash table initialization macro
| От | Bertrand Drouvot |
|---|---|
| Тема | Re: Safer hash table initialization macro |
| Дата | |
| Msg-id | aTAcV/OF0uAfT4rL@ip-10-97-1-34.eu-west-3.compute.internal обсуждение исходный текст |
| Ответ на | Re: Safer hash table initialization macro (Jelte Fennema-Nio <postgres@jeltef.nl>) |
| Список | pgsql-hackers |
Hi, On Mon, Dec 01, 2025 at 03:44:41PM +0100, Jelte Fennema-Nio wrote: > On Mon, 1 Dec 2025 at 14:45, Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > Thoughts? > > I think the hashtable creation API in postgres is so terrible that it > actively discourages usage. Thanks for sharing your thoughts! > So I'm definitely in favor of improving this API (probably by adding a > few new functions). I have a few main thoughts on what could be > improved: > > 1. Automatically determine keysize and entrysize given a keymember and > entrytype (like you suggested). PFA a patch introducing and using the new macro. Note that it also introduces HASH_ELEM_INIT_FULL for the rare cases where the whole struct is the key. Also one case remains untouched: $ git grep "entrysize = sizeof" "*.c" src/backend/replication/logical/relation.c: ctl.entrysize = sizeof(LogicalRepRelMapEntry); That's because the key is a member of a nested struct so that the new macro can not be used. As there is only one occurrence of it, I think we can keep it as it is. But we could create a dedicated macro for those cases if we feel the need. Now that I'm writing this, that might be a better idea: that way we'd avoid any "entrysize/keysize = " in the .c files. Also a nice side effect of using the macros: 138 insertions(+), 203 deletions(-) > 2. Autodect most of the flags. > a. HASH_PARTITION, HASH_SEGMENT, HASH_FUNCTION, HASH_DIRSIZE, > HASH_COMPARE, HASH_KEYCOPY, HASH_ALLOC can all be simply be detected > based on the relevant fields from HASHCTL. Passing them in explicitly > is just duplication that causes code noise and is easy to forget > accidentally. > b. HASH_ELEM is useless noise because it is required > c. HASH_BLOBS could be the default (and maybe use HASH_STRINGS by > default if the keytype is char*) > 3. Default to CurrentMemoryContext instead of TopMemoryContext. Like > all of the other functions that allocate, because right now it's way > too easy to accidentally use TopMemoryContext when you did not intend > to. > 4. Have a creation function that doesn't require HASHCTL at all (just > takes entrytype and keymember and maybe a version of this that takes a > memorycontext). Thanks for the above suggestions! I did not think so deep as you did during your Citus time, but will think about those too. I suggest we move forward one step at a time, first step being the new macros. Does that make sense to you? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
В списке pgsql-hackers по дате отправления: