Re: optimize lookups in snapshot [sub]xip arrays
От | Andres Freund |
---|---|
Тема | Re: optimize lookups in snapshot [sub]xip arrays |
Дата | |
Msg-id | 20220715200857.nm2gc4weixwlwbnx@awork3.anarazel.de обсуждение исходный текст |
Ответ на | optimize lookups in snapshot [sub]xip arrays (Nathan Bossart <nathandbossart@gmail.com>) |
Ответы |
Re: optimize lookups in snapshot [sub]xip arrays
|
Список | pgsql-hackers |
Hi, Sounds worth pursuing. On 2022-07-13 10:09:50 -0700, Nathan Bossart wrote: > The attached patch has some key differences from the previous proposal. > For example, the new patch uses simplehash instead of open-coding a new > hash table. +1 > The attached patch waits until a lookup of [sub]xip before generating the > hash table, so we only need to allocate enough space for the current > elements in the [sub]xip array, and we avoid allocating extra memory for > workloads that do not need the hash tables. Hm. Are there any contexts where we'd not want the potential for failing due to OOM? I wonder if we additionally / alternatively could use a faster method of searching the array linearly, e.g. using SIMD. Another thing that might be worth looking into is to sort the xip/subxip arrays into a binary-search optimized layout. That'd make the binary search faster, wouldn't require additional memory (a boolean indicating whether sorted somewhere, I guess), and would easily persist across copies of the snapshot. > I'm slightly worried about increasing the number of memory > allocations in this code path, but the results above seemed encouraging on > that front. ISMT that the test wouldn't be likely to show those issues. > These hash tables are regarded as ephemeral; they only live in > process-local memory and are never rewritten, copied, or > serialized. What does rewriting refer to here? Not convinced that's the right idea in case of copying. I think we often end up copying snapshots frequently, and building & allocating the hashed xids separately every time seems not great. > + snapshot->xiph = NULL; > + snapshot->subxiph = NULL; Do we need separate hashes for these? ISTM that if we're overflowed then we don't need ->subxip[h], and if not, then the action for an xid being in ->xip and ->subxiph is the same? Greetings, Andres Freund
В списке pgsql-hackers по дате отправления: