Discussion: dshash_find_or_insert vs. OOM
Hi,

For most memory allocation primitives, we offer a "no OOM" version.
dshash_find_or_insert is an exception. Here's a small patch to fix
that. I have some interest in slipping this into v19 even though I am
submitting it quite late, because it would be useful for
pg_stash_advice[1]. Let me know what you think about that.

--
Robert Haas
EDB: http://www.enterprisedb.com

[1] As yet uncommitted. See pg_plan_advice thread.
Attachments
> On Mar 18, 2026, at 07:34, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Hi,
>
> For most memory allocation primitives, we offer a "no OOM" version.
> dshash_find_or_insert is an exception. Here's a small patch to fix
> that. I have some interest in slipping this into v19 even though I am
> submitting it quite late, because it would be useful for
> pg_stash_advice[1]. Let me know what you think about that.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>
> [1] As yet uncommitted. See pg_plan_advice thread.
> <v1-0001-dshash-Make-it-possible-to-suppress-out-of-memory.patch>

I think adding a DSHASH_INSERT_NO_OOM flag is the right choice. As you
mentioned, we already have other _NO_OOM flags, so this feels
consistent with existing patterns. I don't see any correctness issue
with the patch. I just have a couple of minor nits.

1

```
@@ -477,14 +485,22 @@ restart:
 	 * reacquire all the locks in the right order to avoid deadlocks.
 	 */
 	LWLockRelease(PARTITION_LOCK(hash_table, partition_index));
-	resize(hash_table, hash_table->size_log2 + 1);
+	if (!resize(hash_table, hash_table->size_log2 + 1, flags))
+		return NULL;
 	goto restart;
 }

 /* Finally we can try to insert the new item. */
 item = insert_into_bucket(hash_table, key,
-						  &BUCKET_FOR_HASH(hash_table, hash));
+						  &BUCKET_FOR_HASH(hash_table, hash),
+						  flags);
+if (item == NULL)
+{
+	Assert((flags & DSHASH_INSERT_NO_OOM) != 0);
+	LWLockRelease(PARTITION_LOCK(hash_table, partition_index));
+	return NULL;
+}
```

When OOM happens, Assert((flags & DSHASH_INSERT_NO_OOM) != 0); makes
sense. But for resize(), the assert is inside resize(), while for
insert_into_bucket(), the assert is in the caller. That feels a bit
inconsistent to me, and I think it hurts readability a little. A reader
might wonder why there is no corresponding assert after resize() unless
they go read the function body.

I think either style is fine, but using the same style in both places
would be better.

2

```
 /* Allocate the space for the new table. */
-	new_buckets_shared =
-		dsa_allocate_extended(hash_table->area,
-							  sizeof(dsa_pointer) * new_size,
-							  DSA_ALLOC_HUGE | DSA_ALLOC_ZERO);
-	new_buckets = dsa_get_address(hash_table->area, new_buckets_shared);
+	{
+		int			dsa_flags = DSA_ALLOC_HUGE | DSA_ALLOC_ZERO;
+
+		if (flags & DSHASH_INSERT_NO_OOM)
+			dsa_flags |= DSA_ALLOC_NO_OOM;
+		new_buckets_shared =
+			dsa_allocate_extended(hash_table->area,
+								  sizeof(dsa_pointer) * new_size,
+								  dsa_flags);
+	}
```

Making this a nested block does have the benefit of keeping dsa_flags
close to where it is used. But from my impression, this style is still
fairly uncommon in the codebase. I worry it may implicitly signal to
other hackers that this is an acceptable pattern. So unless we
intentionally want to encourage that style, I would lean toward
avoiding it here.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
> Hi,
>
> For most memory allocation primitives, we offer a "no OOM" version.
> dshash_find_or_insert is an exception. Here's a small patch to fix
> that. I have some interest in slipping this into v19 even though I am
> submitting it quite late, because it would be useful for
> pg_stash_advice[1]. Let me know what you think about that.
Thanks for the patch!
I agree with this idea, as it could act as a good triggering point for
a caller to perform an eviction of the dshash if they run out of space,
and avoid a hard failure.
Passing this as a flag seems OK with me. Not sure what other
flags could be added in the future, but the flexibility is a good
thing, IMO. I was debating whether this should just be a dsh_params
option, but maybe for the same dshash a caller may want an OOM error
in some code paths, and a retry in others.
> + &BUCKET_FOR_HASH(hash_table, hash),
> + flags);
> + if (item == NULL)
> + {
> + Assert((flags & DSHASH_INSERT_NO_OOM) != 0);
> + LWLockRelease(PARTITION_LOCK(hash_table, partition_index));
> + return NULL;
> + }
> ```
>
> When OOM happens, Assert((flags & DSHASH_INSERT_NO_OOM) != 0); makes
> sense. But for resize(), the assert is inside resize(), while for
> insert_into_bucket(), the assert is in the caller.
> That feels a bit inconsistent to me, and I think it hurts readability
> a little. A reader might wonder why there is no corresponding assert
> after resize() unless they go read the function body.
>
> I think either style is fine, but using the same style in both places would be better.
>
I agree with this. The assert should be
if (!resize(hash_table, hash_table->size_log2 + 1, flags))
{
Assert((flags & DSHASH_INSERT_NO_OOM) != 0);
return NULL;
}
instead of inside resize().
I did some testing by triggering an OOM, a no-OOM, and an OOM with
eviction afterwards, as a sanity check. All looks good.
I've attached the tests I created with other basic testing, as dshash
lacks direct testing in general, if you're inclined to add them.
--
Sami Imseih
Amazon Web Services (AWS)
Attachments
On Tue, Mar 17, 2026 at 9:34 PM Chao Li <li.evan.chao@gmail.com> wrote:
> When OOM happens, Assert((flags & DSHASH_INSERT_NO_OOM) != 0); makes
> sense. But for resize(), the assert is inside resize(), while for
> insert_into_bucket(), the assert is in the caller. That feels a bit
> inconsistent to me, and I think it hurts readability a little. A
> reader might wonder why there is no corresponding assert after
> resize() unless they go read the function body.

Adjusted.

> Making this a nested block does have the benefit of keeping dsa_flags
> close to where it is used. But from my impression, this style is
> still fairly uncommon in the codebase. I worry it may implicitly
> signal to other hackers that this is an acceptable pattern. So unless
> we intentionally want to encourage that style, I would lean toward
> avoiding it here.

Yeah, that was dumb. Fixed.

Thanks for the review; here's v2.

--
Robert Haas
EDB: http://www.enterprisedb.com
Attachments
> Thanks for the review; here's v2.

LGTM

--
Sami Imseih
Amazon Web Services (AWS)
On Wed, Mar 18, 2026 at 11:21 AM Sami Imseih <samimseih@gmail.com> wrote:
> Passing this as a flag seems OK with me. Not sure what other
> flags could be added in the future, but the flexibility is a good
> thing, IMO. I was debating if this should just be a dsh_params
> option, but maybe for the same dshash a caller may want OOM
> in some code path, and retry in some other. maybe?
Yeah, I think this is a property of the individual call to
dshash_find_or_insert(), rather than a property of the table. It does
seem likely that most callers will want the same behavior in all
cases, but it would be very sad if we embedded that idea into the
structure of the code and then somebody has a case where it's not what
they want, and I see no real reason why that couldn't happen. Also,
doing it that way wouldn't really follow existing precedents
(pg_malloc_extended, palloc_extended, MemoryContextAllocExtended,
dsa_allocate_extended).
> I did some testing by triggering an OOM, a no-OOM, and an OOM with
> eviction afterwards, as a sanity check. All looks good.
> I've attached the tests I created with other basic testing, as dshash
> lacks direct testing in general, if you're inclined to add them.
I tried running a coverage report for this. It's pretty good. It
doesn't cover these things:
- dshash_create: OOM while allocating the bucket array.
- dshash_find_or_insert_extended: OOM while inserting an item
(apparently, at least here, it instead hits OOM while resizing)
- dshash_delete_key: the case where the entry is not found
- dshash_dump: not called at all
- resize: the case where we exit early due to a concurrent resize
- delete_item_from_bucket: the case where there is more than one item
in the bucket
I think that adding a call to dshash_dump() somewhere would probably
make sense, and I'd suggest also trying to delete a nonexistent key.
The rest don't seem worth worrying about.
On the code itself:
+ /* Verify all entries via find. */
+ for (int i = 0; i < count; i++)
+ {
+ entry = dshash_find(ht, &i, false);
+ if (entry == NULL)
+ elog(ERROR, "key %d not found", i);
+ dshash_release_lock(ht, entry);
+ }
You could verify that entry->key == i. The current code wouldn't
notice if the hash table returned a garbage pointer. You could
possibly also include some kind of a payload in each record and verify
that, e.g. set entry->value =
some_homomorphism_over_0_to_9(entry->key).
+ dsa_set_size_limit(area, dsa_minimum_size());
This is an abuse of the documented purpose of dsa_set_size_limit().
Seems better to just pick a constant here.
+ /* Insert until OOM — without NO_OOM flag, this should raise ERROR. */
+ for (;;)
I think it would be safer to code this as a fixed iteration count. For
example, if you choose the size limit so that we should run out of
memory after 10 entries, you could terminate this loop after 1000
iterations. That way, if something goes wrong, we're more likely to
see "expected out-of-memory, but completed all %d iterations" in the
log rather than a core dump or whatever.
+ {
+ TestDshashEntry *entry;
+
+ entry = dshash_find_or_insert(ht, &key, &found);
+ dshash_release_lock(ht, entry);
+ key++;
+ }
In other places, you check for entry == NULL, but not here.
+ {
+ TestDshashEntry *entry;
+
+ entry = dshash_find_or_insert_extended(ht, &key, &found,
+ DSHASH_INSERT_NO_OOM);
+ if (entry == NULL)
+ elog(ERROR, "insert after eviction failed");
+ dshash_release_lock(ht, entry);
+ }
I just got in trouble for letting a bare block sneak into my code, so
now it's my turn to complain.
I suggest git config user.email in whatever directory you use to
generate patches like this, just to make life a bit easier for anyone
who might eventually be committing one of them.
--
Robert Haas
EDB: http://www.enterprisedb.com
> On Mar 19, 2026, at 01:46, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Mar 17, 2026 at 9:34 PM Chao Li <li.evan.chao@gmail.com> wrote:
>> When OOM happens, Assert((flags & DSHASH_INSERT_NO_OOM) != 0); makes
>> sense. But for resize(), the assert is inside resize(), while for
>> insert_into_bucket(), the assert is in the caller. That feels a bit
>> inconsistent to me, and I think it hurts readability a little. A
>> reader might wonder why there is no corresponding assert after
>> resize() unless they go read the function body.
>
> Adjusted.
>
>> Making this a nested block does have the benefit of keeping dsa_flags
>> close to where it is used. But from my impression, this style is
>> still fairly uncommon in the codebase. I worry it may implicitly
>> signal to other hackers that this is an acceptable pattern. So unless
>> we intentionally want to encourage that style, I would lean toward
>> avoiding it here.
>
> Yeah, that was dumb. Fixed.
>
> Thanks for the review; here's v2.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
> <v2-0001-dshash-Make-it-possible-to-suppress-out-of-memory.patch>

Thanks for updating the patch. V2 LGTM.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
On Wed, Mar 18, 2026 at 8:09 PM Chao Li <li.evan.chao@gmail.com> wrote:
> Thanks for updating the patch. V2 LGTM.

Committed, thanks to both of you.

--
Robert Haas
EDB: http://www.enterprisedb.com
Thanks for the review!
> > I did some testing by triggering an OOM, a no-OOM, and an OOM with
> > eviction afterwards, as a sanity check. All looks good.
> > I've attached the tests I created with other basic testing, as dshash
> > lacks direct testing in general, if you're inclined to add them.
>
> I tried running a coverage report for this. It's pretty good. It
> doesn't cover these things:
>
> - dshash_create: OOM while allocating the bucket array.
> - dshash_find_or_insert_extended: OOM while inserting an item
> (apparently, at least here, it instead hits OOM while resizing)
> - dshash_delete_key: the case where the entry is not found
> - dshash_dump: not called at all
> - resize: the case where we exit early due to a concurrent resize
> - delete_item_from_bucket: the case where there is more than one item
> in the bucket
These are good to add. I also included delete_entry and dshash_dump
for more coverage. delete_item_from_bucket will require us to hash
the item to the same bucket, and I'm not sure it's worth the hassle.
resize() will occur in the OOM test.
> I think that adding a call to dshash_dump() somewhere would probably
Done
> make sense, and I'd suggest also trying to delete a nonexistent key.
Done
>
> On the code itself:
>
> + /* Verify all entries via find. */
> + for (int i = 0; i < count; i++)
> + {
> + entry = dshash_find(ht, &i, false);
> + if (entry == NULL)
> + elog(ERROR, "key %d not found", i);
> + dshash_release_lock(ht, entry);
> + }
>
> You could verify that entry->key == i. The current code wouldn't
> notice if the hash table returned a garbage pointer. You could
> possibly also include some kind of a payload in each record and verify
> that, e.g. set entry->value =
> some_homomorphism_over_0_to_9(entry->key).
I added key verification as you suggested.
> + dsa_set_size_limit(area, dsa_minimum_size());
>
> This is an abuse of the documented purpose of dsa_set_size_limit().
> Seems better to just pick a constant here.
I changed this to use a different limit.
> + /* Insert until OOM — without NO_OOM flag, this should raise ERROR. */
> + for (;;)
>
> I think it would be safer to code this as a fixed iteration count. For
> example, if you choose the size limit so that we should run out of
> memory after 10 entries, you could terminate this loop after 1000
> iterations. That way, if something goes wrong, we're more likely to
> see "expected out-of-memory, but completed all %d iterations" in the
> log rather than a core dump or whatever.
Done. I also removed the OOM retry test and just kept a simple
test. Adding entries after a delete is now happening in the basic test.
> + {
> + TestDshashEntry *entry;
> +
> + entry = dshash_find_or_insert(ht, &key, &found);
> + dshash_release_lock(ht, entry);
> + key++;
> + }
>
> In other places, you check for entry == NULL, but not here.
Fixed.
> I just got in trouble for letting a bare block sneak into my code, so
> now it's my turn to complain.
ugggh. fixed.
> I suggest git config user.email in whatever directory you use to
> generate patches like this, just to make life a bit easier for anyone
> who might eventually be committing one of them.
Sorry. I usually do, but I started using a new machine :(
--
Sami Imseih
Amazon Web Services (AWS)
Attachments
I just realized v2 removed the test for dshash_find_or_insert_extended
with the NO_OOM flag. Corrected in v3.

--
Sami Imseih
Amazon Web Services (AWS)
Attachments
On Thu, Mar 19, 2026 at 8:26 PM Sami Imseih <samimseih@gmail.com> wrote:
> I just realized v2 removed the test for dshash_find_or_insert_extended
> with the NO_OOM flag. Corrected in v3.

You need to get rid of the bare blocks. If you happen to be using an
AI to help write your code, I definitely suggest telling it to
remember not to ever do that.

This verifies that dshash_dump() doesn't crash, but not that it
produces useful output. Whether that's worth the infrastructure, I
don't know.

But more generally, I wonder what people think about how much value
this kind of thing has. With just the core regression test suite, we
get coverage of 76.5% of the lines in dshash.c. Adding the isolation
test suite and the test_dsm_registry test suite brings coverage up to
81.9% (334 of 408). The things that are not covered are: OOM cases,
dshash_destroy(), dshash_dump(), the concurrency case in resizing,
the delete_key_from_bucket() loop iterating more than once, and
delete_item_from_bucket() returning false.

With the v3 patch, running just the test_dshash suite, coverage rises
to 93.4% of lines (381 of 408). The omissions are: some OOM cases,
dshash_delete_bucket() successfully deleting something, the
concurrency case in resizing, most of delete_key_from_bucket(), and
delete_item_from_bucket() iterating more than once or returning false.
Combining your patch with the test suite previously mentioned, we get
up to 97.1% coverage by lines (396 of 408).

So the patch definitely has some value: it adds coverage of 60+ lines
of code. On the other hand, it still feels to me like we'd be far more
likely to notice new dshash bugs based on its existing usages than
because of this. If I were modifying dshash, I wouldn't run this test
suite first; I'd run the regression tests first. And the patch does
have a cost: it's 334 lines of code that will have to be maintained
going forward. I don't know if that's worth it.

I feel like we're a lot more likely to break dshash behavior in some
complicated concurrency scenario than we are to break it in a test
like this. And, if somebody modifies dshash_destroy() or
dshash_dump(), they just need to test the code at least once before
committing to get essentially the same benefit we'd get from this,
which you would hope people would do anyway.

None of this is to say that this patch is a horrible idea, but I'm
also not entirely convinced that it is an excellent idea, so I would
like to hear some other opinions. Of course, there's also the fact
that this patch is quite similar to some other test_* modules that we
already have, like test_dsa or test_bitmapset. So maybe those are good
and this is also good. But I'm not sure. The good news is, if we do
want this, it probably doesn't need much more work to be committable.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com
> You need to get rid of the bare blocks.

done in v4

> This verifies that dshash_dump() doesn't crash, but not that it
> produces useful output. Whether that's worth the infrastructure, I
> don't know.

I don't think dshash_dump() will be deterministic, as keys may hash to
different buckets on different platforms, perhaps. Which is why I did
not think it was a good idea to compare the output from the logs.
Testing that it does not crash seemed worthwhile to increase the
coverage.

> 408). The omissions are: some OOM cases, dshash_delete_bucket()
> successfully deletes something, concurrency case in resizing, most of
> delete_key_from_bucket(), delete_item_from_bucket() iterating more

Yes, we can probably address some of these things with more complexity.

> than once or returning false. Combining your patch with the test suite
> previously mentioned, we get up to 97.1% coverage by lines (396 of
> 408).

This is solid coverage.

> So the patch definitely has some value: it adds coverage of 60+ lines
> of code. On the other hand, it still feels to me like we'd be far more
> likely to notice new dshash bugs based on its existing usages than
> because of this. If I were modifying dshash, I wouldn't run this test
> suite first; I'd run the regression tests first.

Sure, but once a bug is fixed, having direct test coverage for this
bug in a dedicated test suite is a good thing. The NO_OOM flag that
started this discussion would not fit neatly in other test suites,
AFAICT.

--
Sami