Обсуждение: BUG #19018: high memory usage and "stack depth limit exceeded", with GiST index on ltree
BUG #19018: high memory usage and "stack depth limit exceeded", with GiST index on ltree
От
PG Bug reporting form
Дата:
The following bug has been logged on the website:
Bug reference: 19018
Logged by: Joseph Silva
Email address: dull.bananas0@gmail.com
PostgreSQL version: 17.5
Operating system: Fedora
Description:
If I run this, then the postgres process's memory usage approaches 6 GB, and
the insertion when
i=253 fails with "stack depth limit exceeded":
```
CREATE EXTENSION ltree;
CREATE TABLE comment (path ltree);
CREATE INDEX ON comment USING gist (path);
DO $$
DECLARE
i int := 1;
p text := '0';
BEGIN
WHILE i < 1000 LOOP
p := p || '.' || i::text;
i := i + 1;
INSERT INTO comment (path) VALUES (p::ltree);
COMMIT;
END LOOP;
END
$$;
```
If index creation is delayed until after insertions, then the insertions
succeed but index creation
fails.
On Tue, Aug 12, 2025 at 5:44 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference: 19018
> Logged by: Joseph Silva
> Email address: dull.bananas0@gmail.com
> PostgreSQL version: 17.5
> Operating system: Fedora
> Description:
>
> If I run this, then the postgres process's memory usage approaches 6 GB, and
> the insertion when
> i=253 fails with "stack depth limit exceeded":
>
> ```
> CREATE EXTENSION ltree;
>
> CREATE TABLE comment (path ltree);
>
> CREATE INDEX ON comment USING gist (path);
>
> DO $$
> DECLARE
> i int := 1;
> p text := '0';
> BEGIN
> WHILE i < 1000 LOOP
> p := p || '.' || i::text;
> i := i + 1;
> INSERT INTO comment (path) VALUES (p::ltree);
> COMMIT;
> END LOOP;
> END
> $$;
> ```
>
> If index creation is delayed until after insertions, then the insertions
> succeed but index creation
> fails.
>
Thanks for reporting, I didn't analyze it fully but here is what I
have analyzed so far. While debugging I have noticed that it is
recursively trying to complete the previously incomplete split,
ideally there should not be any incomplete split because I am just
running this from a single sessions so there should not be any issue
in acquiring parent page lock and there is no crash so
GistFollowRight() must be cleared but it is not in some cases and its
keep recursively splitting until it hits the stack overflow [2]. So
this seems like somewhere we have missed to call
GistClearFollowRight() after splitting.
Then I tried to observed the relpages and it shows 270065 relpages for
comment_path_idx which was just 1 before executing this ANONYMOUS
block[2]
[1]
postgres[2882817]=# ANALYZE ;
ANALYZE
postgres[2882817]=# select relname, relpages from pg_class where
relname like '%comment%';
relname | relpages
------------------+----------
comment | 0
comment_path_idx | 1
(2 rows)
postgres[2882817]=# DO $$
DECLARE
i int := 1;
p text := '0';
BEGIN
WHILE i < 500 LOOP
p := p || '.' || i::text;
i := i + 1;
INSERT INTO comment (path) VALUES (p::ltree);
COMMIT;
END LOOP;
END
$$
ERROR: 54001: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth"
(currently 2048kB), after ensuring the platform's stack depth limit is
adequate.
CONTEXT: SQL statement "INSERT INTO comment (path) VALUES (p::ltree)"
PL/pgSQL function inline_code_block line 9 at SQL statement
LOCATION: check_stack_depth, stack_depth.c:99
postgres[2882817]=# select relname, relpages from pg_class where
relname like '%comment%';
relname | relpages
------------------+----------
comment | 35
comment_path_idx | 270065
(2 rows)
[2]
#0 check_stack_depth () at stack_depth.c:99
#1 0x000055659812bad5 in gistSplit (r=0x7f0c1d9dd008,
page=0x7f0c1e44d000 "\001", itup=0x5567a00a1650, len=3,
giststate=0x5565d74ce9f8) at gist.c:1463
#2 0x0000556598129720 in gistplacetopage (rel=0x7f0c1d9dd008,
freespace=0, giststate=0x5565d74ce9f8, buffer=453,
itup=0x7ffd5f39b800, ntup=2, oldoffnum=2, newblkno=0x0,
leftchildbuf=13010, splitinfo=0x7ffd5f39b790,
markfollowright=true, heapRel=0x7f0c1d9de928, is_build=false) at
gist.c:315
#3 0x000055659812b7eb in gistinserttuples (state=0x7ffd5f5993c0,
stack=0x5567a009a078, giststate=0x5565d74ce9f8, tuples=0x7ffd5f39b800,
ntup=2, oldoffnum=2
<--------clip------>
leftchild=12971, rightchild=12974, unlockbuf=true,
unlockleftchild=true) at gist.c:1337
#18642 0x000055659812ba52 in gistfinishsplit (state=0x7ffd5f5993c0,
stack=0x5565d7726968, giststate=0x5565d74ce9f8,
splitinfo=0x5565d7768848, unlockbuf=true)
at gist.c:1408
#18643 0x000055659812b855 in gistinserttuples (state=0x7ffd5f5993c0,
stack=0x5565d7726968, giststate=0x5565d74ce9f8, tuples=0x7ffd5f599300,
ntup=2, oldoffnum=1,
leftchild=12972, rightchild=12973, unlockbuf=true,
unlockleftchild=false) at gist.c:1337
#18644 0x000055659812ba52 in gistfinishsplit (state=0x7ffd5f5993c0,
stack=0x5565d7758388, giststate=0x5565d74ce9f8,
splitinfo=0x5565d77593e8, unlockbuf=false)
at gist.c:1408
#18645 0x000055659812b6e2 in gistfixsplit (state=0x7ffd5f5993c0,
giststate=0x5565d74ce9f8) at gist.c:1246
#18646 0x000055659812a61c in gistdoinsert (r=0x7f0c1d9dd008,
itup=0x5565d746c580, freespace=0, giststate=0x5565d74ce9f8,
heapRel=0x7f0c1d9de928, is_build=false)
--
Regards,
Dilip Kumar
Google
Re: BUG #19018: high memory usage and "stack depth limit exceeded", with GiST index on ltree
От
Arseniy Mukhin
Дата:
On Wed, Aug 13, 2025 at 7:46 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Aug 12, 2025 at 5:44 PM PG Bug reporting form > <noreply@postgresql.org> wrote: > > > > The following bug has been logged on the website: > > > > Bug reference: 19018 > > Logged by: Joseph Silva > > Email address: dull.bananas0@gmail.com > > PostgreSQL version: 17.5 > > Operating system: Fedora > > Description: > > > > If I run this, then the postgres process's memory usage approaches 6 GB, and > > the insertion when > > i=253 fails with "stack depth limit exceeded": > > > > ``` > > CREATE EXTENSION ltree; > > > > CREATE TABLE comment (path ltree); > > > > CREATE INDEX ON comment USING gist (path); > > > > DO $$ > > DECLARE > > i int := 1; > > p text := '0'; > > BEGIN > > WHILE i < 1000 LOOP > > p := p || '.' || i::text; > > i := i + 1; > > INSERT INTO comment (path) VALUES (p::ltree); > > COMMIT; > > END LOOP; > > END > > $$; > > ``` > > > > If index creation is delayed until after insertions, then the insertions > > succeed but index creation > > fails. > > > > Thanks for reporting, I didn't analyze it fully but here is what I > have analyzed so far. While debugging I have noticed that it is > recursively trying to complete the previously incomplete split, > ideally there should not be any incomplete split because I am just > running this from a single sessions so there should not be any issue > in acquiring parent page lock and there is no crash so > GistFollowRight() must be cleared but it is not in some cases and its > keep recursively splitting until it hits the stack overflow [2]. So > this seems like somewhere we have missed to call > GistClearFollowRight() after splitting. Hi, I managed to reproduce it too. I agree that the problem here is an endless split. I encountered similar issues while I was trying to insert big tuples into the gist index. And the main problem I think here is that GIST insert code does not limit index tuple size (ofc it could not be larger then page size). There is a macros GISTMaxIndexTupleSize, but it's not used. Probably there are some limits on the operator classes side, but it doesn't seem to work here. I added several logs while investigating it. So how stack overflow is happening in this case: During the insert that leads to stack overflow, we have a split. To finish the split we need to insert into the parent last two downlinks. Here the sizes of these downlinks (all sizes here are in bytes): 2025-08-13 17:38:38.614 MSK [829680] DEBUG: finish split: insert child left size: 3776, right size: 4016 You can see that it's a huge index tuples. At the moment of insert of the two new downlinks, the parent has 2 tuples in it: 2025-08-13 17:38:38.614 MSK [829680] DEBUG: number of tups on the page before split: 2 We want to add two more. But it does not fit into the page, so we have a parent split. Split algorithm decided that we need 3 pages in this split: 2025-08-13 17:38:38.614 MSK [829680] DEBUG: split during insert, children number: 3 And guess what, here are sizes of new downlinks that we have for these 3 pages: 2025-08-13 17:38:38.614 MSK [829680] DEBUG: downlink size: 3776 2025-08-13 17:38:38.614 MSK [829680] DEBUG: downlink size: 4016 2025-08-13 17:38:38.614 MSK [829680] DEBUG: downlink size: 4160 Here we can see that sizes of the last two downlinks that we will try to insert into the parent of the parent are 3776 and 4016. The same that we were trying to insert into the initial parent. In short, when we try to insert 3 huge tuples in the parent page, we have a new split, which results in new 3 huge tuples that we need to insert into the parent of the parent etc. This way we have neverending split. Here is a draft patch that checks index size before insert and during the split, and now the reproducer fails on the check. Best regards, Arseniy Mukhin
Вложения
On Wed, Aug 13, 2025 at 9:23 PM Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote: > > On Wed, Aug 13, 2025 at 7:46 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Aug 12, 2025 at 5:44 PM PG Bug reporting form > > <noreply@postgresql.org> wrote: > > > > > > The following bug has been logged on the website: > > > > > > Bug reference: 19018 > > > Logged by: Joseph Silva > > > Email address: dull.bananas0@gmail.com > > > PostgreSQL version: 17.5 > > > Operating system: Fedora > > > Description: > > > > > > If I run this, then the postgres process's memory usage approaches 6 GB, and > > > the insertion when > > > i=253 fails with "stack depth limit exceeded": > > > > > > ``` > > > CREATE EXTENSION ltree; > > > > > > CREATE TABLE comment (path ltree); > > > > > > CREATE INDEX ON comment USING gist (path); > > > > > > DO $$ > > > DECLARE > > > i int := 1; > > > p text := '0'; > > > BEGIN > > > WHILE i < 1000 LOOP > > > p := p || '.' || i::text; > > > i := i + 1; > > > INSERT INTO comment (path) VALUES (p::ltree); > > > COMMIT; > > > END LOOP; > > > END > > > $$; > > > ``` > > > > > > If index creation is delayed until after insertions, then the insertions > > > succeed but index creation > > > fails. > > > > > > > Thanks for reporting, I didn't analyze it fully but here is what I > > have analyzed so far. While debugging I have noticed that it is > > recursively trying to complete the previously incomplete split, > > ideally there should not be any incomplete split because I am just > > running this from a single sessions so there should not be any issue > > in acquiring parent page lock and there is no crash so > > GistFollowRight() must be cleared but it is not in some cases and its > > keep recursively splitting until it hits the stack overflow [2]. So > > this seems like somewhere we have missed to call > > GistClearFollowRight() after splitting. > > Hi, > > I managed to reproduce it too. I agree that the problem here is an > endless split. I encountered similar issues while I was trying to > insert big tuples into the gist index. And the main problem I think > here is that GIST insert code does not limit index tuple size (ofc it > could not be larger then page size). There is a macros > GISTMaxIndexTupleSize, but it's not used. Probably there are some > limits on the operator classes side, but it doesn't seem to work here. > > I added several logs while investigating it. So how stack overflow is > happening in this case: > > During the insert that leads to stack overflow, we have a split. To > finish the split we need to insert into the parent last two downlinks. > Here the sizes of these downlinks (all sizes here are in bytes): > > 2025-08-13 17:38:38.614 MSK [829680] DEBUG: finish split: insert > child left size: 3776, right size: 4016 > > You can see that it's a huge index tuples. > > At the moment of insert of the two new downlinks, the parent has 2 tuples in it: > 2025-08-13 17:38:38.614 MSK [829680] DEBUG: number of tups on the > page before split: 2 > > We want to add two more. But it does not fit into the page, so we have > a parent split. Split algorithm decided that we need 3 pages in this > split: > 2025-08-13 17:38:38.614 MSK [829680] DEBUG: split during insert, > children number: 3 > > And guess what, here are sizes of new downlinks that we have for these 3 pages: > 2025-08-13 17:38:38.614 MSK [829680] DEBUG: downlink size: 3776 > 2025-08-13 17:38:38.614 MSK [829680] DEBUG: downlink size: 4016 > 2025-08-13 17:38:38.614 MSK [829680] DEBUG: downlink size: 4160 > > Here we can see that sizes of the last two downlinks that we will try > to insert into the parent of the parent are 3776 and 4016. The same > that we were trying to insert into the initial parent. > > In short, when we try to insert 3 huge tuples in the parent page, we > have a new split, which results in new 3 huge tuples that we need to > insert into the parent of the parent etc. > > This way we have neverending split. > > Here is a draft patch that checks index size before insert and during > the split, and now the reproducer fails on the check. I think your root cause analysis looks valid to me, In my analysis I said that we had the F_FOLLOW_RIGHT flag set for some of the pages and I assumed we missed to clear that flag. But your analysis shows that we never completed the split and that's the reason this flag is not cleared. And I remember I debugged the second run, after I hit the stack overflow in the first run. So during the second run it was trying to finish the incomplete split because that was leftover from the first run. So yeah this analysis makes sense, and the fix too. -- Regards, Dilip Kumar Google
Re: BUG #19018: high memory usage and "stack depth limit exceeded", with GiST index on ltree
От
Sergei Kornilov
Дата:
Hello This reminded me of some very old bug reports, like mine: https://www.postgresql.org/message-id/15431-7a89470f7879bed4%40postgresql.org Still reproduces giving different results depending on the order of operations. The proposed patch makes the behavior thesame in both cases. Since this code path may be reachable by the user, instead of elog there should be ereport with a more user-friendly description.I would suggest something more like the text used for btree in a case like this. (ereport in _bt_check_third_page) ereport(ERROR, (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), errmsg("index row size %zu exceeds gist maximum %zu for index \"%s\""... errdetail("Index row references tuple (%u,%u) in relation \"%s\".",... errhint("Values larger than 1/2 of a buffer page cannot be indexed"... regards, Sergei
Re: BUG #19018: high memory usage and "stack depth limit exceeded", with GiST index on ltree
От
Arseniy Mukhin
Дата:
Hi, Dilip, Sergei, thank you for your feedback. After researching it more I think the fix is incorrect. It seems that we shouldn't limit index tuple size of leaf pages on the gist core level. Some operator classes can handle page size tuples on the leaf level and it doesn't break split (like trgm, hstore) because representation of leaf and not leaf index tuples are different. So you can have almost page sized index tuples on the leaf level and all inner level tuples will still be small enough to provide valid splitting. It means if we want to introduce some limitation for leaf tuples, we should do it in operator class implementation. For inner level tuples it's probably a good idea to introduce some limitation in the gist core, because no matter what opclass do you use, it doesn't make sense to have page size tuples. Speaking about the bug, I still think the main cause is a lack of index tuple size limitation, but we need to fix it in the operator class, not gist core. ltree inner page index tuple has next structure: * Non-Leaf * (len)(flag)(sign)(left_ltree)(right_ltree) * ALLTRUE: (len)(flag)(left_ltree)(right_ltree) Sign is a signature, it has a fixed size that you can set up while index creation (param sigsize). left_ltree, right_ltree are keys that we build index on. So if we want to have internal tuples that less than GISTMaxIndexTupleSize (~1/4 page size) than we should limit leaf index tuples keys, and max size for such keys should be: max_size = (GISTMaxIndexKeySize - sigsize) / 2 For default signature size and default page size maximum key size is 1008. It works only for single column indexes, for multicolumn we need a more strict limit, but it seems all operator classes are quite carefree about it, so probably if we fix at least a single column case it would be good enough. There is a new patch version with implementation of the above idea. Bug reproducer fails on the new check. Sergei's reproducer fails as well (most of the inserts in it now fail on the new check). I found an ereport about exceeding size limitation in gist code, so I use the same error message here too, it's more user friendly than the original message in v1. Unfortunately, there is another point that should be addressed. Upper limit for signature size is 2024, which results in max_size = 0 which means we can't insert any tuples with it. There is a regression test that fails because of it. I think the upper limit for signature should be decreased. But I don't know what to do with existing indexes that could have big signatures. What do you think? Best regards, Arseniy Mukhin