Thread: Non-robust plpgsql_trap test

Non-robust plpgsql_trap test

From: Tom Lane

I've noticed a few buildfarm failures similar to [1]:

# diff -U3 /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out
# --- /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out    2026-04-21 04:22:01.030204342-0300
# +++ /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out    2026-04-21 04:29:54.795187855-0300
# @@ -155,7 +155,7 @@
#  begin;
#  set statement_timeout to 1000;
#  select trap_timeout();
# -NOTICE:  nyeah nyeah, can't stop me
# +NOTICE:  caught others?
#  ERROR:  end of function
#  CONTEXT:  PL/pgSQL function trap_timeout() line 15 at RAISE
#  rollback;
not ok 11    - plpgsql_trap                              502 ms

which is coming from unexpected behavior of this bit of plpgsql
code:

  begin
    -- we assume this will take longer than 1 second:
    select count(*) into x from generate_series(1, 1_000_000_000_000);
  exception
    when others then
      raise notice 'caught others?';
    when query_canceled then
      raise notice 'nyeah nyeah, can''t stop me';
  end;

The light bulb went on when I noticed a nearby failure from the same
machine that was clearly traceable to out-of-disk-space.  What
happened here, I have no doubt, was that the "from generate_series"
bit tried to make a large temporary file, ran out of space, and threw
an appropriate error, causing us to take the "wrong" exception
handler.
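
For readers who want to see the mechanism in isolation, here is a minimal sketch, not taken from the test suite. It assumes a stand-in error raised with the `disk_full` condition name in place of a real temporary-file failure; the message text is made up. In PL/pgSQL, `others` matches every error condition except QUERY_CANCELED and ASSERT_FAILURE, so any resource error raised inside the block is routed to the first handler rather than the second.

```sql
do $$
begin
  begin
    -- Stand-in for the temp-file write failing (hypothetical message text).
    raise exception 'simulated out-of-disk-space' using errcode = 'disk_full';
  exception
    when others then
      -- Taken for disk_full and any other ordinary error.
      raise notice 'caught others?';
    when query_canceled then
      -- Only reached when the statement is actually canceled.
      raise notice 'nyeah nyeah, can''t stop me';
  end;
end $$;
```

Run as written, this should take the "when others" branch, which is the same path the failing buildfarm run appears to have taken.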

Proposal:

1. Replace that query with something not so resource-intensive.
I'm not really sure why we didn't just use "perform pg_sleep(10)".
Maybe it didn't exist or didn't reliably wait 10 seconds at the
time, but it does now.

2. Adjust the "when others" handler to report the actual error,
to make this sort of thing easier to debug next time.
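
A rough sketch of what the two changes might look like together, offered as an illustration rather than the actual patch. The function name `trap_timeout_sketch` is hypothetical, the body is reduced to the block under discussion, and the wording of the new notice is an assumption. `sqlerrm` and `sqlstate` are the standard PL/pgSQL variables available inside an exception handler.

```sql
create function trap_timeout_sketch() returns void as $$
begin
  begin
    -- pg_sleep needs no temp files or large memory, so the only thing
    -- expected to interrupt it is the statement timeout.
    perform pg_sleep(10);
  exception
    when others then
      -- Report what was actually caught, to ease debugging next time.
      raise notice 'caught others: % (SQLSTATE %)', sqlerrm, sqlstate;
    when query_canceled then
      raise notice 'nyeah nyeah, can''t stop me';
  end;
  raise exception 'end of function';
end $$ language plpgsql;

begin;
set statement_timeout to 1000;
select trap_timeout_sketch();
rollback;
```

One caveat with reporting the error text: if the "when others" branch is ever taken, the notice will vary with the underlying failure, which is acceptable here because that branch already means the test has failed.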

            regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=caiman&dt=2026-04-21%2007%3A21%3A57



Re: Non-robust plpgsql_trap test

From: Andrew Dunstan

On 2026-04-21 Tu 9:54 AM, Tom Lane wrote:
> I've noticed a few buildfarm failures similar to [1]:
>
> # diff -U3 /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out
> # --- /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out    2026-04-21 04:22:01.030204342-0300
> # +++ /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out    2026-04-21 04:29:54.795187855-0300
> # @@ -155,7 +155,7 @@
> #  begin;
> #  set statement_timeout to 1000;
> #  select trap_timeout();
> # -NOTICE:  nyeah nyeah, can't stop me
> # +NOTICE:  caught others?
> #  ERROR:  end of function
> #  CONTEXT:  PL/pgSQL function trap_timeout() line 15 at RAISE
> #  rollback;
> not ok 11    - plpgsql_trap                              502 ms
>
> which is coming from unexpected behavior of this bit of plpgsql
> code:
>
>    begin
>      -- we assume this will take longer than 1 second:
>      select count(*) into x from generate_series(1, 1_000_000_000_000);
>    exception
>      when others then
>        raise notice 'caught others?';
>      when query_canceled then
>        raise notice 'nyeah nyeah, can''t stop me';
>    end;
>
> The light bulb went on when I noticed a nearby failure from the same
> machine that was clearly traceable to out-of-disk-space.  What
> happened here, I have no doubt, was that the "from generate_series"
> bit tried to make a large temporary file, ran out of space, and threw
> an appropriate error, causing us to take the "wrong" exception
> handler.
>
> Proposal:
>
> 1. Replace that query with something not so resource-intensive.
> I'm not really sure why we didn't just use "perform pg_sleep(10)".
> Maybe it didn't exist or didn't reliably wait 10 seconds at the
> time, but it does now.
>
> 2. Adjust the "when others" handler to report the actual error,
> to make this sort of thing easier to debug next time.
>
>             regards, tom lane
>
> [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=caiman&dt=2026-04-21%2007%3A21%3A57


Sounds good.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com