Re: Is SPI + Procedures (with COMMIT) inside a bgworker broken?

Поиск
Список
Период
Сортировка
От Fabrízio de Royes Mello
Тема Re: Is SPI + Procedures (with COMMIT) inside a bgworker broken?
Дата
Msg-id CAFcNs+pTiqVbPJD0Vywfb4H7LA7FKaemcWYce8sCcjH9AWKxNQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Is SPI + Procedures (with COMMIT) inside a bgworker broken?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers

On Mon, Sep 13, 2021 at 4:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> The direct cause of that is that SPI_execute() doesn't permit the called
> query to perform COMMIT/ROLLBACK, which is because most callers would fail
> to cope with that.  You can instruct SPI to allow that by replacing the
> SPI_execute() call with something like
>
>                 SPIExecuteOptions options;
>
>                 ...
>                 memset(&options, 0, sizeof(options));
>                 options.allow_nonatomic = true;
>
>                 ret = SPI_execute_extended(buf.data, &options);
>

I completely forgot about the SPI execute options... Thanks for the explanation!!!


> However, that's not enough to make this example work :-(.
> I find that it still fails inside the procedure's COMMIT,
> with
>
> 2021-09-13 15:14:54.775 EDT worker_spi[476310] ERROR:  portal snapshots (0) did not account for all active snapshots (1)
> 2021-09-13 15:14:54.775 EDT worker_spi[476310] CONTEXT:  PL/pgSQL function schema4.counted_proc() line 1 at COMMIT
>         SQL statement "CALL "schema4"."counted_proc"()"
>
> I think what this indicates is that worker_spi_main's cavalier
> management of the active snapshot isn't up to snuff for this
> use-case.  The error is coming from ForgetPortalSnapshots, which
> is expecting that all active snapshots are attached to Portals;
> but that one isn't.
>

That is exactly the root cause of all my investigation. 

At Timescale we have a scheduler (background worker) that launches another background worker to "execute a job", and by executing a job it means to call a function [1] or a procedure [2] directly without a SPI.

But now a user raised an issue about snapshots [3] and when I saw the code for the first time I tried to use SPI and it didn't work as expected.

Even tweaking worker_spi to execute the procedure without SPI by calling ExecuteCallStmt (attached) we end up with the same situation about the active snapshots:

2021-09-13 20:14:36.654 -03 [21483] LOG:  worker_spi worker 2 initialized with schema2.counted
2021-09-13 20:14:36.655 -03 [21484] LOG:  worker_spi worker 1 initialized with schema1.counted
2021-09-13 20:14:36.657 -03 [21483] ERROR:  portal snapshots (0) did not account for all active snapshots (1)
2021-09-13 20:14:36.657 -03 [21483] CONTEXT:  PL/pgSQL function schema2.counted_proc() line 1 at COMMIT
2021-09-13 20:14:36.657 -03 [21484] ERROR:  portal snapshots (0) did not account for all active snapshots (1)
2021-09-13 20:14:36.657 -03 [21484] CONTEXT:  PL/pgSQL function schema1.counted_proc() line 1 at COMMIT
2021-09-13 20:14:36.659 -03 [21476] LOG:  background worker "worker_spi" (PID 21483) exited with exit code 1
2021-09-13 20:14:36.659 -03 [21476] LOG:  background worker "worker_spi" (PID 21484) exited with exit code 1


> Probably the most appropriate fix is to make worker_spi_main
> set up a Portal to run the query inside of.  There are other
> bits of code that are not happy if they're not inside a Portal,
> so if you're hoping to run arbitrary SQL this way, sooner or
> later you're going to have to cross that bridge.
>

I started digging with it [4] by creating a Portal from scratch to execute the Function or Procedure and it worked.

We're wondering if we can avoid the parser for PortalRun, can we??

Regards,
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: postgres.h included from relcache.h - but removing it breaks pg_upgrade
Следующее
От: Andres Freund
Дата:
Сообщение: Re: pgsql: Deduplicate choice of horizon for a relation procarray.c.