Обсуждение: Autonomous transactions 2023, WIP

Поиск

Список

Период

Сортировка

Autonomous transactions 2023, WIP

От

Ivan Kush

Дата:

15 декабря 2023 г., 11:28:12


Hello. I'm working on the support of autonomous transactions in Postgres.
Could you please make a preliminary review and give advices (see section 
#TODO)

# Patch
v0001-Autonomous-transactions.patch

# Introduction
This patch implements Autonomous Transactions for PL/pgSQL.
Autonomous transaction is a transaction that can be succesfully commited 
even if base transaction is rolled back. Common use cases: 
logging/auditing/tracking progress in tables, so that information about 
the execution attempt is preserved even when the main transaction is 
rolled back — for example, due to an error.

# Glossary
Session - entity that groups multiple related SQL commands into a single 
transaction.
Main session (backend, foreground session) - session through which the 
user interacts.
Main transaction (parent) - transaction that runs in the main session.
Autonomous session - session that performs an offline transaction. It 
starts from the main session.
Autonomous transaction - independent transaction that runs inside an 
autonomous session.
Autonomous function - function with the pragma AUTONOMOUS_TRANSACTION. 
When it is executed, an autonomous session is created in it.
Background worker - background process that performs some actions in the 
background, without the user's participation.
dsm - dynamic shared memory.
shm_mq - shared memory message queue.

# Internals
This patch introduces a "pragma AUTONOMOUS_TRANSACTION" to functions. 
When one such function is executed all (at the current time not all, 
WIP) statements from it are executed in an autonomous session.

* Example *
*SQL-request:*

```sql
CREATE TABLE tbl (a int);
CREATE OR REPLACE FUNCTION func() RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
   PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
   INSERT INTO tbl VALUES (1);
END;
$$;

START TRANSACTION;
SELECT func();
ROLLBACK;

SELECT * FROM tbl;

DROP FUNCTION func;
DROP TABLE tbl;
```

*Output:*

```bash
  a
---
  1
(1 row)
```


For each backend the patch lazily creates a pool of autonomous sessions. 
When backend calls autonomous function, backend takes one autonomous 
session from this pool and sends there function's statements for 
execution. When execution is finished backend returns session to pool. 
Lazily means that pool is created only when first autonomous session is 
needed.
Backend and autonomous session communicate with the help of Postgres 
client-server protocol. Messages are sent through dynamic shared memory. 
Execution of backend and autonomous session is synchronous: autonomous 
session waits for messages from backendand backend waits for messages 
from autonomous session.
Autonomous session uses Background workers internally. As it's a 
separate process, it contains caches, etc. In order to prevent infinite 
grow of resources usage we reset all caches by timeout using restart of 
autonomous sessions. This timeout is set by guc setting 
autonomous_session_lifetime.
Source code contains more detailed comments.

# Alternatives
At the current time for this functionality may be uses extensions: 
dblink and pg_background. But they have shortcomings:
     1) not in the Postgres core, they are extensions
     2) lower performance. Each call creates new process that is 
destroyed immediately after transaction is finished.

# TODO
Could you please give advices how implement public pool shared between 
all backends?
1) Support execution of remaining statements in autonomous sessions.
2) Public pool shared between all backends. At the current time for each 
backend private pool is created.

# Tests
Implementation contains many regression tests of varying complexity, 
which check supported features.

# Platform
This patch was checkouted from tag 15.4. This is WIP. I've developed in 
Linux, code doesn't contain platfrom-specific code, only Postgres 
internal data structures and functions.

# Documentation
Regression tests contain many examples
* Describe the effect your patch has on performance, if any.
It adds a new feature and increase performance compared to dblink and 
pg_background

# History
## 1st feature requests and discussions in pgsql-hackers (without code)
1) 2008
https://www.postgresql.org/message-id/flat/1A6E6D554222284AB25ABE3229A9276271549A%40nrtexcus702.int.asurion.com
2) 2010
https://www.postgresql.org/message-id/flat/AANLkTi%3DuogmYxLKWmUfFSg-Ki2bejsQiO2g5GTMxvdW2%40mail.gmail.com
3) 2011
https://www.postgresql.org/message-id/flat/1303399444.9126.8.camel%40vanquo.pezone.net
4) 2011
https://wiki.postgresql.org/wiki/Autonomous_subtransactions
5) 2011
https://www.postgresql.org/message-id/flat/20111218082812.GA14355%40leggeri.gi.lan
https://wiki.postgresql.org/wiki/Autonomous_subtransactions
## Implementaion
1) 2014, Rajeev Rastogi, implementation based on subtransactions
https://www.postgresql.org/message-id/flat/BF2827DCCE55594C8D7A8F7FFD3AB7713DDDEF59%40SZXEML508-MBX.china.huawei.com
2) 2015, Rajeev Rastogi, new theme, continues discussion about semantics 
and syntax of autonomous transactions
https://www.postgresql.org/message-id/flat/BF2827DCCE55594C8D7A8F7FFD3AB7715990499A%40szxeml521-mbs.china.huawei.com
3) 2016, Peter Eisentraut, implementation based on background workers
https://www.postgresql.org/message-id/flat/659a2fce-b6ee-06de-05c0-c8ed6a01979e%402ndquadrant.com

# Summary
* Add pragma AUTONOMOUS_TRANSACTION in the functions. When function 
contains this pragma, the it's executed autonomously
* Background workers are used to run autonomous sessions.
* Synchronous execution between backend and autonomous session
* Postgres Client-Server Protocol is used to communicate between them
* Pool of autonomous sessions. Pool is created lazily.
* Infinite nested calls of autonomous functions are allowed. Limited 
only by computer resources.
* If another 2nd autonomous function is called in the 1st autonomous 
function, the 2nd is executed at the beginning, and then the 1st 
continues execution.

-- 
Best wishes,
Ivan Kush
Tantor Labs LLC

Вложения

0001-Autonomous-transactions.patch

Re: Autonomous transactions 2023, WIP

От

Pavel Stehule

Дата:

21 декабря 2023 г., 09:35:47

although I like the idea related to autonomous transactions, I don't think so this way is the best

1. The solution based on background workers looks too fragile - it can be easy to exhaust all background workers, and because this feature is proposed mainly for logging, then it is a little bit dangerous, because it means loss of possibility of logging.

2. although the Oracle syntax is interesting, and I proposed PRAGMA more times, it doesn't allow this functionality in other PL

I don't propose exactly firebird syntax https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html, but I think this solution is better than ADA's PRAGMAs. I can imagine some special flag for function like

CREATE OR REPLACE FUNCTION ...

AS $$

$$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;

as another possibility.

3. Heikki wrote about the possibility to support threads in Postgres. One significant part of this project is elimination of global variables. It can be common with autonomous transactions.

Surely, the first topic should be the method of implementation. Maybe I missed it, but there is no agreement of background worker based.

Regards

Pavel

Re: Autonomous transactions 2023, WIP

От

"Andrey M. Borodin"

Дата:

21 декабря 2023 г., 11:26:43

> On 15 Dec 2023, at 16:28, Ivan Kush <ivan.kush@tantorlabs.com> wrote:
>
>
>
> Hello. I'm working on the support of autonomous transactions in Postgres.
>

> # Summary
> * Add pragma AUTONOMOUS_TRANSACTION in the functions. When function
> contains this pragma, the it's executed autonomously
> * Background workers are used to run autonomous sessions.
> * Synchronous execution between backend and autonomous session
> * Postgres Client-Server Protocol is used to communicate between them
> * Pool of autonomous sessions. Pool is created lazily.
> * Infinite nested calls of autonomous functions are allowed. Limited
> only by computer resources.
> * If another 2nd autonomous function is called in the 1st autonomous
> function, the 2nd is executed at the beginning, and then the 1st
> continues execution.

Cool, looks interesting! As far as I know EnterpriseDB, Postgres Pro and OracleDB have this functionality. So, seems
likethe stuff is in demand. 
How does your version compare to this widely used databases? Is anyone else using backgroud connections? Which syntax
isused by other DBMS'? 

Looking into the code it seems like an easy way for PL\pgSQL function to have a client connection. I think this might
workfor other PLs too. 

The patch touches translations ( src/backend/po/). I think we typically do not do this in code patches, because this
workis better handled by translators. 

Best regards, Andrey Borodin.

Re: Autonomous transactions 2023, WIP

От

Ivan Kush

Дата:

24 декабря 2023 г., 11:27:47

 > 1. The solution based on background workers looks too fragile - it 
can be easy to exhaust all background workers, and because this feature 
is proposed mainly for logging, then it is a little bit dangerous, 
because it means loss of possibility of logging.

1. We could add types for background workers. For each type add 
guc-settings, like max workers of each type.
For examaple, for `common` leave `max_worker_processes` setting for 
backward compatibility
enum bgw_type {
   common,
   autonomous,
   etc....
};

 > 2. although the Oracle syntax is interesting, and I proposed PRAGMA 
more times,  it doesn't allow this functionality in other PL

2. Add `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in `DECLARE`? `BEGIN 
AUTONOMOUS`.
It shows immediately that we are in autonomous session, no need to 
search in subsequent lines for keyword.

```
CREATE FUNCTION foo() RETURNS void AS $$
BEGIN AUTONOMOUS
   INSERT INTO tbl VALUES (1);
   BEGIN AUTONOMOUS
    ....
    END;
END;
$$ LANGUAGE plpgsql;
```

 > CREATE OR REPLACE FUNCTION ...
 > AS $$
 > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;

The downside with the keyword in function declaration, that we will not 
be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or 
`BEGIN AUTONOMOUS` it's possible to create them.

```
-- BEGIN AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
   INSERT INTO tbl VALUES (1);
   BEGIN AUTONOMOUS
     INSERT INTO tbl VALUES (2);
   END;
END;
$$ LANGUAGE plpgsql;

-- or PRAGMA AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
   INSERT INTO tbl VALUES (1);
   BEGIN
   DECLARE AUTONOMOUS_TRANSACTION;
     INSERT INTO tbl VALUES (2);
   END;
END;
$$ LANGUAGE plpgsql;

START TRANSACTION;
foo();
ROLLBACK;
```

```
Output:
2
```

 > it doesn't allow this functionality in other PL

I didn't work out on other PLs at the current time, but...

## Python

In plpython we could use context managers, like was proposed in Peter's 
patch. ```

with plpy.autonomous() as a:
     a.execute("INSERT INTO tbl VALUES (1) ");

```

## Perl

I don't programm in Perl. But googling shows Perl supports subroutine 
attributes. Maybe add `autonomous` attribute for autonomous execution?

```
sub foo :autonomous {
}
```

https://www.perl.com/article/untangling-subroutine-attributes/

 > Heikki wrote about the possibility to support threads in Postgres.

3. Do you mean this thread?
https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
Thanks for info. Will watch it. Unfortunately it takes many years to 
implement threads =(

 > Surely, the first topic should be the method of implementation. Maybe 
I missed it, but there is no agreement of background worker based.
I agree. No consensus at the current time.
Pros of bgworkers are:
1. this entity is already in Postgres.
2. possibility of asynchronous execution of autonomous session in the 
future. Like in pg_background extension. For asynchronous execution we 
need a separate process, bgworkers are this separate process.

Also maybe later create autonomous workers themselves without using 
bgworkers internally: launch of separate process, etc. But I think will 
be many common code with bgworkers.

On 21.12.2023 12:35, Pavel Stehule wrote:
> Hi
>
> although I like the idea related to autonomous transactions, I don't 
> think so this way is the best
>
> 1. The solution based on background workers looks too fragile - it can 
> be easy to exhaust all background workers, and because this feature is 
> proposed mainly for logging, then it is a little bit dangerous, 
> because it means loss of possibility of logging.
>
> 2. although the Oracle syntax is interesting, and I proposed PRAGMA 
> more times,  it doesn't allow this functionality in other PL
>
> I don't propose exactly  firebird syntax 
> https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html, 
> but I think this solution is better than ADA's PRAGMAs. I can imagine 
> some special flag for function like
>
> CREATE OR REPLACE FUNCTION ...
> AS $$
> $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;
>
> as another possibility.
>
> 3. Heikki wrote about the possibility to support threads in Postgres. 
> One significant part of this project is elimination of global 
> variables. It can be common with autonomous transactions.
>
> Surely, the first topic should be the method of implementation. Maybe 
> I missed it, but there is no agreement of background worker based.
>
> Regards
>
> Pavel
>
>
-- 
Best wishes,
Ivan Kush
Tantor Labs LLC

Re: Autonomous transactions 2023, WIP

От

Ivan Kush

Дата:

24 декабря 2023 г., 11:32:48

> Is anyone else using backgroud connections?

Don't know at the current time. Maybe EnterpriseDB uses bgworkers as 
Peter Eisentraut works there currently (LinkedIn  says =)) And in 2016 
he has proposed a patch with autonomous transactions with bgworkers.
https://www.postgresql.org/message-id/flat/659a2fce-b6ee-06de-05c0-c8ed6a01979e%402ndquadrant.com

 > Which syntax is used by other DBMS'?

Main databases use:
1) PRAGMA in block declaration: Oracle, EnterpriseDB, this patch
2) AUTONOMOUS keyword near BEGIN keyword: PostgresPro, SAP HANA
3) AUTONOMOUS keyword in function declaration: IBM DB2
4) сompletely new syntax of autonomous block: Firebird
1 and 2 cases are the same, autonomicity by sub-blocks. Difference only 
in syntax, added to existing block definition
3 case autonomicity only by function (as keyword in function declaration)
4 case should we add completely new block definitions?

# Oracle

Uses PRAGMA AUTONOMOUS_TRANSACTION

```
CREATE FUNCTION foo() RETURNS void AS $$
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
   INSERT INTO tbl VALUES (1);
END;
$$ LANGUAGE plpgsql;
```

https://docs.oracle.com/cd/B13789_01/appdev.101/b10807/13_elems002.htm


# EnterpriseDB

Uses PRAGMA AUTONOMOUS_TRANSACTION; as in Oracle

```
CREATE FUNCTION foo() RETURNS void AS $$
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
   INSERT INTO tbl VALUES (1);
END;
$$ LANGUAGE plpgsql;
```


https://www.enterprisedb.com/docs/epas/latest/application_programming/epas_compat_spl/06_transaction_control/03_pragma_autonomous_transaction/


# PostgresPro

* plpgsql
Block construction in PL/pgSQL is extended by the optional autonomous 
keyword.

```
CREATE FUNCTION foo() RETURNS void AS $$
BEGIN AUTONOMOUS
   INSERT INTO tbl VALUES (1);
     BEGIN AUTONOMOUS
       ....
     END;
END;
$$ LANGUAGE plpgsql;
```

https://postgrespro.com/docs/enterprise/15/ch16s04

* plpython

autonomous method that can be used in the WITH clause to start an 
autonomous transaction

```
with plpy.autonomous() as a:
     a.execute("INSERT INTO tbl VALUES (1);")
```

https://postgrespro.com/docs/enterprise/15/ch16s05


# IBM DB2

AUTONOMOUS keyword in function declaration

```
CREATE PROCEDURE foo()
AUTONOMOUS
LANGUAGE SQL
BEGIN
   BEGIN AUTONOMOUS TRANSACTION;
     INSERT INTO tbl VALUES (1);
   END:
END;
$$ LANGUAGE plpgsql;
```

https://github.com/IBM/db2-samples/blob/master/admin_scripts/autonomous_transaction.db2
https://subscription.packtpub.com/book/programming/9781849683968/1/ch01lvl1sec09/using-autonomous-transactions


# SAP HANA

Also AUTONOMOUS_TRANSACTION option for blocks

```
CREATE PROCEDURE foo() LANGUAGE SQLSCRIPT AS
BEGIN
   BEGIN AUTONOMOUS TRANSACTION
     INSERT INTO tbl VALUES (1);
   END;
END;
```

https://help.sap.com/docs/SAP_HANA_PLATFORM/de2486ee947e43e684d39702027f8a94/4ad70daee8b64b90ab162565ed6f73ef.html

# Firebird

Completely new block definition `IN AUTONOMOUS TRANSACTION DO`

```

CREATE PROCEDURE foo() AS
BEGIN
   IN AUTONOMOUS TRANSACTION DO
     INSERT INTO tbl VALUES (1);
   END;
END;

```

https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html


On 21.12.2023 14:26, Andrey M. Borodin wrote:
>
>> On 15 Dec 2023, at 16:28, Ivan Kush <ivan.kush@tantorlabs.com> wrote:
>>
>>
>>
>> Hello. I'm working on the support of autonomous transactions in Postgres.
>>
>> # Summary
>> * Add pragma AUTONOMOUS_TRANSACTION in the functions. When function
>> contains this pragma, the it's executed autonomously
>> * Background workers are used to run autonomous sessions.
>> * Synchronous execution between backend and autonomous session
>> * Postgres Client-Server Protocol is used to communicate between them
>> * Pool of autonomous sessions. Pool is created lazily.
>> * Infinite nested calls of autonomous functions are allowed. Limited
>> only by computer resources.
>> * If another 2nd autonomous function is called in the 1st autonomous
>> function, the 2nd is executed at the beginning, and then the 1st
>> continues execution.
> Cool, looks interesting! As far as I know EnterpriseDB, Postgres Pro and OracleDB have this functionality. So, seems
likethe stuff is in demand.
 
> How does your version compare to this widely used databases? Is anyone else using backgroud connections? Which syntax
isused by other DBMS'?
 
>
> Looking into the code it seems like an easy way for PL\pgSQL function to have a client connection. I think this might
workfor other PLs too.
 
>
> The patch touches translations ( src/backend/po/). I think we typically do not do this in code patches, because this
workis better handled by translators.
 
>
>
> Best regards, Andrey Borodin.

-- 
Best wishes,
Ivan Kush
Tantor Labs LLC

Re: Autonomous transactions 2023, WIP

От

Pavel Stehule

Дата:

24 декабря 2023 г., 12:38:14

ne 24. 12. 2023 v 12:27 odesílatel Ivan Kush <ivan.kush@tantorlabs.com> napsal:

> 1. The solution based on background workers looks too fragile - it
can be easy to exhaust all background workers, and because this feature
is proposed mainly for logging, then it is a little bit dangerous,
because it means loss of possibility of logging.

1. We could add types for background workers. For each type add
guc-settings, like max workers of each type.
For examaple, for `common` leave `max_worker_processes` setting for
backward compatibility
enum bgw_type {
   common,
   autonomous,
   etc....
};

Can you show some benchmarks? I don't like this system too much but maybe it can work enough.

Still I am interested in possible use cases. If it should be used only for logging, then we can implement something less generic, but surely with better performance and stability. Logging to tables is a little bit outdated.

Regards

Pavel

> 2. although the Oracle syntax is interesting, and I proposed PRAGMA
more times, it doesn't allow this functionality in other PL

2. Add `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in `DECLARE`? `BEGIN
AUTONOMOUS`.
It shows immediately that we are in autonomous session, no need to
search in subsequent lines for keyword.

```
CREATE FUNCTION foo() RETURNS void AS $$
BEGIN AUTONOMOUS
   INSERT INTO tbl VALUES (1);
   BEGIN AUTONOMOUS
    ....
    END;
END;
$$ LANGUAGE plpgsql;
```

> CREATE OR REPLACE FUNCTION ...
> AS $$
> $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;

The downside with the keyword in function declaration, that we will not
be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or
`BEGIN AUTONOMOUS` it's possible to create them.

```
-- BEGIN AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
   INSERT INTO tbl VALUES (1);
   BEGIN AUTONOMOUS
    INSERT INTO tbl VALUES (2);
   END;
END;
$$ LANGUAGE plpgsql;

-- or PRAGMA AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
   INSERT INTO tbl VALUES (1);
   BEGIN
   DECLARE AUTONOMOUS_TRANSACTION;
    INSERT INTO tbl VALUES (2);
   END;
END;
$$ LANGUAGE plpgsql;

START TRANSACTION;
foo();
ROLLBACK;
```

```
Output:
2
```

> it doesn't allow this functionality in other PL

I didn't work out on other PLs at the current time, but...

## Python

In plpython we could use context managers, like was proposed in Peter's
patch. ```

with plpy.autonomous() as a:
     a.execute("INSERT INTO tbl VALUES (1) ");

```

## Perl

I don't programm in Perl. But googling shows Perl supports subroutine
attributes. Maybe add `autonomous` attribute for autonomous execution?

```
sub foo :autonomous {
}
```

https://www.perl.com/article/untangling-subroutine-attributes/

> Heikki wrote about the possibility to support threads in Postgres.

3. Do you mean this thread?
https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
Thanks for info. Will watch it. Unfortunately it takes many years to
implement threads =(

> Surely, the first topic should be the method of implementation. Maybe
I missed it, but there is no agreement of background worker based.
I agree. No consensus at the current time.
Pros of bgworkers are:
1. this entity is already in Postgres.
2. possibility of asynchronous execution of autonomous session in the
future. Like in pg_background extension. For asynchronous execution we
need a separate process, bgworkers are this separate process.

Also maybe later create autonomous workers themselves without using
bgworkers internally: launch of separate process, etc. But I think will
be many common code with bgworkers.

On 21.12.2023 12:35, Pavel Stehule wrote:
> Hi
>
> although I like the idea related to autonomous transactions, I don't
> think so this way is the best
>
> 1. The solution based on background workers looks too fragile - it can
> be easy to exhaust all background workers, and because this feature is
> proposed mainly for logging, then it is a little bit dangerous,
> because it means loss of possibility of logging.
>
> 2. although the Oracle syntax is interesting, and I proposed PRAGMA
> more times, it doesn't allow this functionality in other PL
>
> I don't propose exactly firebird syntax
> https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html,
> but I think this solution is better than ADA's PRAGMAs. I can imagine
> some special flag for function like
>
> CREATE OR REPLACE FUNCTION ...
> AS $$
> $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;
>
> as another possibility.
>
> 3. Heikki wrote about the possibility to support threads in Postgres.
> One significant part of this project is elimination of global
> variables. It can be common with autonomous transactions.
>
> Surely, the first topic should be the method of implementation. Maybe
> I missed it, but there is no agreement of background worker based.
>
> Regards
>
> Pavel
>
>
--
Best wishes,
Ivan Kush
Tantor Labs LLC

Re: Autonomous transactions 2023, WIP

От

Ivan Kush

Дата:

31 декабря 2023 г., 14:15:33

On 24.12.2023 15:38, Pavel Stehule wrote:
> Can you show some benchmarks? I don't like this system too much but 
> maybe it can work enough.
>
> Still I am interested in possible use cases. If it should be used only 
> for logging, then we can implement something less generic, but surely 
> with better performance and stability. Logging to tables is a little 
> bit outdated.
>
> Regards
>
> Pavel

All use cases of pg_background, except asynchronous execution. If later 
add asynchronous execution, then all =)

For example, also:

* conversion from Oracle's `PRAGMA AUTONOMOUS` to Postgres.

* possibility to create functions that calls utility statements, like 
VACUUM, etc.

I don't have good benchmarks now. Some simple, like many INSERTs. Pool 
gives advantage, more tps compared to pg_background with increasing 
number of backends.

The main advantage over pg_background is pool of workers. In this patch 
separate pool is created for each backend. At the current time I'm 
coding one shared pool for all backends.

>
>
>      > 2. although the Oracle syntax is interesting, and I proposed
>     PRAGMA
>     more times,  it doesn't allow this functionality in other PL
>
>     2. Add `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in `DECLARE`?
>     `BEGIN
>     AUTONOMOUS`.
>     It shows immediately that we are in autonomous session, no need to
>     search in subsequent lines for keyword.
>
>     ```
>     CREATE FUNCTION foo() RETURNS void AS $$
>     BEGIN AUTONOMOUS
>        INSERT INTO tbl VALUES (1);
>        BEGIN AUTONOMOUS
>         ....
>         END;
>     END;
>     $$ LANGUAGE plpgsql;
>     ```
>
>      > CREATE OR REPLACE FUNCTION ...
>      > AS $$
>      > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;
>
>     The downside with the keyword in function declaration, that we
>     will not
>     be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or
>     `BEGIN AUTONOMOUS` it's possible to create them.
>
>     ```
>     -- BEGIN AUTONOMOUS
>
>     CREATE FUNCTION foo() RETURNS void AS $$
>     BEGIN
>        INSERT INTO tbl VALUES (1);
>        BEGIN AUTONOMOUS
>          INSERT INTO tbl VALUES (2);
>        END;
>     END;
>     $$ LANGUAGE plpgsql;
>
>
>     -- or PRAGMA AUTONOMOUS
>
>     CREATE FUNCTION foo() RETURNS void AS $$
>     BEGIN
>        INSERT INTO tbl VALUES (1);
>        BEGIN
>        DECLARE AUTONOMOUS_TRANSACTION;
>          INSERT INTO tbl VALUES (2);
>        END;
>     END;
>     $$ LANGUAGE plpgsql;
>
>
>     START TRANSACTION;
>     foo();
>     ROLLBACK;
>     ```
>
>     ```
>     Output:
>     2
>     ```
>
>      > it doesn't allow this functionality in other PL
>
>     I didn't work out on other PLs at the current time, but...
>
>     ## Python
>
>     In plpython we could use context managers, like was proposed in
>     Peter's
>     patch. ```
>
>     with plpy.autonomous() as a:
>          a.execute("INSERT INTO tbl VALUES (1) ");
>
>     ```
>
>     ## Perl
>
>     I don't programm in Perl. But googling shows Perl supports subroutine
>     attributes. Maybe add `autonomous` attribute for autonomous execution?
>
>     ```
>     sub foo :autonomous {
>     }
>     ```
>
>     https://www.perl.com/article/untangling-subroutine-attributes/
>
>
>      > Heikki wrote about the possibility to support threads in Postgres.
>
>     3. Do you mean this thread?
>     https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
>     Thanks for info. Will watch it. Unfortunately it takes many years to
>     implement threads =(
>
>      > Surely, the first topic should be the method of implementation.
>     Maybe
>     I missed it, but there is no agreement of background worker based.
>     I agree. No consensus at the current time.
>     Pros of bgworkers are:
>     1. this entity is already in Postgres.
>     2. possibility of asynchronous execution of autonomous session in the
>     future. Like in pg_background extension. For asynchronous
>     execution we
>     need a separate process, bgworkers are this separate process.
>
>     Also maybe later create autonomous workers themselves without using
>     bgworkers internally: launch of separate process, etc. But I think
>     will
>     be many common code with bgworkers.
>
>
>     On 21.12.2023 12:35, Pavel Stehule wrote:
>     > Hi
>     >
>     > although I like the idea related to autonomous transactions, I
>     don't
>     > think so this way is the best
>     >
>     > 1. The solution based on background workers looks too fragile -
>     it can
>     > be easy to exhaust all background workers, and because this
>     feature is
>     > proposed mainly for logging, then it is a little bit dangerous,
>     > because it means loss of possibility of logging.
>     >
>     > 2. although the Oracle syntax is interesting, and I proposed PRAGMA
>     > more times,  it doesn't allow this functionality in other PL
>     >
>     > I don't propose exactly  firebird syntax
>     >
>     https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html,
>
>     > but I think this solution is better than ADA's PRAGMAs. I can
>     imagine
>     > some special flag for function like
>     >
>     > CREATE OR REPLACE FUNCTION ...
>     > AS $$
>     > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;
>     >
>     > as another possibility.
>     >
>     > 3. Heikki wrote about the possibility to support threads in
>     Postgres.
>     > One significant part of this project is elimination of global
>     > variables. It can be common with autonomous transactions.
>     >
>     > Surely, the first topic should be the method of implementation.
>     Maybe
>     > I missed it, but there is no agreement of background worker based.
>     >
>     > Regards
>     >
>     > Pavel
>     >
>     >
>     -- 
>     Best wishes,
>     Ivan Kush
>     Tantor Labs LLC
>
-- 
Best wishes,
Ivan Kush
Tantor Labs LLC

Re: Autonomous transactions 2023, WIP

От

Pavel Stehule

Дата:

01 января 2024 г., 06:47:21

ne 31. 12. 2023 v 15:15 odesílatel Ivan Kush <ivan.kush@tantorlabs.com> napsal:

On 24.12.2023 15:38, Pavel Stehule wrote:
> Can you show some benchmarks? I don't like this system too much but
> maybe it can work enough.
>
> Still I am interested in possible use cases. If it should be used only
> for logging, then we can implement something less generic, but surely
> with better performance and stability. Logging to tables is a little
> bit outdated.
>
> Regards
>
> Pavel

All use cases of pg_background, except asynchronous execution. If later
add asynchronous execution, then all =)

For example, also:

* conversion from Oracle's `PRAGMA AUTONOMOUS` to Postgres.

* possibility to create functions that calls utility statements, like
VACUUM, etc.

almost all these tasks are more or less dirty. It is a serious question if we want to integrate pg_background to core.

I don't have good benchmarks now. Some simple, like many INSERTs. Pool
gives advantage, more tps compared to pg_background with increasing
number of backends.

The main advantage over pg_background is pool of workers. In this patch
separate pool is created for each backend. At the current time I'm
coding one shared pool for all backends.

I afraid so this solution can be very significantly slower than logging to postgres log or forwarding to syslog

>
>
>   > 2. although the Oracle syntax is interesting, and I proposed
> PRAGMA
> more times, it doesn't allow this functionality in other PL
>
> 2. Add `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in `DECLARE`?
> `BEGIN
> AUTONOMOUS`.
> It shows immediately that we are in autonomous session, no need to
> search in subsequent lines for keyword.
>
> ```
> CREATE FUNCTION foo() RETURNS void AS $$
> BEGIN AUTONOMOUS
>     INSERT INTO tbl VALUES (1);
>     BEGIN AUTONOMOUS
>      ....
>      END;
> END;
> $$ LANGUAGE plpgsql;
> ```
>
>   > CREATE OR REPLACE FUNCTION ...
>   > AS $$
>   > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;
>
> The downside with the keyword in function declaration, that we
> will not
> be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or
> `BEGIN AUTONOMOUS` it's possible to create them.
>
> ```
> -- BEGIN AUTONOMOUS
>
> CREATE FUNCTION foo() RETURNS void AS $$
> BEGIN
>     INSERT INTO tbl VALUES (1);
>     BEGIN AUTONOMOUS
>      INSERT INTO tbl VALUES (2);
>     END;
> END;
> $$ LANGUAGE plpgsql;
>
>
> -- or PRAGMA AUTONOMOUS
>
> CREATE FUNCTION foo() RETURNS void AS $$
> BEGIN
>     INSERT INTO tbl VALUES (1);
>     BEGIN
>     DECLARE AUTONOMOUS_TRANSACTION;
>      INSERT INTO tbl VALUES (2);
>     END;
> END;
> $$ LANGUAGE plpgsql;
>
>
> START TRANSACTION;
> foo();
> ROLLBACK;
> ```
>
> ```
> Output:
> 2
> ```
>
>   > it doesn't allow this functionality in other PL
>
> I didn't work out on other PLs at the current time, but...
>
> ## Python
>
> In plpython we could use context managers, like was proposed in
> Peter's
> patch. ```
>
> with plpy.autonomous() as a:
>       a.execute("INSERT INTO tbl VALUES (1) ");
>
> ```
>
> ## Perl
>
> I don't programm in Perl. But googling shows Perl supports subroutine
> attributes. Maybe add `autonomous` attribute for autonomous execution?
>
> ```
> sub foo :autonomous {
> }
> ```
>
> https://www.perl.com/article/untangling-subroutine-attributes/
>
>
>   > Heikki wrote about the possibility to support threads in Postgres.
>
> 3. Do you mean this thread?
> https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
> Thanks for info. Will watch it. Unfortunately it takes many years to
> implement threads =(
>
>   > Surely, the first topic should be the method of implementation.
> Maybe
> I missed it, but there is no agreement of background worker based.
> I agree. No consensus at the current time.
> Pros of bgworkers are:
> 1. this entity is already in Postgres.
> 2. possibility of asynchronous execution of autonomous session in the
> future. Like in pg_background extension. For asynchronous
> execution we
> need a separate process, bgworkers are this separate process.
>
> Also maybe later create autonomous workers themselves without using
> bgworkers internally: launch of separate process, etc. But I think
> will
> be many common code with bgworkers.
>
>
> On 21.12.2023 12:35, Pavel Stehule wrote:
> > Hi
> >
> > although I like the idea related to autonomous transactions, I
> don't
> > think so this way is the best
> >
> > 1. The solution based on background workers looks too fragile -
> it can
> > be easy to exhaust all background workers, and because this
> feature is
> > proposed mainly for logging, then it is a little bit dangerous,
> > because it means loss of possibility of logging.
> >
> > 2. although the Oracle syntax is interesting, and I proposed PRAGMA
> > more times, it doesn't allow this functionality in other PL
> >
> > I don't propose exactly firebird syntax
> >
> https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html,
>
> > but I think this solution is better than ADA's PRAGMAs. I can
> imagine
> > some special flag for function like
> >
> > CREATE OR REPLACE FUNCTION ...
> > AS $$
> > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;
> >
> > as another possibility.
> >
> > 3. Heikki wrote about the possibility to support threads in
> Postgres.
> > One significant part of this project is elimination of global
> > variables. It can be common with autonomous transactions.
> >
> > Surely, the first topic should be the method of implementation.
> Maybe
> > I missed it, but there is no agreement of background worker based.
> >
> > Regards
> >
> > Pavel
> >
> >
> --
> Best wishes,
> Ivan Kush
> Tantor Labs LLC
>
--
Best wishes,
Ivan Kush
Tantor Labs LLC

Re: Autonomous transactions 2023, WIP

От

Ivan Kush

Дата:

01 января 2024 г., 11:15:27

On 01.01.2024 09:47, Pavel Stehule wrote:
>
>
>     All use cases of pg_background, except asynchronous execution. If
>     later
>     add asynchronous execution, then all =)
>
>     For example, also:
>
>     * conversion from Oracle's `PRAGMA AUTONOMOUS` to Postgres.
>
>     * possibility to create functions that calls utility statements, like
>     VACUUM, etc.
>
>
> almost all these tasks are more or less dirty. It is a serious 
> question if we want to integrate pg_background to core.

What do you mean by the "dirty"?

>
>     I don't have good benchmarks now. Some simple, like many INSERTs.
>     Pool
>     gives advantage, more tps compared to pg_background with increasing
>     number of backends.
>
>     The main advantage over pg_background is pool of workers. In this
>     patch
>     separate pool is created for each backend. At the current time I'm
>     coding one shared pool for all backends.
>
>
> I afraid so this solution can be very significantly slower than 
> logging to postgres log or forwarding to syslog

Maybe. Need to benchmark. Also OLAP like ClickHouse is better for 
storing logs.

But in this case (log file -> database) a company needs to write a 
custom utility to load log file to the database:

* detect file size,

* load to database

* autorotate file by timeout of filesize

* etc.

Some of our customers use "Autonomous transactions" for logging =)

-- 
Best wishes,
Ivan Kush
Tantor Labs LLC

Re: Autonomous transactions 2023, WIP

От

Pavel Stehule

Дата:

01 января 2024 г., 12:49:43

po 1. 1. 2024 v 12:15 odesílatel Ivan Kush <ivan.kush@tantorlabs.com> napsal:

On 01.01.2024 09:47, Pavel Stehule wrote:
>
>
> All use cases of pg_background, except asynchronous execution. If
> later
> add asynchronous execution, then all =)
>
> For example, also:
>
> * conversion from Oracle's `PRAGMA AUTONOMOUS` to Postgres.
>
> * possibility to create functions that calls utility statements, like
> VACUUM, etc.
>
>
> almost all these tasks are more or less dirty. It is a serious
> question if we want to integrate pg_background to core.

What do you mean by the "dirty"?

I don't think so these task should be implemented in stored procedures

>
> I don't have good benchmarks now. Some simple, like many INSERTs.
> Pool
> gives advantage, more tps compared to pg_background with increasing
> number of backends.
>
> The main advantage over pg_background is pool of workers. In this
> patch
> separate pool is created for each backend. At the current time I'm
> coding one shared pool for all backends.
>
>
> I afraid so this solution can be very significantly slower than
> logging to postgres log or forwarding to syslog

Maybe. Need to benchmark. Also OLAP like ClickHouse is better for
storing logs.

But in this case (log file -> database) a company needs to write a
custom utility to load log file to the database:

* detect file size,

* load to database

* autorotate file by timeout of filesize

* etc.

Some of our customers use "Autonomous transactions" for logging =)

I understand the motivation. But it was designed 20 years ago, and I don't see a reason why we need to implement the same "bad" patterns, mainly when the proposed implementation is not fully robust or can have performance issues.

--
Best wishes,
Ivan Kush
Tantor Labs LLC

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Autonomous transactions 2023, WIP

Вложения