Обсуждение: Statistics updates is delayed when using `commit and chain`

Поиск
Список
Период
Сортировка

Statistics updates is delayed when using `commit and chain`

От
Lætitia Avrot
Дата:
Hello,

With a customer, we found out that when using `commit and chain`, statistics on the table were not updated. Here are the steps to reproduce (My customer saw this on Postgres 13, I confirmed it under current main version):

laetitia=# laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';

 n_tup_ins 

-----------

        17

(1 row)


laetitia=# begin;

BEGIN

laetitia=*# insert into test (value) values ('bla');

INSERT 0 1

laetitia=*# commit and chain;

COMMIT

laetitia=*# select n_tup_ins from pg_stat_all_tables where relname = 'test';

 n_tup_ins 

-----------

        17

(1 row)


laetitia=*# commit;

COMMIT

laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';

 n_tup_ins 

-----------

        18

(1 row)


Before issuing the last `commit`, I used another connection to check the value of the statistics from another transaction and it was not updated:

laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';

 n_tup_ins 

-----------

        17

(1 row)


Maybe it's not a bug and it's on purpose but I can't understand what would prevent the statistics collector from being triggered after a `commti and chain`.

Have a nice day,

Lætitia

Re: Statistics updates is delayed when using `commit and chain`

От
Japin Li
Дата:
On Fri, 09 Jul 2021 at 17:05, Lætitia Avrot <laetitia.avrot@gmail.com> wrote:
> Hello,
>
> With a customer, we found out that when using `commit and chain`, statistics on the table were not updated. Here are
thesteps to reproduce (My customer saw this on Postgres 13, I confirmed it under current main version): 
>
> laetitia=# laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
>
>  n_tup_ins
>
> -----------
>
>         17
>
> (1 row)
>
> laetitia=# begin;
>
> BEGIN
>
> laetitia=*# insert into test (value) values ('bla');
>
> INSERT 0 1
>
> laetitia=*# commit and chain;
>
> COMMIT
>
> laetitia=*# select n_tup_ins from pg_stat_all_tables where relname = 'test';
>
>  n_tup_ins
>
> -----------
>
>         17
>
> (1 row)
>
> laetitia=*# commit;
>
> COMMIT
>
> laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
>
>  n_tup_ins
>
> -----------
>
>         18
>
> (1 row)
>
> Before issuing the last `commit`, I used another connection to check the value of the statistics from another
transactionand it was not updated: 
>
> laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
>
>  n_tup_ins
>
> -----------
>
>         17
>
> (1 row)
>
> Maybe it's not a bug and it's on purpose but I can't understand what would prevent the statistics collector from
beingtriggered after a `commti and chain`. 
>

After some analyze, I find the table statistics updated only when not within
a transaction. If you use COMMIT AND CHAIN, we still in a transaction, so the
statistics do not updated.

See src/backend/tcop/postgres.c:

            else if (IsTransactionOrTransactionBlock())  <-------- After call COMMIT AND CHAIN, we come here.
            {
                set_ps_display("idle in transaction");
                pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL);

                /* Start the idle-in-transaction timer */
                if (IdleInTransactionSessionTimeout > 0)
                {
                    idle_in_transaction_timeout_enabled = true;
                    enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
                                         IdleInTransactionSessionTimeout);
                }
            }
            else         <-------- After call COMMIT, we come here.
            {
                /* Send out notify signals and transmit self-notifies */
                ProcessCompletedNotifies();

                /*
                 * Also process incoming notifies, if any.  This is mostly to
                 * ensure stable behavior in tests: if any notifies were
                 * received during the just-finished transaction, they'll be
                 * seen by the client before ReadyForQuery is.
                 */
                if (notifyInterruptPending)
                    ProcessNotifyInterrupt();

                pgstat_report_stat(false);                <-------- Update statistics.

                set_ps_display("idle");
                pgstat_report_activity(STATE_IDLE, NULL);

                /* Start the idle-session timer */
                if (IdleSessionTimeout > 0)
                {
                    idle_session_timeout_enabled = true;
                    enable_timeout_after(IDLE_SESSION_TIMEOUT,
                                         IdleSessionTimeout);
                }
            }


--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Statistics updates is delayed when using `commit and chain`

От
Japin Li
Дата:
On Fri, 09 Jul 2021 at 19:02, Japin Li <japinli@hotmail.com> wrote:
> On Fri, 09 Jul 2021 at 17:05, Lætitia Avrot <laetitia.avrot@gmail.com> wrote:
>> Hello,
>>
>> With a customer, we found out that when using `commit and chain`, statistics on the table were not updated. Here are
thesteps to reproduce (My customer saw this on Postgres 13, I confirmed it under current main version): 
>>
>> laetitia=# laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
>>
>>  n_tup_ins
>>
>> -----------
>>
>>         17
>>
>> (1 row)
>>
>> laetitia=# begin;
>>
>> BEGIN
>>
>> laetitia=*# insert into test (value) values ('bla');
>>
>> INSERT 0 1
>>
>> laetitia=*# commit and chain;
>>
>> COMMIT
>>
>> laetitia=*# select n_tup_ins from pg_stat_all_tables where relname = 'test';
>>
>>  n_tup_ins
>>
>> -----------
>>
>>         17
>>
>> (1 row)
>>
>> laetitia=*# commit;
>>
>> COMMIT
>>
>> laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
>>
>>  n_tup_ins
>>
>> -----------
>>
>>         18
>>
>> (1 row)
>>
>> Before issuing the last `commit`, I used another connection to check the value of the statistics from another
transactionand it was not updated: 
>>
>> laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test';
>>
>>  n_tup_ins
>>
>> -----------
>>
>>         17
>>
>> (1 row)
>>
>> Maybe it's not a bug and it's on purpose but I can't understand what would prevent the statistics collector from
beingtriggered after a `commti and chain`. 
>>
>
> After some analyze, I find the table statistics updated only when not within
> a transaction. If you use COMMIT AND CHAIN, we still in a transaction, so the
> statistics do not updated.
>
> See src/backend/tcop/postgres.c:
>
>             else if (IsTransactionOrTransactionBlock())  <-------- After call COMMIT AND CHAIN, we come here.
>             {
>                 set_ps_display("idle in transaction");
>                 pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL);
>
>                 /* Start the idle-in-transaction timer */
>                 if (IdleInTransactionSessionTimeout > 0)
>                 {
>                     idle_in_transaction_timeout_enabled = true;
>                     enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
>                                          IdleInTransactionSessionTimeout);
>                 }
>             }
>             else         <-------- After call COMMIT, we come here.
>             {
>                 /* Send out notify signals and transmit self-notifies */
>                 ProcessCompletedNotifies();
>
>                 /*
>                  * Also process incoming notifies, if any.  This is mostly to
>                  * ensure stable behavior in tests: if any notifies were
>                  * received during the just-finished transaction, they'll be
>                  * seen by the client before ReadyForQuery is.
>                  */
>                 if (notifyInterruptPending)
>                     ProcessNotifyInterrupt();
>
>                 pgstat_report_stat(false);                <-------- Update statistics.
>
>                 set_ps_display("idle");
>                 pgstat_report_activity(STATE_IDLE, NULL);
>
>                 /* Start the idle-session timer */
>                 if (IdleSessionTimeout > 0)
>                 {
>                     idle_session_timeout_enabled = true;
>                     enable_timeout_after(IDLE_SESSION_TIMEOUT,
>                                          IdleSessionTimeout);
>                 }
>             }

Attached fixes it by call pgstat_report_stat() when we a in COMMIT AND CHAIN mode.
Any thoughts?

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.


Вложения

Re: Statistics updates is delayed when using `commit and chain`

От
Japin Li
Дата:
On Fri, 09 Jul 2021 at 20:25, John Naylor <john.naylor@enterprisedb.com> wrote:
> On Fri, Jul 9, 2021 at 7:15 AM Japin Li <japinli@hotmail.com> wrote:
>> >                 /* Send out notify signals and transmit self-notifies */
>> >                 ProcessCompletedNotifies();
>> >
>> >                 /*
>> >                  * Also process incoming notifies, if any.  This is
> mostly to
>> >                  * ensure stable behavior in tests: if any notifies were
>> >                  * received during the just-finished transaction,
> they'll be
>> >                  * seen by the client before ReadyForQuery is.
>> >                  */
>> >                 if (notifyInterruptPending)
>> >                     ProcessNotifyInterrupt();
>
> It seems the above would also be skipped in chained transactions -- do we
> need to handle notifies as well?
>

Thanks for your review! Modified.

>> Attached fixes it by call pgstat_report_stat() when we a in COMMIT AND
> CHAIN mode.
>> Any thoughts?
>
> Do we need equivalent logic within the TBLOCK_SUBCOMMIT case also? Either
> way, a comment is probably in order.

Add a new function ProcessNotifiesAndStat() to process notifies and report
statistics, then call this function in TBLOCK_SUBCOMMIT, TBLOCK_END,
TBLOCK_ABORT_END, TBLOCK_ABORT_PENDING and TBLOCK_SUBCOMMIT cases.

Please consider v2 patch to review.

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.


Вложения

Re: Statistics updates is delayed when using `commit and chain`

От
Lætitia Avrot
Дата:


Le sam. 10 juil. 2021 à 04:31, Japin Li <japinli@hotmail.com> a écrit :

On Fri, 09 Jul 2021 at 20:25, John Naylor <john.naylor@enterprisedb.com> wrote:
> On Fri, Jul 9, 2021 at 7:15 AM Japin Li <japinli@hotmail.com> wrote:
>> >                 /* Send out notify signals and transmit self-notifies */
>> >                 ProcessCompletedNotifies();
>> >
>> >                 /*
>> >                  * Also process incoming notifies, if any.  This is
> mostly to
>> >                  * ensure stable behavior in tests: if any notifies were
>> >                  * received during the just-finished transaction,
> they'll be
>> >                  * seen by the client before ReadyForQuery is.
>> >                  */
>> >                 if (notifyInterruptPending)
>> >                     ProcessNotifyInterrupt();
>
> It seems the above would also be skipped in chained transactions -- do we
> need to handle notifies as well?
>

Thanks for your review! Modified.

>> Attached fixes it by call pgstat_report_stat() when we a in COMMIT AND
> CHAIN mode.
>> Any thoughts?
>
> Do we need equivalent logic within the TBLOCK_SUBCOMMIT case also? Either
> way, a comment is probably in order.

Add a new function ProcessNotifiesAndStat() to process notifies and report
statistics, then call this function in TBLOCK_SUBCOMMIT, TBLOCK_END,
TBLOCK_ABORT_END, TBLOCK_ABORT_PENDING and TBLOCK_SUBCOMMIT cases.

Please consider v2 patch to review.

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.


Hello Japin,

Thank you for the patch. I read it and I find myself with one question: why do we update statistics even though there was a rollback? I know that that was the behaviour before, but is it still worth it?

I tested it against Postgres 14 Beta 2 and Postgres 15 (commit hash 4c64b51dc51f8ff7e3e51eebfe8d10469a577df1) and it worked like a charm for my use case:

laetitia=# select n_tup_ins from pg_stat_all_tables where relname = 'test'; 

 n_tup_ins 

-----------

         1

(1 row)


laetitia=# begin; 

BEGIN

laetitia=*#   insert into test (value) values ('bla');

INSERT 0 1

laetitia=*#   select n_tup_ins from pg_stat_all_tables where relname = 'test';

 n_tup_ins 

-----------

         1

(1 row)


laetitia=*# commit and chain; 

COMMIT

laetitia=*# select n_tup_ins from pg_stat_all_tables where relname = 'test';

 n_tup_ins 

-----------

         2

(1 row)


laetitia=*# commit; 

COMMIT

laetitia=#  select n_tup_ins from pg_stat_all_tables where relname = 'test';

 n_tup_ins 

-----------

         2

(1 row)
 Have a nice day,

Lætitia

Re: Statistics updates is delayed when using `commit and chain`

От
Japin Li
Дата:
On Mon, 12 Jul 2021 at 20:56, Lætitia Avrot <laetitia.avrot@gmail.com> wrote:

> Hello Japin,
>
> Thank you for the patch. I read it and I find myself with one question: why do we update statistics even though there
wasa rollback? I know that that was the behaviour before, but is it still worth it? 
>

I test the following case, and found it also update the statistics even if we
rollback. So I update the statistics here. On the other hands, the insert tuple
rollbacked still in data block, so IMO we should update the statistics. I'm not
sure it is right.

postgres=# create table test (id serial, value text);
CREATE TABLE
postgres=# SELECT n_tup_ins FROM pg_stat_all_tables WHERE relname = 'test';
 n_tup_ins
-----------
         0
(1 row)

postgres=# BEGIN;
BEGIN
postgres=*# INSERT INTO test (value) VALUES ('bla');
INSERT 0 1
postgres=*# SELECT n_tup_ins FROM pg_stat_all_tables WHERE relname = 'test';
 n_tup_ins
-----------
         0
(1 row)

postgres=*# ABORT ;
ROLLBACK
postgres=# SELECT n_tup_ins FROM pg_stat_all_tables WHERE relname = 'test';
 n_tup_ins
-----------
         1
(1 row)

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Statistics updates is delayed when using `commit and chain`

От
Tom Lane
Дата:
John Naylor <john.naylor@enterprisedb.com> writes:
> I understand this functionality to be tracking what happens at the physical
> level, in which case it is correct. In any case, the bug reported is clear
> and changing that behavior is the focus here. About the patch:

I do not think this is a bug at all.  The behavior is, and always has
been, that we report stats when we are about to wait for client input
and are not inside a transaction.  Likewise for NOTIFY.  The proposed
patch randomly changes that in a way that is very likely to break
clients.

Maybe you can persuade me that there's a reason to move the responsibility
for stats reporting to some other place, but please keep your hands
OFF of NOTIFY.  You clearly haven't the faintest idea what the client
contract for that is.

            regards, tom lane



Re: Statistics updates is delayed when using `commit and chain`

От
Japin Li
Дата:
On Mon, 12 Jul 2021 at 23:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> John Naylor <john.naylor@enterprisedb.com> writes:
>> I understand this functionality to be tracking what happens at the physical
>> level, in which case it is correct. In any case, the bug reported is clear
>> and changing that behavior is the focus here. About the patch:
>
> I do not think this is a bug at all.  The behavior is, and always has
> been, that we report stats when we are about to wait for client input
> and are not inside a transaction.  Likewise for NOTIFY.  The proposed
> patch randomly changes that in a way that is very likely to break
> clients.
>
Sorry, I'm not very clearly why we can not process NOTIFY when a transaction is
end.  For example:

------------------------------------------------------------------------------
Step 1. Session 1:
postgres# LISTEN tn;
LISTEN

Step 2. Session 2:
postgers# CREATE TABLE tbl (id int)
CREATE TABLE
postgres=# BEGIN;
BEGIN
postgres=*# INSERT INTO tbl values (1);
INSERT 0 1
postgres=*# NOTIFY tn, 'hello';
NOTIFY

Step 3. Session 1:
postgres=# SELECT * FROM tbl;
 id
----
(0 rows)

Step 4. Session 2:
postgres=*# COMMIT AND CHAIN;
COMMIT

Step 5. Session 1:
postgres=# SELECT * FROM tbl;
 id
----
  1
(1 row)

Step 6. Session 2:
postgres=*# COMMIT;
COMMIT

Step 7. Session 1:
postgres=# SELECT * FROM tbl;
 id
----
  1
(1 row)

Asynchronous notification "tn" with payload "hello" received from server process with PID 16348.

------------------------------------------------------------------------------

Since we commit the data in step 4, however the notify doesn't handle, I think
this is a bit confused.  Why we must wait session is in idle to handle the notify?

> Maybe you can persuade me that there's a reason to move the responsibility
> for stats reporting to some other place, but please keep your hands
> OFF of NOTIFY.  You clearly haven't the faintest idea what the client
> contract for that is.
>
>             regards, tom lane


-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Statistics updates is delayed when using `commit and chain`

От
Tom Lane
Дата:
Japin Li <japinli@hotmail.com> writes:
> On Mon, 12 Jul 2021 at 23:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I do not think this is a bug at all.  The behavior is, and always has
>> been, that we report stats when we are about to wait for client input
>> and are not inside a transaction.  Likewise for NOTIFY.  The proposed
>> patch randomly changes that in a way that is very likely to break
>> clients.

> Sorry, I'm not very clearly why we can not process NOTIFY when a transaction is
> end.

The protocol contract for that is that NOTIFY is delivered *between*
transactions.  With the proposed patch, the notification can be delivered
in the middle of the new transaction created by COMMIT AND CHAIN.  This
creates semantic issues --- in particular, if that second transaction is
later rolled back, does that mean we should re-send the notification?

The NOTIFY man page describes this explicitly:

    ... Secondly, if a listening session receives a notification signal
    while it is within a transaction, the notification event will not be
    delivered to its connected client until just after the transaction is
    completed (either committed or aborted). Again, the reasoning is that
    if a notification were delivered within a transaction that was later
    aborted, one would want the notification to be undone somehow — but
    the server cannot “take back” a notification once it has sent it to
    the client. So notification events are only delivered between
    transactions.

(protocol.sgml is a bit wishy-washy about this point, but I think we
should lock the wording there down harder.  We have never delivered
notifies intra-transaction, and now is no time to start.)

            regards, tom lane



Re: Statistics updates is delayed when using `commit and chain`

От
"David G. Johnston"
Дата:
On Tue, Jul 13, 2021 at 8:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Japin Li <japinli@hotmail.com> writes:
> On Mon, 12 Jul 2021 at 23:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I do not think this is a bug at all.  The behavior is, and always has
>> been, that we report stats when we are about to wait for client input
>> and are not inside a transaction.  Likewise for NOTIFY.  The proposed
>> patch randomly changes that in a way that is very likely to break
>> clients.

 
> Sorry, I'm not very clearly why we can not process NOTIFY when a transaction is
> end.

This statement I believe is saying: "Why can we not SEND a notification when the first transactions ends and before the second transaction starts."


The protocol contract for that is that NOTIFY is delivered *between*
transactions.

This is saying "We must not RECEIVE a notification unless the client is prepared to handle it - idle and not within a transaction."

In an example where a client is listening for its own notifications a notification sent in the first transaction would not be received by itself until after the second transaction completes (regardless of whether it committed or was rolled back).  But since the first transaction committed its notification can never be undone and so there is no harm from releasing the notification as soon as the first transaction commits (or, at least none of the documented harms).

So, the choice to delay releasing the notification upon committing the first transaction of a chain is a design choice.  Users using commit-and-chain are assumed to want all of their notifications in the chain to be sent as a single group (though we must not deduplicate notifications created in different parts of the chain).  This, however, I do not believe is documented, nor is it the only valid choice, though it is what we do today without providing the user a choice.

David J.

Re: Statistics updates is delayed when using `commit and chain`

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Tue, Jul 13, 2021 at 8:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The protocol contract for that is that NOTIFY is delivered *between*
>> transactions.

> This is saying "We must not RECEIVE a notification unless the client is
> prepared to handle it - idle and not within a transaction."

Right.

> So, the choice to delay releasing the notification upon committing the
> first transaction of a chain is a design choice.

Sure, you could argue that.  But look at it from the client's side.
How would it detangle its response to the notify event from whatever
it's doing as part of the chained transaction?  You don't want to
risk the notify-response action failing because of a problem in the
chained transaction, nor vice versa, because they are independent things.
(Whatever the notify is about *did* commit, independently of what
you are doing.)  It's very hard to see how you could do that sanely,
other than by delaying handling the notify event till after you're out
of the chain.

Another angle is that (I think) the proposed patch would allow notifies
to be delivered during a commit that happens inside a procedure.  At
that point, the client doesn't even have control of the connection
with which to do something in response to the notify.

There might be value in the aspect of the patch that sends statistics
upon commit/rollback.  But NOTIFY is a *totally* different animal.
The fact that the code happens to be physically adjacent in the current
implementation doesn't mean that the semantic constraints are similar.

            regards, tom lane



Re: Statistics updates is delayed when using `commit and chain`

От
"David G. Johnston"
Дата:
On Tue, Jul 13, 2021 at 8:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
But look at it from the client's side.
How would it detangle its response to the notify event from whatever
it's doing as part of the chained transaction?

The chain writing client wouldn't receive its own notification until after the chain completes and thus would be unable to respond to it, avoiding all of this and having the same behavior as today.  It's any client except the chain authoring one that would receive the notification upon each chain link committing instead of waiting until the end of the chain.  I'm fine with those other clients having to wait, just as the chain authoring client does, until the chain is over as a matter of consistency - but I don't see how given those other clients earlier access to the notification risks being considered here.  We are already doing that now, though I argue we are in fact documenting the "other clients receive the notification as soon as committed - regardless of chaining" situation (the additional restrictions on receiving are then what cause the chain authoring client to wait).

David J.

Re: Statistics updates is delayed when using `commit and chain`

От
"David G. Johnston"
Дата:
On Tue, Jul 13, 2021 at 9:02 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
We are already doing that now, though I argue we are in fact documenting the "other clients receive the notification as soon as committed - regardless of chaining" situation (the additional restrictions on receiving are then what cause the chain authoring client to wait).


Specifically:

Firstly, if a NOTIFY is executed inside a transaction, the notify events are not delivered until and unless the transaction is committed.

vs.

So notification events are only delivered between transactions.

I'll accept that "and-chain" means there is no "between transactions" period.  But there is no qualification to "until the transaction is committed".

David J.

Re: Statistics updates is delayed when using `commit and chain`

От
Japin Li
Дата:
On Wed, 14 Jul 2021 at 12:06, David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Tue, Jul 13, 2021 at 9:02 PM David G. Johnston <
> david.g.johnston@gmail.com> wrote:
>
>> We are already doing that now, though I argue we are in fact documenting
>> the "other clients receive the notification as soon as committed -
>> regardless of chaining" situation (the additional restrictions on receiving
>> are then what cause the chain authoring client to wait).
>>
>>
> Specifically:
>
> Firstly, if a NOTIFY is executed inside a transaction, the notify events
> are not delivered until and unless the transaction is committed.
>
> vs.
>
> So notification events are only delivered between transactions.
>
> I'll accept that "and-chain" means there is no "between transactions"
> period.  But there is no qualification to "until the transaction is
> committed".
>

There are two cases:

1. Should we update the statistics when "and-chain" ends?
2. Should we deliver the notification when "and-chain" ends?

For the first one, I think we should update the statistics when "and-chain" ends
because it makes statistics more accurate. For the second one, maybe I misunderstand
the "and-chain", tested it aggin and find the notifications will not deliver if
the last transaction is aborted.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.







Re: Statistics updates is delayed when using `commit and chain`

От
Lætitia Avrot
Дата:
Dear all,

Le mer. 14 juil. 2021 à 08:54, Japin Li <japinli@hotmail.com> a écrit :

On Wed, 14 Jul 2021 at 12:06, David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Tue, Jul 13, 2021 at 9:02 PM David G. Johnston <
> david.g.johnston@gmail.com> wrote:
>
>> We are already doing that now, though I argue we are in fact documenting
>> the "other clients receive the notification as soon as committed -
>> regardless of chaining" situation (the additional restrictions on receiving
>> are then what cause the chain authoring client to wait).
>>
>>
> Specifically:
>
> Firstly, if a NOTIFY is executed inside a transaction, the notify events
> are not delivered until and unless the transaction is committed.
>
> vs.
>
> So notification events are only delivered between transactions.
>
> I'll accept that "and-chain" means there is no "between transactions"
> period.  But there is no qualification to "until the transaction is
> committed".
>

There are two cases:

1. Should we update the statistics when "and-chain" ends?
2. Should we deliver the notification when "and-chain" ends?

For the first one, I think we should update the statistics when "and-chain" ends
because it makes statistics more accurate. For the second one, maybe I misunderstand
the "and-chain", tested it aggin and find the notifications will not deliver if
the last transaction is aborted.

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.


It seems to me that the important question is : are chained transactions normal transactions? 
If the answer is "yes", then we need to update the statistics and notify before creating a new transaction because we need to completely end the previous transaction before starting a new one with the same characteristics.
If the answer is "no", then we contradict the SQL standard which states there's nothing special with a transaction created by `and chain`:

>If <commit statement> contains AND CHAIN, then an SQL-transaction is initiated. Any branch transactions of
> the SQL-transaction are initiated with the same transaction access mode, transaction isolation level, and
> condition area limit as the corresponding branch of the SQL-transaction just termi- nated.

and

> If <rollback statement> contains AND CHAIN, then an SQL-transaction is initiated. Any branch transactions of
> the SQL-transaction are initiated with the same transaction access mode, transaction isolation level, and
> condition area limit as the corresponding branch of the SQL-transaction just terminated.

Have a nice day,

Lætitia
 

Re: Statistics updates is delayed when using `commit and chain`

От
Japin Li
Дата:
On Mon, 26 Jul 2021 at 23:53, Lætitia Avrot <laetitia.avrot@gmail.com> wrote:
> Dear all,
>
> Le mer. 14 juil. 2021 à 08:54, Japin Li <japinli@hotmail.com> a écrit :
>
>>
>> On Wed, 14 Jul 2021 at 12:06, David G. Johnston <
>> david.g.johnston@gmail.com> wrote:
>> > On Tue, Jul 13, 2021 at 9:02 PM David G. Johnston <
>> > david.g.johnston@gmail.com> wrote:
>> >
>> >> We are already doing that now, though I argue we are in fact documenting
>> >> the "other clients receive the notification as soon as committed -
>> >> regardless of chaining" situation (the additional restrictions on
>> receiving
>> >> are then what cause the chain authoring client to wait).
>> >>
>> >>
>> > Specifically:
>> >
>> > Firstly, if a NOTIFY is executed inside a transaction, the notify events
>> > are not delivered until and unless the transaction is committed.
>> >
>> > vs.
>> >
>> > So notification events are only delivered between transactions.
>> >
>> > I'll accept that "and-chain" means there is no "between transactions"
>> > period.  But there is no qualification to "until the transaction is
>> > committed".
>> >
>>
>> There are two cases:
>>
>> 1. Should we update the statistics when "and-chain" ends?
>> 2. Should we deliver the notification when "and-chain" ends?
>>
>> For the first one, I think we should update the statistics when
>> "and-chain" ends
>> because it makes statistics more accurate. For the second one, maybe I
>> misunderstand
>> the "and-chain", tested it aggin and find the notifications will not
>> deliver if
>> the last transaction is aborted.
>>
>> --
>> Regrads,
>> Japin Li.
>> ChengDu WenWu Information Technology Co.,Ltd.
>>
>>
> It seems to me that the important question is : are chained transactions
> normal transactions?
> If the answer is "yes", then we need to update the statistics and notify
> before creating a new transaction because we need to completely end the
> previous transaction before starting a new one with the same
> characteristics.
> If the answer is "no", then we contradict the SQL standard which states
> there's nothing special with a transaction created by `and chain`:
>

I didn't find a SQL standard about `and chain`. IMO, whether the answer is
"yes" or "no", we should update the statistics. For notify, however, I'm not
sure.

>>If <commit statement> contains AND CHAIN, then an SQL-transaction is
> initiated. Any branch transactions of
>> the SQL-transaction are initiated with the same transaction access mode,
> transaction isolation level, and
>> condition area limit as the corresponding branch of the SQL-transaction
> just termi- nated.
>
> and
>
>> If <rollback statement> contains AND CHAIN, then an SQL-transaction is
> initiated. Any branch transactions of
>> the SQL-transaction are initiated with the same transaction access mode,
> transaction isolation level, and
>> condition area limit as the corresponding branch of the SQL-transaction
> just terminated.
>
> Have a nice day,
>
> Lætitia


--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Statistics updates is delayed when using `commit and chain`

От
Laetitia Avrot
Дата:
Hello,

Le mar. 27 juil. 2021 à 04:28, Japin Li <japinli@hotmail.com> a écrit :

I didn't find a SQL standard about `and chain`. IMO, whether the answer is
"yes" or "no", we should update the statistics. For notify, however, I'm not
sure.


Please find enclosed a new patch.
This patch includes the update of the statistics in case of commit and chain.

This patch does not notify in case of commit and chain. Notify will only be triggered should a proper commit be called. I updated the documentation to explain that behavior.

The patch has been compiled and tested successfully on the latest master branch.

Have a great day,

Lætitia
Вложения

Re: Statistics updates is delayed when using `commit and chain`

От
Tom Lane
Дата:
Laetitia Avrot <laetitia.avrot@gmail.com> writes:
> [ v3-0001-update-table-stats-after-commit-and-chain.patch ]

I took a quick look at this.  If we're going to move the responsibility
for pushing out stats to transaction commit, then I think what we want
to do is put the call somewhere near the end of CommitTransaction.
Maybe right after AtCommit_Notify, because that has the same spirit
of sending out info about our having committed to other backends.
(The pgstat_report_xact_timestamp call would need to be moved up to
before that, I think, but that sure looks like it was inserted with
the aid of a dartboard anyway.  It certainly doesn't square with the
immediately preceding comment about "purely internal" cleanup.)
AbortTransaction and probably PrepareTransaction the same.

Having done that, rather than adding more pgstat_report_stat calls
we could get rid of most of them.  I see a bunch of random calls right
after various CommitTransactionCommand calls, which we'd not need anymore.
We probably still want the one in PostgresMain, but it needs some
rethinking, because we only want that one to do something when the timer
fires saying that we've slept too long without sending stats.

As for the NOTIFY business, I'm not impressed with putting duplicate
boilerplate into half a dozen different places.  I think the discussion
in ref/notify.sgml covers this already, or if not, that's the one place
to fix it.  The place I actually thought could use more attention is the
wire protocol spec in protocol.sgml.  In any case, I'd rather see that
as a separate patch, because it's unrelated.

            regards, tom lane



Re: Statistics updates is delayed when using `commit and chain`

От
Laetitia Avrot
Дата:
Hello, 

Thank you for reviewing this patch.  I won't have time to take into account your comments before the end of this commit great but will try my best to make it for the next one.

I'll change the patch status to "moved to next CF".

Have a nice day, 

Lætitia 

Le jeu. 28 juil. 2022, 23:26, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Laetitia Avrot <laetitia.avrot@gmail.com> writes:
> [ v3-0001-update-table-stats-after-commit-and-chain.patch ]

I took a quick look at this.  If we're going to move the responsibility
for pushing out stats to transaction commit, then I think what we want
to do is put the call somewhere near the end of CommitTransaction.
Maybe right after AtCommit_Notify, because that has the same spirit
of sending out info about our having committed to other backends.
(The pgstat_report_xact_timestamp call would need to be moved up to
before that, I think, but that sure looks like it was inserted with
the aid of a dartboard anyway.  It certainly doesn't square with the
immediately preceding comment about "purely internal" cleanup.)
AbortTransaction and probably PrepareTransaction the same.

Having done that, rather than adding more pgstat_report_stat calls
we could get rid of most of them.  I see a bunch of random calls right
after various CommitTransactionCommand calls, which we'd not need anymore.
We probably still want the one in PostgresMain, but it needs some
rethinking, because we only want that one to do something when the timer
fires saying that we've slept too long without sending stats.

As for the NOTIFY business, I'm not impressed with putting duplicate
boilerplate into half a dozen different places.  I think the discussion
in ref/notify.sgml covers this already, or if not, that's the one place
to fix it.  The place I actually thought could use more attention is the
wire protocol spec in protocol.sgml.  In any case, I'd rather see that
as a separate patch, because it's unrelated.

                        regards, tom lane

Re: Statistics updates is delayed when using `commit and chain`

От
Andres Freund
Дата:
Hi,

On 2022-05-14 17:11:33 +0200, Laetitia Avrot wrote:
> This patch includes the update of the statistics in case of commit and
> chain.

Hm.

> @@ -3006,6 +3006,12 @@ CommitTransactionCommand(void)
>              s->blockState = TBLOCK_DEFAULT;
>              if (s->chain)
>              {
> +                /*
> +                 * Before starting the new transaction, we'll update the
> +                 * statistics so that autovacuum can be triggered without
> +                 * waiting for a `commit` or `rollback` without `and chain`.
> +                 */
> +                pgstat_report_stat(false);
>                  StartTransaction();
>                  s->blockState = TBLOCK_INPROGRESS;
>                  s->chain = false;
> @@ -3032,6 +3038,12 @@ CommitTransactionCommand(void)
>              s->blockState = TBLOCK_DEFAULT;
>              if (s->chain)
>              {
> +                /*
> +                 * Before starting the new transaction, we'll update the
> +                 * statistics so that autovacuum can be triggered without
> +                 * waiting for a `commit` or `rollback` without `and chain`.
> +                 */
> +                pgstat_report_stat(false);
>                  StartTransaction();
>                  s->blockState = TBLOCK_INPROGRESS;
>                  s->chain = false;

> @@ -3050,6 +3062,12 @@ CommitTransactionCommand(void)
>              s->blockState = TBLOCK_DEFAULT;
>              if (s->chain)
>              {
> +                /*
> +                 * Before starting the new transaction, we'll update the
> +                 * statistics so that autovacuum can be triggered without
> +                 * waiting for a `commit` or `rollback` without `and chain`.
> +                 */
> +                pgstat_report_stat(false);
>                  StartTransaction();
>                  s->blockState = TBLOCK_INPROGRESS;
>                  s->chain = false;
> @@ -3117,6 +3135,12 @@ CommitTransactionCommand(void)
>                  s->blockState = TBLOCK_DEFAULT;
>                  if (s->chain)
>                  {
> +                    /*
> +                     * Before starting the new transaction, we'll update the
> +                     * statistics so that autovacuum can be triggered without
> +                     * waiting for a `commit` or `rollback` without `and chain`.
> +                     */
> +                    pgstat_report_stat(false);
>                      StartTransaction();
>                      s->blockState = TBLOCK_INPROGRESS;
>                      s->chain = false;

I don't like copying the same comment etc into multiple places. Given the
number of identical blocks for if (s->chain) blocks, perhaps we should just
move it into a new helper?

I'm a bit worried about triggering pgstat_report_stat() in new places where it
previously couldn't be called. There's a few places in the relation stats code
(predating the shared memory stats patch) that don't cope well with being
called in the wrong state (see e.g. 0cf16cb8ca4), and I'm not sure how careful
the procedures patch was with ensuring that there aren't any cases of
relations being kept open across transactions or something like that.

Greetings,

Andres Freund



Re: Statistics updates is delayed when using `commit and chain`

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> I don't like copying the same comment etc into multiple places. Given the
> number of identical blocks for if (s->chain) blocks, perhaps we should just
> move it into a new helper?

What I suggested a few days ago is that it should be in CommitTransaction.

            regards, tom lane



Re: Statistics updates is delayed when using `commit and chain`

От
Andres Freund
Дата:
Hi,

On 2022-08-01 12:25:58 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I don't like copying the same comment etc into multiple places. Given the
> > number of identical blocks for if (s->chain) blocks, perhaps we should just
> > move it into a new helper?
> 
> What I suggested a few days ago is that it should be in CommitTransaction.

I wonder how we'd best deal with the idle timer if we go for that. That only
makes sense in some contexts (normal backends), but not others (everything
else).

Greetings,

Andres Freund



Re: Statistics updates is delayed when using `commit and chain`

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-08-01 12:25:58 -0400, Tom Lane wrote:
>> What I suggested a few days ago is that it should be in CommitTransaction.

> I wonder how we'd best deal with the idle timer if we go for that. That only
> makes sense in some contexts (normal backends), but not others (everything
> else).

Yeah.  I think we'd need to get rid of the "bool force" argument
of pgstat_report_stat, and instead have it manage things internally
based on understanding whether the current process uses a reporting
timeout timer or not (if not, always send the report right away).
That seems like it'd be more robust than the current mechanism anyway,
as we're currently relying on every call site to get that right.

            regards, tom lane



Re: Statistics updates is delayed when using `commit and chain`

От
Andres Freund
Дата:
Hi,

On 2022-08-01 12:43:23 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-08-01 12:25:58 -0400, Tom Lane wrote:
> >> What I suggested a few days ago is that it should be in CommitTransaction.
> 
> > I wonder how we'd best deal with the idle timer if we go for that. That only
> > makes sense in some contexts (normal backends), but not others (everything
> > else).
> 
> Yeah.  I think we'd need to get rid of the "bool force" argument
> of pgstat_report_stat, and instead have it manage things internally
> based on understanding whether the current process uses a reporting
> timeout timer or not (if not, always send the report right away).
> That seems like it'd be more robust than the current mechanism anyway,
> as we're currently relying on every call site to get that right.

It's not as simple as looking at the backend type, I think. We'd not want to
enable a timer in the commit-and-chain context, even in a normal backend -
there's no chance we'll go idle.

I wonder if it was the wrong call to use timers for IdleSessionTimeout,
IdleInTransactionSessionTimeout and pgstats. We always use nonblocking socket
IO these days, so perhaps we should instead just compute a relevant timeout
for the WaitEventSetWait() call?

Greetings,

Andres Freund



Re: Statistics updates is delayed when using `commit and chain`

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-08-01 12:43:23 -0400, Tom Lane wrote:
>> Yeah.  I think we'd need to get rid of the "bool force" argument
>> of pgstat_report_stat, and instead have it manage things internally
>> based on understanding whether the current process uses a reporting
>> timeout timer or not (if not, always send the report right away).

> It's not as simple as looking at the backend type, I think. We'd not want to
> enable a timer in the commit-and-chain context, even in a normal backend -
> there's no chance we'll go idle.

No, but that's not the point.  During CommitTransaction we should check
to see if it's more than X amount of time since our last stats report,
and if so send a new one.  No timer interaction involved there.  When
we go idle, if not inside a transaction and a report needs to be sent,
then compute the wait time till it should be sent and set a timer for
that.

> I wonder if it was the wrong call to use timers for IdleSessionTimeout,
> IdleInTransactionSessionTimeout and pgstats. We always use nonblocking socket
> IO these days, so perhaps we should instead just compute a relevant timeout
> for the WaitEventSetWait() call?

Meh.  I think that that would end up with duplicative logic and
duplicative gettimeofday calls, unless your idea is to get rid of
the timeout.c facilities altogether.

            regards, tom lane



Re: Statistics updates is delayed when using `commit and chain`

От
Andres Freund
Дата:
Hi,

On 2022-08-01 13:22:35 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-08-01 12:43:23 -0400, Tom Lane wrote:
> >> Yeah.  I think we'd need to get rid of the "bool force" argument
> >> of pgstat_report_stat, and instead have it manage things internally
> >> based on understanding whether the current process uses a reporting
> >> timeout timer or not (if not, always send the report right away).
> 
> > It's not as simple as looking at the backend type, I think. We'd not want to
> > enable a timer in the commit-and-chain context, even in a normal backend -
> > there's no chance we'll go idle.
> 
> No, but that's not the point.  During CommitTransaction we should check
> to see if it's more than X amount of time since our last stats report,
> and if so send a new one.  No timer interaction involved there.  When
> we go idle, if not inside a transaction and a report needs to be sent,
> then compute the wait time till it should be sent and set a timer for
> that.

I don't quite see how that'd help us to get rid of the force argument -
somehow pgstat_report_stat() needs to know that it it should "block" reporting
stats. Which we e.g. want when the idle timer has elapsed.


> > I wonder if it was the wrong call to use timers for IdleSessionTimeout,
> > IdleInTransactionSessionTimeout and pgstats. We always use nonblocking socket
> > IO these days, so perhaps we should instead just compute a relevant timeout
> > for the WaitEventSetWait() call?
> 
> Meh.  I think that that would end up with duplicative logic and
> duplicative gettimeofday calls, unless your idea is to get rid of
> the timeout.c facilities altogether.

I suspect it'd reduce the number of gettimeofday calls, but yes, the
duplication wouldn't be nice.

I guess there's a chance we could get rid of most of the timeout.c stuff, but
it'd need changes in a fair number of places... Don't think it's worth doing
atm.

Greetings,

Andres Freund