Обсуждение: Sequence's value can be rollback after a crashed recovery.

Поиск
Список
Период
Сортировка

Sequence's value can be rollback after a crashed recovery.

От
Andy Fan
Дата:
Hi:

Should we guarantee the sequence's nextval should never be rolled back
even in a crashed recovery case?
I can produce the rollback in the following case:

Session 1:
CREATE SEQUENCE s;
BEGIN;
SELECT nextval('s'); \watch 0.01

Session 2:
kill -9 {sess1.pid}

After the restart, the  nextval('s') may be rolled back (less than the
last value  from session 1).

The reason is because we never flush the xlog for the nextval_internal
for the above case.  So if
the system crashes, there is nothing to redo from. It can be fixed
with the following online change
code.

@@ -810,6 +810,8 @@ nextval_internal(Oid relid, bool check_permissions)
                recptr = XLogInsert(RM_SEQ_ID, XLOG_SEQ_LOG);

                PageSetLSN(page, recptr);
+
+               XLogFlush(recptr);
        }


If a user uses sequence value for some external systems,  the
rollbacked value may surprise them.
[I didn't run into this issue in any real case,  I just studied xlog /
sequence stuff today and found this case].

-- 
Best Regards
Andy Fan



Re: Sequence's value can be rollback after a crashed recovery.

От
"David G. Johnston"
Дата:
On Sunday, November 21, 2021, Andy Fan <zhihui.fan1213@gmail.com> wrote:
Should we guarantee the sequence's nextval should never be rolled back
even in a crashed recovery case?
I can produce the rollback in the following case:

This seems to be the same observation that was made a little over a year ago.


I don’t think the suggested documentation ever got written but haven’t looked for it either.

David J.

Re: Sequence's value can be rollback after a crashed recovery.

От
Laurenz Albe
Дата:
On Mon, 2021-11-22 at 14:57 +0800, Andy Fan wrote:
> Should we guarantee the sequence's nextval should never be rolled back
> even in a crashed recovery case?
> I can produce the rollback in the following case:
> 
> Session 1:
> CREATE SEQUENCE s;
> BEGIN;
> SELECT nextval('s'); \watch 0.01
> 
> Session 2:
> kill -9 {sess1.pid}
> 
> After the restart, the  nextval('s') may be rolled back (less than the
> last value  from session 1).
> 
> The reason is because we never flush the xlog for the nextval_internal
> for the above case.  So if
> the system crashes, there is nothing to redo from. It can be fixed
> with the following online change
> code.
> 
> @@ -810,6 +810,8 @@ nextval_internal(Oid relid, bool check_permissions)
>                 recptr = XLogInsert(RM_SEQ_ID, XLOG_SEQ_LOG);
> 
>                 PageSetLSN(page, recptr);
> +
> +               XLogFlush(recptr);
>         }
> 
> 
> If a user uses sequence value for some external systems,  the
> rollbacked value may surprise them.
> [I didn't run into this issue in any real case,  I just studied xlog /
> sequence stuff today and found this case].

I think that is a bad idea.
It will have an intolerable performance impact on OLTP queries, doubling
the number of I/O requests for many cases.

Perhaps it would make sense to document that you should never rely on
sequence values from an uncommitted transaction.

Yours,
Laurenz Albe




Re: Sequence's value can be rollback after a crashed recovery.

От
Andy Fan
Дата:
> > The reason is because we never flush the xlog for the nextval_internal
> > for the above case.  So if
> > the system crashes, there is nothing to redo from. It can be fixed
> > with the following online change
> > code.
> >
> > @@ -810,6 +810,8 @@ nextval_internal(Oid relid, bool check_permissions)
> >                 recptr = XLogInsert(RM_SEQ_ID, XLOG_SEQ_LOG);
> >
> >                 PageSetLSN(page, recptr);
> > +
> > +               XLogFlush(recptr);
> >         }
> >
> >
> > If a user uses sequence value for some external systems,  the
> > rollbacked value may surprise them.
> > [I didn't run into this issue in any real case,  I just studied xlog /
> > sequence stuff today and found this case].
>
> I think that is a bad idea.
> It will have an intolerable performance impact on OLTP queries, doubling
> the number of I/O requests for many cases.
>

The performance argument was expected before this writing. If we look at the
nextval_interval more carefully, we can find it would not flush the xlog every
time even the sequence's cachesize is 1. Currently It happens every 32 times
on the nextval_internal at the worst case.

> Perhaps it would make sense to document that you should never rely on
> sequence values from an uncommitted transaction.

I am OK with this if more people think this is the solution.


-- 
Best Regards
Andy Fan



Re: Sequence's value can be rollback after a crashed recovery.

От
Laurenz Albe
Дата:
On Mon, 2021-11-22 at 15:43 +0800, Andy Fan wrote:
> > > The reason is because we never flush the xlog for the nextval_internal
> > > for the above case.  So if
> > > the system crashes, there is nothing to redo from. It can be fixed
> > > with the following online change
> > > code.
> > > 
> > > @@ -810,6 +810,8 @@ nextval_internal(Oid relid, bool check_permissions)
> > >                  recptr = XLogInsert(RM_SEQ_ID, XLOG_SEQ_LOG);
> > > 
> > >                  PageSetLSN(page, recptr);
> > > +
> > > +               XLogFlush(recptr);
> > >          }
> > > 
> > > 
> > > If a user uses sequence value for some external systems,  the
> > > rollbacked value may surprise them.
> > > [I didn't run into this issue in any real case,  I just studied xlog /
> > > sequence stuff today and found this case].
> > 
> > I think that is a bad idea.
> > It will have an intolerable performance impact on OLTP queries, doubling
> > the number of I/O requests for many cases.
> 
> The performance argument was expected before this writing. If we look at the
> nextval_interval more carefully, we can find it would not flush the xlog every
> time even the sequence's cachesize is 1. Currently It happens every 32 times
> on the nextval_internal at the worst case.

Right, I didn't think of that.  Still, I'm -1 on this performance regression.

Yours,
Laurenz Albe




Re: Sequence's value can be rollback after a crashed recovery.

От
"Bossart, Nathan"
Дата:
On 11/22/21, 5:10 AM, "Laurenz Albe" <laurenz.albe@cybertec.at> wrote:
> On Mon, 2021-11-22 at 15:43 +0800, Andy Fan wrote:
>> The performance argument was expected before this writing. If we look at the
>> nextval_interval more carefully, we can find it would not flush the xlog every
>> time even the sequence's cachesize is 1. Currently It happens every 32 times
>> on the nextval_internal at the worst case.
>
> Right, I didn't think of that.  Still, I'm -1 on this performance regression.

I periodically hear rumblings about this behavior as well.  At the
very least, it certainly ought to be documented if it isn't yet.  I
wouldn't mind trying my hand at that.  Perhaps we could also add a new
configuration parameter if users really want to take the performance
hit.

Nathan


Re: Sequence's value can be rollback after a crashed recovery.

От
Tom Lane
Дата:
"Bossart, Nathan" <bossartn@amazon.com> writes:
> I periodically hear rumblings about this behavior as well.  At the
> very least, it certainly ought to be documented if it isn't yet.  I
> wouldn't mind trying my hand at that.  Perhaps we could also add a new
> configuration parameter if users really want to take the performance
> hit.

A sequence's cache length is already configurable, no?

            regards, tom lane



Re: Sequence's value can be rollback after a crashed recovery.

От
Jeremy Schneider
Дата:
On 11/22/21 12:31, Tom Lane wrote:
> "Bossart, Nathan" <bossartn@amazon.com> writes:
>> I periodically hear rumblings about this behavior as well.  At the
>> very least, it certainly ought to be documented if it isn't yet.  I
>> wouldn't mind trying my hand at that.  Perhaps we could also add a new
>> configuration parameter if users really want to take the performance
>> hit.
> 
> A sequence's cache length is already configurable, no?
> 

Cache length isn't related to the problem here.

The problem is that PostgreSQL sequences are entirely unsafe to use from
a durability perspective, unless there's DML in the same transaction.

Users might normally think that "commit" makes things durable.
Unfortunately, IIUC, that's not true for sequences in PostgreSQL.

-Jeremy


PS. my bad on the documentation thing... I just noticed that I said a
year ago I'd take a swing at a doc update, and I never did that!!
Between Nate and I we'll get something proposed.


-- 
http://about.me/jeremy_schneider



Re: Sequence's value can be rollback after a crashed recovery.

От
Andy Fan
Дата:


On Tue, Nov 23, 2021 at 4:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Bossart, Nathan" <bossartn@amazon.com> writes:
> I periodically hear rumblings about this behavior as well.  At the
> very least, it certainly ought to be documented if it isn't yet.  I
> wouldn't mind trying my hand at that.  Perhaps we could also add a new
> configuration parameter if users really want to take the performance
> hit.

A sequence's cache length is already configurable, no?


We can hit this issue even cache=1. And even if we added the XLogFlush,
with _cachesize=1_,  the Xlog is still recorded/flushed every 32 values. 

I know your opinion about this at [1], IIUC you probably miss the SEQ_LOG_VALS
design, it was designed for the  performance reason to avoid frequent xlog updates already. 
But after that,  the XLogSync is still not called which caused this issue. 


--
Best Regards
Andy Fan

Re: Sequence's value can be rollback after a crashed recovery.

От
Tomas Vondra
Дата:

On 11/22/21 9:42 PM, Jeremy Schneider wrote:
> On 11/22/21 12:31, Tom Lane wrote:
>> "Bossart, Nathan" <bossartn@amazon.com> writes:
>>> I periodically hear rumblings about this behavior as well.  At the
>>> very least, it certainly ought to be documented if it isn't yet.  I
>>> wouldn't mind trying my hand at that.  Perhaps we could also add a new
>>> configuration parameter if users really want to take the performance
>>> hit.
>>
>> A sequence's cache length is already configurable, no?
>>
> 
> Cache length isn't related to the problem here.
> 
> The problem is that PostgreSQL sequences are entirely unsafe to use from
> a durability perspective, unless there's DML in the same transaction.
> 
> Users might normally think that "commit" makes things durable.
> Unfortunately, IIUC, that's not true for sequences in PostgreSQL.
> 

That's not what the example in this thread demonstrates, though. There's
no COMMIT in that example, so it shows that we may discard the nextval()
in uncommitted transactions. I fail to see how that's less durable than
any other DML (e.g. we don't complain about INSERT not being durable if
you don't commit the change).

If you can show that the sequence goes back after a commit, that'd be an
actual durability issue.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Sequence's value can be rollback after a crashed recovery.

От
Dilip Kumar
Дата:
On Tue, Nov 23, 2021 at 9:30 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

> On 11/22/21 9:42 PM, Jeremy Schneider wrote:
> > On 11/22/21 12:31, Tom Lane wrote:
> >> "Bossart, Nathan" <bossartn@amazon.com> writes:
> >>> I periodically hear rumblings about this behavior as well.  At the
> >>> very least, it certainly ought to be documented if it isn't yet.  I
> >>> wouldn't mind trying my hand at that.  Perhaps we could also add a new
> >>> configuration parameter if users really want to take the performance
> >>> hit.
> >>
> >> A sequence's cache length is already configurable, no?
> >>
> >
> > Cache length isn't related to the problem here.
> >
> > The problem is that PostgreSQL sequences are entirely unsafe to use from
> > a durability perspective, unless there's DML in the same transaction.
> >
> > Users might normally think that "commit" makes things durable.
> > Unfortunately, IIUC, that's not true for sequences in PostgreSQL.
> >
>
> That's not what the example in this thread demonstrates, though. There's
> no COMMIT in that example, so it shows that we may discard the nextval()
> in uncommitted transactions. I fail to see how that's less durable than
> any other DML (e.g. we don't complain about INSERT not being durable if
> you don't commit the change).
>
> If you can show that the sequence goes back after a commit, that'd be an
> actual durability issue.

I think at this thread[1], which claimed to get this issue even after
commit, I haven't tried it myself though.

[1]
https://www.postgresql.org/message-id/flat/ea6485e3-98d0-24a7-094c-87f9d5f9b18f%40amazon.com#4cfe7217c829419b769339465e8c2915


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Sequence's value can be rollback after a crashed recovery.

От
Tomas Vondra
Дата:
On 11/23/21 5:22 AM, Dilip Kumar wrote:
> On Tue, Nov 23, 2021 at 9:30 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> 
>> On 11/22/21 9:42 PM, Jeremy Schneider wrote:
>>> On 11/22/21 12:31, Tom Lane wrote:
>>>> "Bossart, Nathan" <bossartn@amazon.com> writes:
>>>>> I periodically hear rumblings about this behavior as well.  At the
>>>>> very least, it certainly ought to be documented if it isn't yet.  I
>>>>> wouldn't mind trying my hand at that.  Perhaps we could also add a new
>>>>> configuration parameter if users really want to take the performance
>>>>> hit.
>>>>
>>>> A sequence's cache length is already configurable, no?
>>>>
>>>
>>> Cache length isn't related to the problem here.
>>>
>>> The problem is that PostgreSQL sequences are entirely unsafe to use from
>>> a durability perspective, unless there's DML in the same transaction.
>>>
>>> Users might normally think that "commit" makes things durable.
>>> Unfortunately, IIUC, that's not true for sequences in PostgreSQL.
>>>
>>
>> That's not what the example in this thread demonstrates, though. There's
>> no COMMIT in that example, so it shows that we may discard the nextval()
>> in uncommitted transactions. I fail to see how that's less durable than
>> any other DML (e.g. we don't complain about INSERT not being durable if
>> you don't commit the change).
>>
>> If you can show that the sequence goes back after a commit, that'd be an
>> actual durability issue.
> 
> I think at this thread[1], which claimed to get this issue even after
> commit, I haven't tried it myself though.
> 
> [1]
https://www.postgresql.org/message-id/flat/ea6485e3-98d0-24a7-094c-87f9d5f9b18f%40amazon.com#4cfe7217c829419b769339465e8c2915
> 

I did try, and I haven't been able to reproduce that behavior (on
master, at least).

I see Tom speculated we may not flush WAL if a transaction only does
nextval() in that other thread, but I don't think that's true. AFAICS if
the nextval() call writes stuff to WAL, the RecordTransactionCommit will
have wrote_xlog=true and valid XID. And so it'll do the usual usual
XLogFlush() etc.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Sequence's value can be rollback after a crashed recovery.

От
Andy Fan
Дата:

> I think at this thread[1], which claimed to get this issue even after
> commit, I haven't tried it myself though.
>
> [1] https://www.postgresql.org/message-id/flat/ea6485e3-98d0-24a7-094c-87f9d5f9b18f%40amazon.com#4cfe7217c829419b769339465e8c2915
>

I did try, and I haven't been able to reproduce that behavior (on
master, at least).


I agree with this,  the commit would flush the xlog and persist the change. 
 
I see Tom speculated we may not flush WAL if a transaction only does
nextval() in that other thread, but I don't think that's true. AFAICS if
the nextval() call writes stuff to WAL, the RecordTransactionCommit will
have wrote_xlog=true and valid XID. And so it'll do the usual usual
XLogFlush() etc.


I agree with this as well.  or else, how can we replicate it to standby if
user only runs the SELECT nextval('s') in a transaction. 

>  I fail to see how that's less durable than any other DML (e.g. we don't 
> complain about INSERT not being durable if you don't commit the change).
> If you can show that the sequence goes back after a commit, that'd be an
actual durability issue.

This can't be called a tranaction's durability issue, but people usually think 
the value of sequence will not rollback.  so it may surprise people if that happens. 

--
Best Regards
Andy Fan 

Re: Sequence's value can be rollback after a crashed recovery.

От
Jeremy Schneider
Дата:
On 11/23/21 05:49, Andy Fan wrote:
> 
>     > I think at this thread[1], which claimed to get this issue even after
>     > commit, I haven't tried it myself though.
>     >
>     > [1]
>
https://www.postgresql.org/message-id/flat/ea6485e3-98d0-24a7-094c-87f9d5f9b18f%40amazon.com#4cfe7217c829419b769339465e8c2915
>
<https://www.postgresql.org/message-id/flat/ea6485e3-98d0-24a7-094c-87f9d5f9b18f%40amazon.com#4cfe7217c829419b769339465e8c2915>
>     >
> 
>     I did try, and I haven't been able to reproduce that behavior (on
>     master, at least).
> 
> 
> I agree with this,  the commit would flush the xlog and persist the change. 

On that older thread, there were exact reproductions in the first email
from Vini - two of them - available here:

https://gist.github.com/vinnix/2fe148e3c42e11269bac5fcc5c78a8d1

Nathan help me realize a mistake I've made here.

The second reproduction involved having psql run nextval() inside of an
explicit transaction. I had assumed that the transaction would be
committed when psql closed the session without error. This is because in
Oracle SQLPlus (my original RDBMS background), the "exitcommit" setting
has a default value giving this behavior.

This was a silly mistake on my part. When PostgreSQL psql closes the
connection with an open transaction, it turns out that the PostgreSQL
server will abort the transaction rather than committing it. (Oracle
DBAs be-aware!)

Nonetheless, Vini's first reproduction did not make this same mistake.
It involved 10 psql sessions in parallel using implicit transactions,
suspending I/O (via the linux device mapper), and killing PG while the
I/O is suspended.

Given my mistake on the second repro, I want to look a little closer at
this first reproduction and revisit whether it's actually demonstrating
a corner case where one could claim that durability isn't being handled
correctly - that "COMMIT" is returning successfully to the application,
and yet the sequence numbers are being repeated. Maybe there's something
involving the linux I/O path coming into play here.

-Jeremy


-- 
http://about.me/jeremy_schneider



Re: Sequence's value can be rollback after a crashed recovery.

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> I see Tom speculated we may not flush WAL if a transaction only does
> nextval() in that other thread, but I don't think that's true. AFAICS if
> the nextval() call writes stuff to WAL, the RecordTransactionCommit will
> have wrote_xlog=true and valid XID. And so it'll do the usual usual
> XLogFlush() etc.

Yeah.  I didn't look at the code during last year's discussion, but now
I have, and I see that if nextval_internal() decides it needs to make
a WAL entry, it is careful to ensure the xact has an XID:

    /*
     * If something needs to be WAL logged, acquire an xid, so this
     * transaction's commit will trigger a WAL flush and wait for syncrep.
     * It's sufficient to ensure the toplevel transaction has an xid, no need
     * to assign xids subxacts, that'll already trigger an appropriate wait.
     * (Have to do that here, so we're outside the critical section)
     */
    if (logit && RelationNeedsWAL(seqrel))
        GetTopTransactionId();

So that eliminates my worry that RecordTransactionCommit would decide
it doesn't need to do anything.  If there was a WAL record, we will
flush it at commit (and not before).

As you say, this is exactly as durable as any other DML operation.
So I don't feel a need to change the code behavior.

The problematic situation seems to be where an application gets
a nextval() result and uses it for some persistent outside-the-DB
state, without having made sure that the nextval() was committed.
You could say that that's the same rookie error as relying on the
persistence of any other uncommitted DML ... except that at [1]
we say

    To avoid blocking concurrent transactions that obtain numbers from the
    same sequence, a nextval operation is never rolled back; that is, once
    a value has been fetched it is considered used and will not be
    returned again. This is true even if the surrounding transaction later
    aborts, or if the calling query ends up not using the value.

It's not so unreasonable to read that as promising persistence over
crashes as well as xact aborts.  So I think we need to improve the docs
here.  A minimal fix would be to leave the existing text alone and add a
separate para to the <caution> block, along the lines of

    However, the above statements do not apply if the database cluster
    crashes before committing the transaction containing the nextval
    operation.  In that case the sequence advance might not have made its
    way to persistent storage, so that it is uncertain whether the same
    value can be returned again after the cluster restarts.  If you wish
    to use a nextval result for persistent outside-the-database purposes,
    make sure that the nextval has been committed before doing so.

I wonder though if we shouldn't try to improve the existing text.
The phrasing "never rolled back" seems like it's too easily
misinterpreted.  Maybe rewrite the <caution> block like

    To avoid blocking concurrent transactions that obtain numbers from the
    same sequence, the value obtained by nextval is not reclaimed for
    re-use if the calling transaction later aborts.  This means that
    transaction aborts or database crashes can result in gaps in the
    sequence of assigned values.  That can happen without a transaction
    abort, too.
    -- this text is unchanged: --
    For example an INSERT with an ON CONFLICT clause will compute the
    to-be-inserted tuple, including doing any required nextval calls,
    before detecting any conflict that would cause it to follow the ON
    CONFLICT rule instead. Such cases will leave unused “holes” in the
    sequence of assigned values. Thus, PostgreSQL sequence objects cannot
    be used to obtain “gapless” sequences.

    Likewise, any sequence state changes made by setval are not undone if
    the transaction rolls back.
    -- end unchanged text --

    If the database cluster crashes before committing the transaction
    containing a nextval operation, the sequence advance might not yet
    have made its way to persistent storage, so that it is uncertain
    whether the same value can be returned again after the cluster
    restarts.  This is harmless for usage of the nextval value within
    that transaction, since its other effects will not be visible either.
    However, if you wish to use a nextval result for persistent
    outside-the-database purposes, make sure that the nextval operation
    has been committed before doing so.

Thoughts?

            regards, tom lane

[1] https://www.postgresql.org/docs/current/functions-sequence.html



Re: Sequence's value can be rollback after a crashed recovery.

От
Tom Lane
Дата:
I wrote:
> I wonder though if we shouldn't try to improve the existing text.
> The phrasing "never rolled back" seems like it's too easily
> misinterpreted.  Maybe rewrite the <caution> block like
> ...

A bit of polishing later, maybe like the attached.

            regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 74d3087a72..9e5ce3163a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17645,11 +17645,11 @@ SELECT setval('myseq', 42, false);    <lineannotation>Next <function>nextval</fu
   <caution>
    <para>
     To avoid blocking concurrent transactions that obtain numbers from
-    the same sequence, a <function>nextval</function> operation is never
-    rolled back; that is, once a value has been fetched it is considered
-    used and will not be returned again.  This is true even if the
-    surrounding transaction later aborts, or if the calling query ends
-    up not using the value.  For example an <command>INSERT</command> with
+    the same sequence, the value obtained by <function>nextval</function>
+    is not reclaimed for re-use if the calling transaction later aborts.
+    This means that transaction aborts or database crashes can result in
+    gaps in the sequence of assigned values.  That can happen without a
+    transaction abort, too.  For example an <command>INSERT</command> with
     an <literal>ON CONFLICT</literal> clause will compute the to-be-inserted
     tuple, including doing any required <function>nextval</function>
     calls, before detecting any conflict that would cause it to follow
@@ -17661,8 +17661,22 @@ SELECT setval('myseq', 42, false);    <lineannotation>Next <function>nextval</fu
    </para>

    <para>
-    Likewise, any sequence state changes made by <function>setval</function>
-    are not undone if the transaction rolls back.
+    Likewise, sequence state changes made by <function>setval</function>
+    are immediately visible to other transactions, and are not undone if
+    the calling transaction rolls back.
+   </para>
+
+   <para>
+    If the database cluster crashes before committing a transaction
+    containing a <function>nextval</function>
+    or <function>setval</function> call, the sequence state change might
+    not have made its way to persistent storage, so that it is uncertain
+    whether the sequence will have its original or updated state after the
+    cluster restarts.  This is harmless for usage of the sequence within
+    the database, since other effects of uncommitted transactions will not
+    be visible either.  However, if you wish to use a sequence value for
+    persistent outside-the-database purposes, make sure that the
+    <function>nextval</function> call has been committed before doing so.
    </para>
   </caution>


Re: Sequence's value can be rollback after a crashed recovery.

От
"Bossart, Nathan"
Дата:
On 11/23/21, 1:41 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> I wonder though if we shouldn't try to improve the existing text.
>> The phrasing "never rolled back" seems like it's too easily
>> misinterpreted.  Maybe rewrite the <caution> block like
>> ...
>
> A bit of polishing later, maybe like the attached.

The doc updates look good to me.  Yesterday I suggested possibly
adding a way to ensure that nextval() called in an uncommitted
transaction was persistent, but I think we'd have to also ensure that
synchronous replication waits for those records, too.  Anyway, I don't
think it is unreasonable to require the transaction to be committed to
avoid duplicates from nextval().

Nathan


Re: Sequence's value can be rollback after a crashed recovery.

От
Andy Fan
Дата:

You could say that that's the same rookie error as relying on the
persistence of any other uncommitted DML ... 

IIUC, This is not the same as uncommitted DML exactly.   For any uncommitted 
DML, it is a rollback for sure.  But as for sequence, The xmax is not changed
during sequence's value update by design and we didn't maintain the multi versions
for sequence,  so sequence can't be rolled back clearly.  The fact is a dirty data page flush 
can persist the change no matter the txn is committed or aborted.  The below example
can show the difference: 

SELECT nextval('s'); -- 1
begin;
SELECT nextval('s');  \watch 0.1 for a while, many checkpointer or data flush happened. 
-- crashed. 

If we run nextval('s') from the recovered system, we probably will _not_ get
the 2 (assume cachesize=1) like uncommitted DML.


--
Best Regards
Andy Fan 

Re: Sequence's value can be rollback after a crashed recovery.

От
Laurenz Albe
Дата:
On Tue, 2021-11-23 at 16:41 -0500, Tom Lane wrote:
> I wrote:
> > I wonder though if we shouldn't try to improve the existing text.
> > The phrasing "never rolled back" seems like it's too easily
> > misinterpreted.  Maybe rewrite the <caution> block like
> > ...
> 
> A bit of polishing later, maybe like the attached.

That looks good to me.

Yours,
Laurenz Albe




Re: Sequence's value can be rollback after a crashed recovery.

От
Tom Lane
Дата:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Tue, 2021-11-23 at 16:41 -0500, Tom Lane wrote:
>> A bit of polishing later, maybe like the attached.

> That looks good to me.

Pushed, thanks.

            regards, tom lane