Обсуждение: [GENERAL] checkpoint clarifications needed

Поиск

Список

Период

Сортировка

[GENERAL] checkpoint clarifications needed

От

Tom DalPozzo

Дата:

09 января 2017 г., 17:14:32

Hi, I need some clarifications about checkpoints.

Below here a log from my standby server when started and then some parts of the interested WAL in the master's cluster obtained by pg_xlogdump.

Just to have an example to talk on.

1) I see: "LOG: redo starts at 1/F00A7448" . I was expecting a checkpoint record around 1/F00A7448 but the related checkpoint record is at lsn: 1/FCBD7510 instead.

I expected that because I read in the doc :"Checkpoints are points in the sequence of transactions at which it is guaranteed that the heap and index data files have been updated with ALL information written BEFORE that checkpoint".

And I interpreted that as "All information written before that checkpoint RECORD" but now I guess that one thing is a checkpoint point and one thing is a checkpoint record. Right?

2) I see that a checkpoint position can be right in the middle of a group of records related to a transaction (in the example, transaction id 10684). So a checkpoint position is NOT a consistency state point, right?

3) According to doc at 29.5, in pg_control the position of the last checkpoint record (not the checkpoint position itself) is saved right?

4) If I'm right at 2) then, between the checkpoint position (1/F00A7448 ) and the checkpoint record position (1/FCBD7510) there must be a point where the DB is in a consistency state. If not, in case of crash just after writing the checkpoint record to the WAL and its position to pg_control, the system would replay from the checkpoint position (known by last checkpoint record) without finding a consistency state. Right?

5) How can we define, in terms of log records, a consistency state position?

Best regards

Pupillo

STANDBY SERVER LOG

LOG: redo starts at 1/F00A7448

....

LOG: consistent recovery state reached at 2/426DF28

LOG: invalid record length at 2/426DF28: wanted 24, got 0

LOG: database system is ready to accept read only connections

LOG: started streaming WAL from primary at 2/4000000 on timeline 1

FROM PG_XLOGDUMP OF MASTER

........

rmgr: Heap len (rec/tot): 14/ 1186, tx: 10684, lsn: 1/F009EEE0, prev 1/F009EEA8, desc: UPDATE off 1 xmax 10684 ; new off 3 xmax 0, blkref #0: rel 1663/16384/16422 blk 20054, blkref #1: rel 1663/16384/16422 blk 19774

rmgr: Btree len (rec/tot): 2/ 64, tx: 10684, lsn: 1/F009F388, prev 1/F009EEE0, desc: INSERT_LEAF off 2, blkref #0: rel 1663/16384/16428 blk 711

rmgr: XLOG len (rec/tot): 0/ 8193, tx: 0, lsn: 1/F009F3C8, prev 1/F009F388, desc: FPI_FOR_HINT , blkref #0: rel 1663/16384/16484 blk 9663 FPW

rmgr: XLOG len (rec/tot): 0/ 8193, tx: 0, lsn: 1/F00A13E8, prev 1/F009F3C8, desc: FPI_FOR_HINT , blkref #0: rel 1663/16384/16484 blk 9664 FPW

rmgr: XLOG len (rec/tot): 0/ 8193, tx: 0, lsn: 1/F00A3408, prev 1/F00A13E8, desc: FPI_FOR_HINT , blkref #0: rel 1663/16384/16484 blk 9665 FPW

rmgr: XLOG len (rec/tot): 0/ 8193, tx: 0, lsn: 1/F00A5428, prev 1/F00A3408, desc: FPI_FOR_HINT , blkref #0: rel 1663/16384/16484 blk 9666 FPW

rmgr: XLOG len (rec/tot): 0/ 8193, tx: 0, lsn: 1/F00A7448, prev 1/F00A5428, desc: FPI_FOR_HINT , blkref #0: rel 1663/16384/16484 blk 9667 FPW

rmgr: XLOG len (rec/tot): 0/ 8193, tx: 0, lsn: 1/F00A9468, prev 1/F00A7448, desc: FPI_FOR_HINT , blkref #0: rel 1663/16384/16484 blk 9668 FPW

rmgr: XLOG len (rec/tot): 0/ 8193, tx: 0, lsn: 1/F00AB488, prev 1/F00A9468, desc: FPI_FOR_HINT , blkref #0: rel 1663/16384/16484 blk 9669 FPW

rmgr: Heap len (rec/tot): 8/ 8063, tx: 10682, lsn: 1/F00AD4A8, prev 1/F00AB488, desc: LOCK off 5: xid 10682: flags 0 LOCK_ONLY EXCL_LOCK , blkref #0: rel 1663/16384/16422 blk 19787 FPW

rmgr: Heap len (rec/tot): 8/ 8063, tx: 10684, lsn: 1/F00AF440, prev 1/F00AD4A8, desc: LOCK off 2: xid 10684: flags 0 LOCK_ONLY EXCL_LOCK , blkref #0: rel 1663/16384/16422 blk 19774 FPW

rmgr: Heap len (rec/tot): 14/ 4657, tx: 10682, lsn: 1/F00B13D8, prev 1/F00AF440, desc: UPDATE off 5 xmax 10682 ; new off 4 xmax 0, blkref #0: rel 1663/16384/16422 blk 20075 FPW, blkref #1: rel 1663/16384/16422 blk 19787

rmgr: Heap len (rec/tot): 14/ 4657, tx: 10684, lsn: 1/F00B2628, prev 1/F00B13D8, desc: UPDATE off 2 xmax 10684 ; new off 4 xmax 0, blkref #0: rel 1663/16384/16422 blk 20054 FPW, blkref #1: rel 1663/16384/16422 blk 19774

rmgr: Btr

.......

rmgr: Heap2 len (rec/tot): 8/ 68, tx: 0, lsn: 1/FCBD7448, prev 1/FCBD7400, desc: CLEAN remxid 10903, blkref #0: rel 1663/16384/16422 blk 2001

rmgr: Heap2 len (rec/tot): 8/ 66, tx: 0, lsn: 1/FCBD7490, prev 1/FCBD7448, desc: CLEAN remxid 10903, blkref #0: rel 1663/16384/16422 blk 2003

rmgr: Standby len (rec/tot): 24/ 50, tx: 0, lsn: 1/FCBD74D8, prev 1/FCBD7490, desc: RUNNING_XACTS nextXid 10907 latestCompletedXid 10906 oldestRunningXid 10907

rmgr: XLOG len (rec/tot): 80/ 106, tx: 0, lsn: 1/FCBD7510, prev 1/FCBD74D8, desc: CHECKPOINT_ONLINE redo 1/F00A7448; tli 1; prev tli 1; fpw true; xid 0:10685; oid 24665; multi 1; offset 0; oldest xid 584 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 10682; online

rmgr: Heap len (rec/tot): 3/ 164, tx: 10907, lsn: 1/FCBD7580, prev 1/FCBD7510, desc: INSERT off 25, blkref #0: rel 1663/16384/16484 blk 16398

rmgr: Btree len (rec/tot): 2/ 64, tx: 10907, lsn: 1/FCBD7628, prev 1/FCBD7580, desc: INSERT_LEAF off 60, blkref #0: rel 1663/16384/16490 blk 2722

........

Re: [GENERAL] checkpoint clarifications needed

От

Adrian Klaver

Дата:

09 января 2017 г., 17:26:29

On 01/09/2017 06:14 AM, Tom DalPozzo wrote:
> Hi, I need some clarifications about checkpoints.
> Below here a log from my standby server when started and then some parts
> of the interested WAL in the master's cluster  obtained by pg_xlogdump.
> Just to have an example to talk on.
>
> 1) I see: "LOG:  redo starts at 1/F00A7448" . I was expecting a
> checkpoint record around 1/F00A7448 but the related checkpoint record is
> at lsn: 1/FCBD7510 instead.
> I expected that because I read in the doc :"Checkpoints are points in
> the sequence of transactions at which it is guaranteed that the heap and
> index data files have been updated with ALL information written BEFORE
> that checkpoint".
> And I interpreted that as "All information written before that
> checkpoint RECORD" but now I guess that one thing is a checkpoint point
> and one thing is a checkpoint record. Right?
>
> 2) I see that a checkpoint position can be right in the middle of a
> group of records related to a transaction (in the example, transaction
> id 10684). So a checkpoint position is NOT a consistency state point, right?
>
> 3) According to doc at 29.5,  in pg_control the position of the last
> checkpoint record (not the checkpoint position itself) is saved  right?
>
> 4) If I'm right at 2) then, between the checkpoint position (1/F00A7448
> ) and the checkpoint record position (1/FCBD7510) there must be a point
> where the DB is in a consistency state. If not, in case of crash just
> after writing the checkpoint record to the WAL and its position to
> pg_control, the system would replay from the checkpoint position (known
> by  last checkpoint record) without finding a consistency state. Right?
>
> 5) How can we define, in terms of log records, a consistency state position?

https://www.postgresql.org/docs/9.5/static/wal-internals.html

"After a checkpoint has been made and the log flushed, the checkpoint's
position is saved in the file pg_control. Therefore, at the start of
recovery, the server first reads pg_control and then the checkpoint
record; then it performs the REDO operation by scanning forward from the
log position indicated in the checkpoint record. Because the entire
content of data pages is saved in the log on the first page modification
after a checkpoint (assuming full_page_writes is not disabled), all
pages changed since the checkpoint will be restored to a consistent state."

>
> Best regards
> Pupillo
>
> STANDBY SERVER LOG
> LOG:  redo starts at 1/F00A7448
> ....
> LOG:  consistent recovery state reached at 2/426DF28
> LOG:  invalid record length at 2/426DF28: wanted 24, got 0
> LOG:  database system is ready to accept read only connections
> LOG:  started streaming WAL from primary at 2/4000000 on timeline 1
>
>
> FROM PG_XLOGDUMP OF MASTER
> ........
> rmgr: Heap        len (rec/tot):     14/  1186, tx:      10684, lsn:
> 1/F009EEE0, prev 1/F009EEA8, desc: UPDATE off 1 xmax 10684 ; new off 3
> xmax 0, blkref #0: rel 1663/16384/16422 blk 20054, blkref #1: rel
> 1663/16384/16422 blk 19774
> rmgr: Btree       len (rec/tot):      2/    64, tx:      10684, lsn:
> 1/F009F388, prev 1/F009EEE0, desc: INSERT_LEAF off 2, blkref #0: rel
> 1663/16384/16428 blk 711
> rmgr: XLOG        len (rec/tot):      0/  8193, tx:          0, lsn:
> 1/F009F3C8, prev 1/F009F388, desc: FPI_FOR_HINT , blkref #0: rel
> 1663/16384/16484 blk 9663 FPW
> rmgr: XLOG        len (rec/tot):      0/  8193, tx:          0, lsn:
> 1/F00A13E8, prev 1/F009F3C8, desc: FPI_FOR_HINT , blkref #0: rel
> 1663/16384/16484 blk 9664 FPW
> rmgr: XLOG        len (rec/tot):      0/  8193, tx:          0, lsn:
> 1/F00A3408, prev 1/F00A13E8, desc: FPI_FOR_HINT , blkref #0: rel
> 1663/16384/16484 blk 9665 FPW
> rmgr: XLOG        len (rec/tot):      0/  8193, tx:          0, lsn:
> 1/F00A5428, prev 1/F00A3408, desc: FPI_FOR_HINT , blkref #0: rel
> 1663/16384/16484 blk 9666 FPW
> rmgr: XLOG        len (rec/tot):      0/  8193, tx:          0, lsn:
> 1/F00A7448, prev 1/F00A5428, desc: FPI_FOR_HINT , blkref #0: rel
> 1663/16384/16484 blk 9667 FPW
> rmgr: XLOG        len (rec/tot):      0/  8193, tx:          0, lsn:
> 1/F00A9468, prev 1/F00A7448, desc: FPI_FOR_HINT , blkref #0: rel
> 1663/16384/16484 blk 9668 FPW
> rmgr: XLOG        len (rec/tot):      0/  8193, tx:          0, lsn:
> 1/F00AB488, prev 1/F00A9468, desc: FPI_FOR_HINT , blkref #0: rel
> 1663/16384/16484 blk 9669 FPW
> rmgr: Heap        len (rec/tot):      8/  8063, tx:      10682, lsn:
> 1/F00AD4A8, prev 1/F00AB488, desc: LOCK off 5: xid 10682: flags 0
> LOCK_ONLY EXCL_LOCK , blkref #0: rel 1663/16384/16422 blk 19787 FPW
> rmgr: Heap        len (rec/tot):      8/  8063, tx:      10684, lsn:
> 1/F00AF440, prev 1/F00AD4A8, desc: LOCK off 2: xid 10684: flags 0
> LOCK_ONLY EXCL_LOCK , blkref #0: rel 1663/16384/16422 blk 19774 FPW
> rmgr: Heap        len (rec/tot):     14/  4657, tx:      10682, lsn:
> 1/F00B13D8, prev 1/F00AF440, desc: UPDATE off 5 xmax 10682 ; new off 4
> xmax 0, blkref #0: rel 1663/16384/16422 blk 20075 FPW, blkref #1: rel
> 1663/16384/16422 blk 19787
> rmgr: Heap        len (rec/tot):     14/  4657, tx:      10684, lsn:
> 1/F00B2628, prev 1/F00B13D8, desc: UPDATE off 2 xmax 10684 ; new off 4
> xmax 0, blkref #0: rel 1663/16384/16422 blk 20054 FPW, blkref #1: rel
> 1663/16384/16422 blk 19774
> rmgr: Btr
> .......
> rmgr: Heap2       len (rec/tot):      8/    68, tx:          0, lsn:
> 1/FCBD7448, prev 1/FCBD7400, desc: CLEAN remxid 10903, blkref #0: rel
> 1663/16384/16422 blk 2001
> rmgr: Heap2       len (rec/tot):      8/    66, tx:          0, lsn:
> 1/FCBD7490, prev 1/FCBD7448, desc: CLEAN remxid 10903, blkref #0: rel
> 1663/16384/16422 blk 2003
> rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn:
> 1/FCBD74D8, prev 1/FCBD7490, desc: RUNNING_XACTS nextXid 10907
> latestCompletedXid 10906 oldestRunningXid 10907
> rmgr: XLOG        len (rec/tot):     80/   106, tx:          0, lsn:
> 1/FCBD7510, prev 1/FCBD74D8, desc: CHECKPOINT_ONLINE redo 1/F00A7448;
> tli 1; prev tli 1; fpw true; xid 0:10685; oid 24665; multi 1; offset 0;
> oldest xid 584 in DB 1; oldest multi 1 in DB 1; oldest/newest commit
> timestamp xid: 0/0; oldest running xid 10682; online
> rmgr: Heap        len (rec/tot):      3/   164, tx:      10907, lsn:
> 1/FCBD7580, prev 1/FCBD7510, desc: INSERT off 25, blkref #0: rel
> 1663/16384/16484 blk 16398
> rmgr: Btree       len (rec/tot):      2/    64, tx:      10907, lsn:
> 1/FCBD7628, prev 1/FCBD7580, desc: INSERT_LEAF off 60, blkref #0: rel
> 1663/16384/16490 blk 2722
> ........
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: [GENERAL] checkpoint clarifications needed

От

Tom DalPozzo

Дата:

09 января 2017 г., 17:47:37

https://www.postgresql.org/docs/9.5/static/wal-internals.html

"After a checkpoint has been made and the log flushed, the checkpoint's position is saved in the file pg_control. Therefore, at the start of recovery, the server first reads pg_control and then the checkpoint record; then it performs the REDO operation by scanning forward from the log position indicated in the checkpoint record. Because the entire content of data pages is saved in the log on the first page modification after a checkpoint (assuming full_page_writes is not disabled), all pages changed since the checkpoint will be restored to a consistent state."

Hi, yes I know that, it's what I meant in my point 3). As it says first "the checkpoint's position (NOT the record)is saved in the file pg_control" then I had some doubt.

Regards

Pupillo

Re: [GENERAL] checkpoint clarifications needed

От

Adrian Klaver

Дата:

09 января 2017 г., 18:11:19

On 01/09/2017 06:47 AM, Tom DalPozzo wrote:
>         https://www.postgresql.org/docs/9.5/static/wal-internals.html
>         <https://www.postgresql.org/docs/9.5/static/wal-internals.html>
>
>
>     "After a checkpoint has been made and the log flushed, the
>     checkpoint's position is saved in the file pg_control. Therefore, at
>     the start of recovery, the server first reads pg_control and then
>     the checkpoint record; then it performs the REDO operation by
>     scanning forward from the log position indicated in the checkpoint
>     record. Because the entire content of data pages is saved in the log
>     on the first page modification after a checkpoint (assuming
>     full_page_writes is not disabled), all pages changed since the
>     checkpoint will be restored to a consistent state."
>
>
> Hi, yes I know that, it's what I meant in my point 3). As it says first
>  "the checkpoint's position (NOT the record)is saved in the file
> pg_control" then I had some doubt.

Yes it is just one piece of information stored in the file.

To see what else is stored there do:

pg_controldata -D your_cluster_data_directory



> Regards
> Pupillo
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: [GENERAL] checkpoint clarifications needed

От

Adrian Klaver

Дата:

09 января 2017 г., 18:37:40

On 01/09/2017 06:47 AM, Tom DalPozzo wrote:
>         https://www.postgresql.org/docs/9.5/static/wal-internals.html
>         <https://www.postgresql.org/docs/9.5/static/wal-internals.html>
>
>
>     "After a checkpoint has been made and the log flushed, the
>     checkpoint's position is saved in the file pg_control. Therefore, at
>     the start of recovery, the server first reads pg_control and then
>     the checkpoint record; then it performs the REDO operation by
>     scanning forward from the log position indicated in the checkpoint
>     record. Because the entire content of data pages is saved in the log
>     on the first page modification after a checkpoint (assuming
>     full_page_writes is not disabled), all pages changed since the
>     checkpoint will be restored to a consistent state."
>
>
> Hi, yes I know that, it's what I meant in my point 3). As it says first
>  "the checkpoint's position (NOT the record)is saved in the file
> pg_control" then I had some doubt.

Reread your original post and realized you where also asking about
transaction consistency and WALs. The thumbnail version is that Postgres
writes transactions to the WALs before they are written to the data
files on disk. A checkpoint represents a point in the sequence when is
is known that the changes recorded in the WAL have been also recorded in
the disk data files. So Postgres then knows that in a recovery scenario
it needs to only redo/replay the WAL changes that are past the last
checkpoint. So the transactions are there it is just a matter of if they
need to be replayed or not. This is subject to caveats:

https://www.postgresql.org/docs/9.5/static/wal-reliability.html

> Regards
> Pupillo
>
>

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: [GENERAL] checkpoint clarifications needed

От

Tom DalPozzo

Дата:

10 января 2017 г., 00:10:05

Reread your original post and realized you where also asking about transaction consistency and WALs. The thumbnail version is that Postgres writes transactions to the WALs before they are written to the data files on disk. A checkpoint represents a point in the sequence when is is known that the changes recorded in the WAL have been also recorded in the disk data files. So Postgres then knows that in a recovery scenario it needs to only redo/replay the WAL changes that are past the last checkpoint. So the transactions are there it is just a matter of if they need to be replayed or not. This is subject to caveats:

https://www.postgresql.org/docs/9.5/static/wal-reliability.html

Hi, I had already read that doc but I can't answer clearly to my questions 2,4 and 5.

Regards

Pupillo

Re: [GENERAL] checkpoint clarifications needed

От

Adrian Klaver

Дата:

10 января 2017 г., 00:24:24

On 01/09/2017 01:10 PM, Tom DalPozzo wrote:
>         Reread your original post and realized you where also asking
>         about transaction consistency and WALs. The thumbnail version is
>         that Postgres writes transactions to the WALs before they are
>         written to the data files on disk. A checkpoint represents a
>         point in the sequence when is is known that the changes recorded
>         in the WAL have been also recorded in the disk data files. So
>         Postgres then knows that in a recovery scenario it needs to only
>         redo/replay the WAL changes that are past the last checkpoint.
>         So the transactions are there it is just a matter of if they
>         need to be replayed or not. This is subject to caveats:
>
>
>     https://www.postgresql.org/docs/9.5/static/wal-reliability.html
>     <https://www.postgresql.org/docs/9.5/static/wal-reliability.html>
>
>
> Hi, I had already read that doc but I can't  answer clearly to my
> questions 2,4 and 5.

The answer would seem to depend on what you consider 'a consistency
state position'. Is it possible to be more explicit about what you mean?

> Regards
> Pupillo
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: [GENERAL] checkpoint clarifications needed

От

Alvaro Herrera

Дата:

10 января 2017 г., 01:01:23

Tom DalPozzo wrote:

> 2) I see that a checkpoint position can be right in the middle of a group
> of records related to a transaction (in the example, transaction id 10684).
> So a checkpoint position is NOT a consistency state point, right?

> 4) If I'm right at 2) then, between the checkpoint position (1/F00A7448 )
> and the checkpoint record position (1/FCBD7510) there must be a point where
> the DB is in a consistency state. If not, in case of crash just after
> writing the checkpoint record to the WAL and its position to pg_control,
> the system would replay from the checkpoint position (known by  last
> checkpoint record) without finding a consistency state. Right?
>
> 5) How can we define, in terms of log records, a consistency state position?

Whether any individual tuple in the data files is visible or not depends
not only on the data itself, but also on the commit status of the
transactions that created it (and deleted it, if any).  Replaying WAL
also updates the commit status of transactions, so if you're in the
middle of replaying WAL, you may be adding tuples to the data files, but
those tuples will not become visible until their commit records are also
updated.

You can stop replaying WAL at any point, and data will always be in a
consistent state.  Some data tuples might be "from the future" and those
will not be visible, which is what makes it all consistent.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [GENERAL] checkpoint clarifications needed

От

Tom DalPozzo

Дата:

10 января 2017 г., 01:23:27

Whether any individual tuple in the data files is visible or not depends
not only on the data itself, but also on the commit status of the
transactions that created it (and deleted it, if any). Replaying WAL
also updates the commit status of transactions, so if you're in the
middle of replaying WAL, you may be adding tuples to the data files, but
those tuples will not become visible until their commit records are also
updated.

You can stop replaying WAL at any point, and data will always be in a
consistent state. Some data tuples might be "from the future" and those
will not be visible, which is what makes it all consistent.

Hi,

so let's suppose that the WAL is:

LSN 10: start transaction 123

LSN 11: update tuple 100

checkpoint position here (not a record but just for understanding)

LSN 12: update tuple 100

LSN 13: update tuple 100

LSN 14: checkpoint record ( postion=11)

LSN 15: update tuple 100

and that the system crashes now, before ending to write all the transaction's recs to the WAL (other updates and commit record missing).

At the replay, starting from LSN 12, the entire page we had at LSN 11 is written to the disk, though carrying inconsistent data.

Then we can even replay up to the end of WAL but always getting inconsistent data.

BUT, you say, as the tuple is not commited in the WAL, only the old version of the tuple will be visible? Right?

Regards

Pupillo

Re: [GENERAL] checkpoint clarifications needed

От

Alvaro Herrera

Дата:

10 января 2017 г., 01:31:32

Tom DalPozzo wrote:

> Hi,
> so let's suppose that the WAL is:
> LSN 10: start transaction 123
> LSN 11: update tuple 100
>    checkpoint position here (not a record but just for understanding)
> LSN 12: update tuple 100
> LSN 13: update tuple 100
> LSN 14: checkpoint record ( postion=11)
> LSN 15: update tuple 100
> and that the system crashes now, before ending to write all the
> transaction's recs to the WAL  (other updates and commit record missing).
>
> At the replay, starting from LSN 12, the entire page we had at LSN 11 is
> written to the disk, though carrying inconsistent data.
> Then we can even replay up to the end of WAL but always getting
> inconsistent data.
> BUT, you say, as the tuple is not commited in the WAL, only the old version
> of the tuple will be visible? Right?

Yes -- all the updated tuples are invisible because the commit record
for transaction 123 does not appear in wal.  A future VACUUM will remove
all those tuples.  Note that precisely for this reason, the original
version of the tuple had not been removed yet.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [GENERAL] checkpoint clarifications needed

От

Tom DalPozzo

Дата:

10 января 2017 г., 01:45:41

> Hi,
> so let's suppose that the WAL is:
> LSN 10: start transaction 123
> LSN 11: update tuple 100
> checkpoint position here (not a record but just for understanding)
> LSN 12: update tuple 100
> LSN 13: update tuple 100
> LSN 14: checkpoint record ( postion=11)
> LSN 15: update tuple 100
> and that the system crashes now, before ending to write all the
> transaction's recs to the WAL (other updates and commit record missing).
>
> At the replay, starting from LSN 12, the entire page we had at LSN 11 is
> written to the disk, though carrying inconsistent data.
> Then we can even replay up to the end of WAL but always getting
> inconsistent data.
> BUT, you say, as the tuple is not commited in the WAL, only the old version
> of the tuple will be visible? Right?

Yes -- all the updated tuples are invisible because the commit record
for transaction 123 does not appear in wal. A future VACUUM will remove
all those tuples. Note that precisely for this reason, the original
version of the tuple had not been removed yet.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Ok! Now many things are clear to me....

Thank you very much

Pupillo

Re: [GENERAL] checkpoint clarifications needed

От

Tom DalPozzo

Дата:

10 января 2017 г., 01:47:52

Hi, I had already read that doc but I can't answer clearly to my
questions 2,4 and 5.

The answer would seem to depend on what you consider 'a consistency state position'. Is it possible to be more explicit about what you mean?

Hi, I meant a position such that, if you replay up to it, then the DB is in a consistent state (transactions done entirely or not a t all...).

But, as Alvaro wrote, any position is ok

Thank you very much

Pupillo

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: [GENERAL] checkpoint clarifications needed