Discussion: [HACKERS] Failed recovery with new faster 2PC code
After this commit, I get crash recovery failures when using prepared transactions.
commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
Author: Simon Riggs <simon@2ndQuadrant.com>
Date: Tue Apr 4 15:56:56 2017 -0400
Speedup 2PC recovery by skipping two phase state files in normal path
After the induced crash, I get this failure in recovery:
FATAL: could not access status of transaction 334419347
DETAIL: Could not open file "pg_xact/013E": No such file or directory.
LOG: startup process (PID 60106) exited with exit code 1
LOG: aborting startup due to startup process failure
LOG: database system is shut down
The earliest file which exists in pg_xact is 0176
Other examples:
FATAL: could not access status of transaction 121729737
DETAIL: Could not open file "pg_xact/0074": No such file or directory.
LOG: startup process (PID 23720) exited with exit code 1
FATAL: could not access status of transaction 181325554
DETAIL: Could not open file "pg_xact/00AC": No such file or directory.
LOG: startup process (PID 8375) exited with exit code 1
I experience this in about 1 out of 15 crash-recovery cycles on 8 CPUs.
The patch Pavan posted here did not make any difference:
I've attached the test harness, which I think will look familiar to y'all. It is the usual injection of torn-page-write crashes with consistency checks after recovery (which makes no difference, as the issue is that recovery does not happen), modified to include a very crude transaction manager to make use of 2PC.
Cheers,
Jeff
Attachments
On 15 April 2017 at 23:37, Jeff Janes <jeff.janes@gmail.com> wrote:
> After this commit, I get crash recovery failures when using prepared
> transactions.
>
> commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
> Author: Simon Riggs <simon@2ndQuadrant.com>
> Date: Tue Apr 4 15:56:56 2017 -0400
>
> Speedup 2PC recovery by skipping two phase state files in normal path

Thanks Jeff for your tests.

So that's now two crash bugs in as many days and lack of clarity about how to fix it.

Stas, I thought this patch was very important to you, yet two releases in a row we are too-late-and-buggy.

If anybody has a reason why I shouldn't revert this, please say so now fairly soon. Any further attempts to fix must run Jeff's tests.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On 17 Apr 2017, at 12:14, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 15 April 2017 at 23:37, Jeff Janes <jeff.janes@gmail.com> wrote:
>> After this commit, I get crash recovery failures when using prepared
>> transactions.
>>
>> commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
>> Author: Simon Riggs <simon@2ndQuadrant.com>
>> Date: Tue Apr 4 15:56:56 2017 -0400
>>
>> Speedup 2PC recovery by skipping two phase state files in normal path
>
> Thanks Jeff for your tests.
>
> So that's now two crash bugs in as many days and lack of clarity about
> how to fix it.
>
> Stas, I thought this patch was very important to you, yet two releases
> in a row we are too-late-and-buggy.

I'm looking at the pgstat issue in a nearby thread right now and will switch to this shortly. If that's possible, I'm asking to delay the revert for several days.

Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
>> commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
>> Author: Simon Riggs <simon@2ndQuadrant.com>
>> Date: Tue Apr 4 15:56:56 2017 -0400
>>
>> Speedup 2PC recovery by skipping two phase state files in normal path
>
> Thanks Jeff for your tests.
>
> So that's now two crash bugs in as many days and lack of clarity about
> how to fix it.
>
I am trying to reproduce and debug it from my end as well.
Regards,
Nikhils
On 17 April 2017 at 15:02, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
>> commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
>> Author: Simon Riggs <simon@2ndQuadrant.com>
>> Date: Tue Apr 4 15:56:56 2017 -0400
>>
>> Speedup 2PC recovery by skipping two phase state files in normal path
>
> Thanks Jeff for your tests.
>
> So that's now two crash bugs in as many days and lack of clarity about
> how to fix it.
>
The issue seems to be that while a prepared transaction is yet to be committed, autovacuum comes in and causes the clog to be truncated past this prepared transaction's ID in one of the runs.
We only add the corresponding pgproc entry for a surviving 2PC transaction on completion of recovery, so there could be a race condition here. Digging in further.
Regards,
Nikhils
-- Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
Hi,
There was a bug in the redo 2PC remove code path, because of which autovac would think that the 2PC transaction is gone and remove the corresponding clog entry earlier than needed.
Please find attached the bug fix: 2pc_redo_remove_bug.patch.
I have been testing this on top of Michael's 2pc-restore-fix.patch and things seem to be ok for the past one+ hour. Will keep it running longer.
Jeff, thanks for these very useful scripts. I am going to make a habit to run these scripts on my side from now on. Do you have any other script that I could try against these patches? Please let me know.
Regards,
Nikhils
On 18 April 2017 at 12:09, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
On 17 April 2017 at 15:02, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
>> commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
>> Author: Simon Riggs <simon@2ndQuadrant.com>
>> Date: Tue Apr 4 15:56:56 2017 -0400
>>
>> Speedup 2PC recovery by skipping two phase state files in normal path
>
> Thanks Jeff for your tests.
>
> So that's now two crash bugs in as many days and lack of clarity about
> how to fix it.
> The issue seems to be that a prepared transaction is yet to be committed. But autovacuum comes in and causes the clog to be truncated beyond this prepared transaction ID in one of the runs.
> We only add the corresponding pgproc entry for a surviving 2PC transaction on completion of recovery. So could be a race condition here. Digging in further.
>
> Regards,
> Nikhils
> --
> Nikhil Sontakke http://www.2ndQuadrant.com/
> PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
Attachments
Please find attached a second version of my bug fix which is stylistically better and clearer than the first one.
Regards,
Nikhils
On 18 April 2017 at 13:47, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
Hi,

There was a bug in the redo 2PC remove code path. Because of which, autovac would think that the 2PC is gone and cause removal of the corresponding clog entry earlier than needed.

Please find attached, the bug fix: 2pc_redo_remove_bug.patch.

I have been testing this on top of Michael's 2pc-restore-fix.patch and things seem to be ok for the past one+ hour. Will keep it running for long.

Jeff, thanks for these very useful scripts. I am going to make a habit to run these scripts on my side from now on. Do you have any other script that I could try against these patches? Please let me know.

Regards,
Nikhils

On 18 April 2017 at 12:09, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
On 17 April 2017 at 15:02, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
>> commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
>> Author: Simon Riggs <simon@2ndQuadrant.com>
>> Date: Tue Apr 4 15:56:56 2017 -0400
>>
>> Speedup 2PC recovery by skipping two phase state files in normal path
>
> Thanks Jeff for your tests.
>
> So that's now two crash bugs in as many days and lack of clarity about
> how to fix it.
> The issue seems to be that a prepared transaction is yet to be committed. But autovacuum comes in and causes the clog to be truncated beyond this prepared transaction ID in one of the runs.
> We only add the corresponding pgproc entry for a surviving 2PC transaction on completion of recovery. So could be a race condition here. Digging in further.
>
> Regards,
> Nikhils
> --
> Nikhil Sontakke http://www.2ndQuadrant.com/
> PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
--
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
Attachments
On 18 April 2017 at 09:57, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
> Please find attached a second version of my bug fix which is
> stylistically better and clearer than the first one.

Yeh, this is better. Pushed.

The bug was that the loop set gxact to be the last entry in the array, causing the exit condition to fail and us then to remove the last gxact from memory even when it didn't match the xid, removing a valid entry too early. That then allowed xmin to move forwards, which causes autovac to remove pg_xact entries earlier than needed.

Well done for finding that one, thanks for the patch.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Apr 18, 2017 at 7:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Yeh, this is better. Pushed.

I have been outraced on this one, the error is obvious once you see it ;) Thanks for the investigation and the fix! I have spent a couple of hours reviewing the interactions between the shmem entries of 2PC state data created at the beginning of recovery and all the lookups in procarray.c and varsup.c, noticing nothing by the way.

> The bug was that the loop set gxact to be the last entry in the array,
> causing the exit condition to fail and us then to remove the last
> gxact from memory even when it didn't match the xid, removing a valid
> entry too early. That then allowed xmin to move forwards, which causes
> autovac to remove pg_xact entries earlier than needed.
>
> Well done for finding that one, thanks for the patch.

Running Jeff's test suite, I can confirm that there are no problems now.

--
Michael
On 18 April 2017 at 13:12, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 7:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Yeh, this is better. Pushed.
>
> I have been outraced on this one, the error is obvious once you see it ;)

Didn't realise you were working on it, nothing competitive about it.

It's clear this needed fixing, whether or not it fixes Jeff's report. I do think it explains the report, so I'm hopeful.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Apr 18, 2017 at 5:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 18 April 2017 at 13:12, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 7:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Yeh, this is better. Pushed.
>
> I have been outraced on this one, the error is obvious once you see it ;)
Didn't realise you were working on it, nothing competitive about it.
It's clear this needed fixing, whether or not it fixes Jeff's report.
I do think it explains the report, so I'm hopeful.
The git HEAD code (c727f12) has been surviving so far, with both test cases.
Thanks,
Jeff
On Tue, Apr 18, 2017 at 1:17 AM, Nikhil Sontakke <nikhils@2ndquadrant.com> wrote:
Hi,

There was a bug in the redo 2PC remove code path. Because of which, autovac would think that the 2PC is gone and cause removal of the corresponding clog entry earlier than needed.

Please find attached, the bug fix: 2pc_redo_remove_bug.patch.

I have been testing this on top of Michael's 2pc-restore-fix.patch and things seem to be ok for the past one+ hour. Will keep it running for long.

Jeff, thanks for these very useful scripts. I am going to make a habit to run these scripts on my side from now on. Do you have any other script that I could try against these patches? Please let me know.
This script is the only one I have that specifically targets 2PC. I wrote it last year when the previous round of speed-up code (which avoided writing the files upon "PREPARE" by delaying them until the next checkpoint) was developed. I just decided to dust that test off to try again here. I don't know how to change it to make it more targeted towards this set of patches.

Would this bug have been seen in a replica server in the absence of crashes, or was it only vulnerable during crash recovery rather than streaming replication?
Cheers,
Jeff
On Wed, Apr 19, 2017 at 11:09 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Would this bug have been seen in a replica server in the absence of
> crashes, or was it only vulnerable during crash recovery rather than
> streaming replication?

This issue could have been seen on a streaming standby as well: leaving around a stale TwoPhaseState entry impacts its linked PGPROC too, so CLOG truncation would have been messed up in the same way. That's also the case for the first issue involving incorrect XID updates, though the chances of seeing it are, I think, lower, as only the beginning of recovery was impacted.

--
Michael