Discussion: BUG #14326: Unexpected status after crash during exclusive backup

BUG #14326: Unexpected status after crash during exclusive backup

From
marco.nenciarini@2ndquadrant.it
Date:
The following bug has been logged on the website:

Bug reference:      14326
Logged by:          Marco Nenciarini
Email address:      marco.nenciarini@2ndquadrant.it
PostgreSQL version: 9.6rc1
Operating system:   Any
Description:

I was investigating a Postgres standby that was never reaching the
consistent recovery state and I discovered something unexpected in the
pg_controldata output:

Backup start location: 5E4/7C000028
Backup end location: 0/0

The standby was built using a cold backup of the master data directory, so I
was surprised to find "Backup start location" different from 0/0.

The replication was working correctly and the standby was perfectly aligned
with the master, moreover, the position 5E4/7C000028 was very old compared
to the latest checkpoint location, which was 675/78329748.

After further investigation I discovered that the cause of the issue was a
system crash which happened a month ago. Unfortunately when the system
crashed, an exclusive backup was running, so at restart it found a valid
backup_label and, given that the WAL file containing the backup start point
was still available, it started a backup recovery.

There is the issue: Postgres will never find the XLOG_BACKUP_END record
corresponding to the backupStartPoint recorded in control data, because it
was never written, so it will never reach the consistency point.

This has no user-visible effects unless the Postgres instance enters the
archive recovery state, in that case hot standby will never be activated.
Also, it doesn't impact any backup eventually taken from the instance
because to recover from the backup, the instance will go through a backup
recovery that will reset the backupStartPoint value.

The workaround I found to reset this state is to force the instance through
another backup recovery, by starting an exclusive backup, saving the
backup_label, stopping the backup and restarting the instance with the saved
backup_label in place.

I don't know the best way to handle this situation, but at least, I'd like a
warning message when the instance exits from the crash recovery while
backupStartPoint is still set.

This behaviour is present in every supported Postgres release and on master
as well.

Regards,
Marco
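
Editorial sketch (not part of the original message): the symptom reported above can be checked by parsing pg_controldata output and measuring how far the recorded backup start location lags behind the latest checkpoint. The helper names below (parse_lsn, parse_controldata, backup_start_lag) are hypothetical. The sample text is abridged to three fields; the two backup lines come from the report, and the "Latest checkpoint location" label is assumed to match what pg_controldata prints. An LSN of the form HIGH/LOW is treated as a 64-bit value with HIGH in the upper 32 bits.

```python
def parse_lsn(text):
    """Convert an LSN such as '5E4/7C000028' into a 64-bit integer."""
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def parse_controldata(output):
    """Turn the 'Label: value' lines of pg_controldata output into a dict."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, since some values contain colons.
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


def backup_start_lag(fields):
    """Return the WAL distance in bytes between the backup start location
    and the latest checkpoint, or None when no backup start is recorded."""
    start = parse_lsn(fields.get("Backup start location", "0/0"))
    if start == 0:
        return None
    return parse_lsn(fields["Latest checkpoint location"]) - start


# Abridged sample built from the values quoted in the bug report.
SAMPLE = """\
Latest checkpoint location:           675/78329748
Backup start location:                5E4/7C000028
Backup end location:                  0/0
"""

lag = backup_start_lag(parse_controldata(SAMPLE))
if lag is not None:
    print("backup start lags the latest checkpoint by %.1f GiB of WAL" % (lag / 2**30))
```

A non-zero backup start location is normal while a restored backup is still being recovered, so a result other than None is only a hint. A very large gap on an instance that was never restored from a base backup, as in this report, is what suggests a leftover from an interrupted exclusive backup.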

Re: BUG #14326: Unexpected status after crash during exclusive backup

From
Michael Paquier
Date:
On Fri, Sep 16, 2016 at 6:54 PM,  <marco.nenciarini@2ndquadrant.it> wrote:
> The workaround I found to reset this state is to force the instance through
> another backup recovery, by starting an exclusive backup, saving the
> backup_label, stopping the backup and restarting the instance with the saved
> backup_label in place.

That's not user-friendly.

> I don't know the best way to handle this situation, but at least, I'd like a
> warning message when the instance exits from the crash recovery while
> backupStartPoint is still set.

So you would get such a warning even when you restore from a backup
willingly, no? That may confuse users. Now, the case you are referring
to is unfortunately a known problem with exclusive backups... There is
no way to make the difference between a node restored from a backup
and a node that crashed while a backup is taken. And that may be a
reason to make non-exclusive backups more popular because they are
more reliable.
--
Michael

Re: BUG #14326: Unexpected status after crash during exclusive backup

From
Marco Nenciarini
Date:
On 21/09/16 08:50, Michael Paquier wrote:
> On Fri, Sep 16, 2016 at 6:54 PM,  <marco.nenciarini@2ndquadrant.it> wrote:
>> The workaround I found to reset this state is to force the instance through
>> another backup recovery, by starting an exclusive backup, saving the
>> backup_label, stopping the backup and restarting the instance with the saved
>> backup_label in place.
>
> That's not user-friendly.
>

I agree, it isn't. But it's the only way to reset that state with the
currently available tools. Perhaps the pg_resetxlog tool could be
modified to allow the user to reset only that value.


>> I don't know the best way to handle this situation, but at least, I'd like a
>> warning message when the instance exits from the crash recovery while
>> backupStartPoint is still set.
>
> So you would get such a warning even when you restore from a backup
> willingly, no? That may confuse users. Now, the case you are referring
> to is unfortunately a known problem with exclusive backups... There is
> no way to make the difference between a node restored from a backup
> and a node that crashed while a backup is taken. And that may be a
> reason to make non-exclusive backups more popular because they are
> more reliable.

You are right, any solution to this issue must not interfere
with normal recovery from a backup.

To mitigate the effect, we could reset the backupStartPoint field
during the pg_start_backup invocation. That way, if a backup is
interrupted by a reboot, the stale state will be cleaned up when the
next backup starts.

Another possibility could be to emit a warning (and maybe reset the
backupStartPoint value) during the shutdown of an instance that is fully
"in production".

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
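
Editorial sketch (not part of the original thread): the workaround discussed above (start an exclusive backup, save backup_label, stop the backup, restart with the saved backup_label in place) could be laid out as an ordered command plan. This is a dry run that only prints commands. The function name workaround_plan, the paths, the backup label text, and the exact stop/copy/start sequence used for the restart are assumptions, since the thread does not spell them out. It relies on the exclusive forms of pg_start_backup and pg_stop_backup, which exist in the 9.6-era releases discussed here but were removed in PostgreSQL 15.

```python
import shlex


def workaround_plan(pgdata, saved_label="/tmp/backup_label.saved",
                    label="reset_backup_state"):
    """Return the commands, in order, for the workaround described in the
    thread. Nothing is executed; the caller decides what to do with them."""
    live_label = pgdata + "/backup_label"
    return [
        # 1. Start an exclusive backup (second argument requests a fast checkpoint).
        ["psql", "-c", "SELECT pg_start_backup('%s', true)" % label],
        # 2. Save the backup_label file the server just created.
        ["cp", live_label, saved_label],
        # 3. Stop the backup, which writes the backup-end WAL record.
        ["psql", "-c", "SELECT pg_stop_backup()"],
        # 4. Stop the instance (assumed interpretation of "restarting").
        ["pg_ctl", "-D", pgdata, "stop", "-m", "fast"],
        # 5. Put the saved backup_label back in place.
        ["cp", saved_label, live_label],
        # 6. Start again so the server goes through a backup recovery.
        ["pg_ctl", "-D", pgdata, "start"],
    ]


# Hypothetical data directory, for illustration only.
for command in workaround_plan("/var/lib/postgresql/9.6/main"):
    print(" ".join(shlex.quote(arg) for arg in command))
```

Printing the plan rather than running it is deliberate: as Michael notes, the procedure is not user-friendly, and each step should be reviewed against the actual installation before anything is executed.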