Re: BUG #17064: Parallel VACUUM operations cause the error "global/pg_filenode.map contains incorrect checksum"

From Heikki Linnakangas
Subject Re: BUG #17064: Parallel VACUUM operations cause the error "global/pg_filenode.map contains incorrect checksum"
Date
Msg-id 19190f79-cf37-ff18-1b40-07a1a66a1d9e@iki.fi
In reply to Re: BUG #17064: Parallel VACUUM operations cause the error "global/pg_filenode.map contains incorrect checksum"  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: BUG #17064: Parallel VACUUM operations cause the error "global/pg_filenode.map contains incorrect checksum"
List pgsql-bugs
On 23/06/2021 12:45, Thomas Munro wrote:
> On Wed, Jun 23, 2021 at 7:46 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Let's just add the lock there.
> 
> +1, no doubt about that.

Committed that. Thanks for the report, Alexander!

>> ... What about the new kid on the block:
>> Persistent Memory? I found this article:
>> https://lwn.net/Articles/686150/. So at hardware level, Persistent
>> Memory only guarantees atomicity at cache line level (64 bytes). To
>> provide the traditional 512 byte sector atomicity, there's a feature in
>> Linux called BTT. Perhaps we should add a note to the docs that you
>> should enable that.
> 
> Right, also called sector mode.  I don't know enough about that to
> comment really, but... if my google-fu is serving me, you can't
> actually use interesting sector sizes like 8KB (you have to choose 512
> or 4096 bytes), so you'll have to pay for *two* synthetic atomic page
> schemes: BTT and our full page writes.  That makes me wonder... if you
> need to leave full page writes on anyway, maybe it would be a better
> trade-off to do double writes of our special atomic files (relmapper
> files and control file) so that we could safely turn BTT off and avoid
> double-taxation for relation data.  Just a thought.  No pmem
> experience here, I could be way off.

Yeah, you wouldn't want to turn on BTT for anything other than the 
pg_control file. That's the only place where we rely on sector 
atomicity, I believe. For everything else, it just adds overhead. Not 
sure how much overhead; maybe it doesn't matter in practice.

>> We haven't heard of broken control files from the field, so that doesn't
>> seem to be a problem in practice, at least not yet. Still, I would sleep
>> better if the control file had more redundancy. For example, have two
>> copies of it on disk. At startup, read both copies, and if they're both
>> valid, ignore the one with older timestamp. When updating it, write over
>> the older copy. That way, if you crash in the middle of updating it, the
>> old copy is still intact.
> 
> +1, with a flush in between so that only one can be borked no matter
> how the storage works.  It is interesting how few reports there are on
> the mailing list of control file CRC check failures though, if I'm
> searching for the right thing[1].
> 
> [1] https://www.postgresql.org/search/?m=1&q=calculated+CRC+checksum+does+not+match+value+stored+in+file&l=&d=-1&s=r

If anyone wants to write a patch for that, I'd be happy to review it. And 
if anyone has access to a system with pmem hardware, it would be 
interesting to try to reproduce a torn sector and broken control file by 
pulling the power plug.

- Heikki
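
For illustration only, here is a rough sketch of the two-copy scheme discussed above, written in Python rather than as a PostgreSQL patch. Everything in it is an assumption made for the example: the record layout, the file names used in the usage note, and the helper names (encode, decode, load, update) are hypothetical, and a monotonically increasing integer "stamp" stands in for the timestamp mentioned in the thread. It is not how pg_control is actually stored.

```python
import os
import struct
import zlib

# Hypothetical record layout: stamp (uint64), payload length (uint32),
# payload bytes, then a CRC-32 over everything before it.
HEADER = struct.Struct("<QI")
CRC = struct.Struct("<I")


def encode(stamp, payload):
    body = HEADER.pack(stamp, len(payload)) + payload
    return body + CRC.pack(zlib.crc32(body))


def decode(raw):
    """Return (stamp, payload), or None if the copy is torn or corrupt."""
    if len(raw) < HEADER.size + CRC.size:
        return None
    stamp, length = HEADER.unpack_from(raw)
    end = HEADER.size + length
    if len(raw) < end + CRC.size:
        return None
    (stored,) = CRC.unpack_from(raw, end)
    if zlib.crc32(raw[:end]) != stored:
        return None
    return stamp, raw[HEADER.size:end]


def read_copy(path):
    try:
        with open(path, "rb") as f:
            return decode(f.read())
    except FileNotFoundError:
        return None


def load(paths):
    """Read both copies; return (index, stamp, payload) of the newest
    valid one, or None if neither copy is valid."""
    best = None
    for i, path in enumerate(paths):
        rec = read_copy(path)
        if rec is not None and (best is None or rec[0] > best[1]):
            best = (i, rec[0], rec[1])
    return best


def update(paths, payload):
    """Overwrite the older (or invalid) copy, then flush it to disk.
    The newest valid copy is never touched, so a crash mid-write can
    damage at most one of the two files."""
    current = load(paths)
    if current is None:
        target, stamp = 0, 1
    else:
        target, stamp = 1 - current[0], current[1] + 1
    fd = os.open(paths[target], os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, encode(stamp, payload))
        os.fsync(fd)  # the flush between successive updates
    finally:
        os.close(fd)
    return stamp
```

With paths = ["control.0", "control.1"] (made-up names), calling update(paths, data) alternates between the two files, and load(paths) falls back to the surviving copy if the most recent write was torn. A real implementation would presumably also need to fsync the containing directory when a copy is first created, which this sketch skips.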


