Re: BUG #17245: Index corruption involving deduplicated entries

From: Andres Freund
Subject: Re: BUG #17245: Index corruption involving deduplicated entries
Msg-id: 20211028224831.bj7ew3j74tw4cmvh@alap3.anarazel.de
In reply to: Re: BUG #17245: Index corruption involving deduplicated entries  (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: BUG #17245: Index corruption involving deduplicated entries  (Peter Geoghegan <pg@bowt.ie>)
Re: BUG #17245: Index corruption involving deduplicated entries  (Kamigishi Rei <iijima.yun@koumakan.jp>)
List: pgsql-bugs

Hi,

On 2021-10-28 15:23:38 -0700, Peter Geoghegan wrote:
> Anything is possible. But Kamigishi Rei has said that this database
> has never had a hard crash or unclean shut down, which I definitely
> believe. Also, they are using ECC on a Xeon processor. This is the
> kind of hardware that is generally assumed to be very reliable.

That wouldn't protect against e.g. a logic bug in ZFS. Given its copy-on-write
nature, corruption could very well manifest as seeing an older version of the
data when re-reading it from disk, which could very well lead to the type of
corruption we're seeing here.

A few years back I tried to help somebody investigate corruption that turned
out to be caused by something roughly along those lines (IIRC several bugs in
ZFS on Linux, although I don't remember the details anymore).

Not saying that that is the most likely explanation, just something worth
checking.


> Kamigishi Rei has been an exemplary example of how to report a bug to
> an open source community. I want to thank him again. Thanks!

+1


> A second similar complaint from Herman Verschooten on Slack didn't
> mention ZFS at all. A third similar-seeming report on Slack was from
> somebody named Brandon Ros, who used Ubuntu (I believe 20.04, like
> Herman Verschooten). Also no indication that ZFS was used.
> 
> I find it slightly hard to believe that it's ZFS, simply because all 3
> complaints involve Postgres 14. And have a lot of common factors. For
> example, Herman also used foreign keys -- a lot of users never bother
> with them. And like Kamigishi Rei, Herman found that a REINDEX (or was
> it VACUUM FULL?) seemingly made the problem go away.

Didn't 14 change the logic for when index vacuums are done? That could cause
previously existing issues to manifest with a higher likelihood.


Greetings,

Andres Freund


