Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption?

From: Bruce Momjian
Subject: Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption?
Msg-id: ZJ9KD9MgDB9pERAr@momjian.us
In reply to: Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption?  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption?  (Nikolay Samokhvalov <nik@postgres.ai>)
List: pgsql-hackers

On Fri, Jun 30, 2023 at 04:16:31PM -0400, Robert Haas wrote:
> On Fri, Jun 30, 2023 at 1:41 PM Bruce Momjian <bruce@momjian.us> wrote:
> > I think --size-only was chosen only because it is the minimal comparison
> > option.
> 
> I think it's worse than that. I think that the procedure relies on
> using the --size-only option to intentionally trick rsync into
> thinking that files are identical when they're not.
> 
> Say we have a file like base/23246/78901 on the primary. Unless
> wal_log_hints=on, the standby version is very likely different, but
> only in ways that don't matter to WAL replay. So the procedure aims to
> trick rsync into hard-linking the version of that file that exists on
> the standby in the old cluster into the new cluster on the standby,
> instead of copying the slightly-different version from the master,
> thus making the upgrade very fast. If rsync actually checksummed the
> files, it would realize that they're different and copy the file from
> the original primary, which the person who wrote this procedure does
> not want.

What is the problem with having different hint bits between the two
servers?
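
To make the quoted point concrete, here is a minimal sketch (not part of the
documented procedure) of why a size-only comparison treats two files as
identical while a content checksum does not. The file names, the flipped
byte offset, and the use of SHA-256 are assumptions made for illustration;
rsync's --checksum uses its own checksum algorithm, and a real hint-bit
difference lives inside tuple headers rather than at an arbitrary offset.

```python
import hashlib
import os
import tempfile

PAGE_SIZE = 8192  # PostgreSQL's default block size


def same_size(a, b):
    """Roughly what a size-only comparison looks at: file length only."""
    return os.path.getsize(a) == os.path.getsize(b)


def same_checksum(a, b):
    """Compare full file contents via a hash (SHA-256 as a stand-in)."""
    def digest(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()
    return digest(a) == digest(b)


def make_pair(directory):
    """Write two one-page files that differ by a single bit.

    The names below are hypothetical stand-ins for the primary's and the
    standby's copy of the same relation file.
    """
    primary = os.path.join(directory, "primary_78901")
    standby = os.path.join(directory, "standby_78901")
    page = bytearray(PAGE_SIZE)
    with open(primary, "wb") as f:
        f.write(page)
    page[100] ^= 0x01  # simulate one differing hint bit (offset is arbitrary)
    with open(standby, "wb") as f:
        f.write(page)
    return primary, standby


with tempfile.TemporaryDirectory() as d:
    primary, standby = make_pair(d)
    size_match = same_size(primary, standby)
    checksum_match = same_checksum(primary, standby)

print("size-only match:", size_match)
print("checksum match:", checksum_match)
```

With a size-only comparison the two files look identical, so nothing would be
transferred; a checksum comparison reports them as different, which is the
case where rsync would copy the primary's version instead.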

> That's kind of a crazy thing for us to be documenting. I think we
> really ought to consider removing it from this documentation. If somebody
> wants to write a reliable tool for this to ship as part of PostgreSQL,
> well and good. But this procedure has no real sanity checks and is
> based on very fragile assumptions. That doesn't seem suitable for
> end-user use.
> 
> I'm not quite clear on how Nikolay got into trouble here. I don't
> think I understand under exactly what conditions the procedure is
> reliable and under what conditions it isn't. But there is no way in
> heck I would ever advise anyone to use this procedure on a database
> they actually care about. This is a great party trick or something to
> show off in a lightning talk at PGCon, not something you ought to be
> doing with valuable data that you actually care about.

Well, it does get used, and if we remove it perhaps we can have it on
our wiki and point to it from our docs.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.


