Re: Offline enabling/disabling of data checksums

From Tomas Vondra
Subject Re: Offline enabling/disabling of data checksums
Msg-id 348aef85-a967-80d2-7bef-7274909475cf@2ndquadrant.com
In reply to Re: Offline enabling/disabling of data checksums  (Magnus Hagander <magnus@hagander.net>)
Responses Re: Offline enabling/disabling of data checksums  (Michael Paquier <michael@paquier.xyz>)
Re: Offline enabling/disabling of data checksums  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On 12/27/18 11:43 AM, Magnus Hagander wrote:
> 
> 
> On Sat, Dec 22, 2018 at 12:28 AM Michael Paquier <michael@paquier.xyz
> <mailto:michael@paquier.xyz>> wrote:
> 
>     On Fri, Dec 21, 2018 at 09:16:16PM +0100, Michael Banck wrote:
> 
>     I think that this is independently useful, I got this stuff part of an
>     upgrade workflow where the user is ready to accept some extra one-time
>     offline time so as checksums are enabled.
> 
> 
> Very much so, IMHO.
> 
> 
>     > Things I have not done so far:
>     >
>     > 1. Rename pg_verify_checksums to e.g. pg_checksums as it will no
>     longer
>     > only verify checksums.
> 
>     Check.  That sounds right to me.
> 
> 
> Should we double-check with packagers that this won't cause a problem?
> Though the fact that it's done in a major release should make it
> perfectly fine I think -- and it's a smaller change than when we did all
> those xlog->wal changes...
> 

I think it makes little sense to not rename the tool now. I'm pretty
sure we'd end up doing that sooner or later anyway, and we'll just live
with a misnamed tool until then.

> 
>     > 3. Once that patch is in, there would be a way to disable checksums so
>     > there'd be a case to also change the initdb default to enabled,
>     but that
>     > required further discussion (and maybe another round of benchmarks).
> 
>     Perhaps, that's unrelated to this thread though.  I am not sure that
>     all users would be ready to pay the extra cost of checksums enabled by
>     default.
> 
> 
> I'd be a strong +1 for changing the default once we have a painless way
> to turn them off.
> 
> It remains super-cheap to turn them off (stop database, one command,
> turn them on). So those people that aren't willing to pay the overhead
> of checksums, can very cheaply get away from it.
> 
> It's a lot more expensive to turn them on once your database has grown
> to some size (definitely in offline mode, but also in an online mode
> when we get that one in).
> 
> Plus, the majority of people *should* want them on :) We don't run with
> say synchronous_commit=off by default either to make it easier on those
> that don't want to pay the overhead of full data safety :P (I know it's
> not a direct match, but you get the idea)
> 

I don't know, TBH. I agree that making the on/off change cheaper moves
us closer to 'on' by default, because users may disable it if needed. But
it's not the whole story.

If we enable checksums by default, 99% of users will have them enabled.
That means more people will actually observe data corruption cases that
went unnoticed so far. What shall we do with that? We don't have very
good answers to that (tooling, docs) and I'd say "disable checksums" is
not a particularly amazing response in this case :-(

FWIW I don't know what to do about that. We certainly can't prevent the
data corruption, but maybe we could help with fixing it (although that's
bound to be low-level work).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

