Re: Online enabling of checksums

From Magnus Hagander
Subject Re: Online enabling of checksums
Date
Msg-id CABUevEw0UjNRTyqz-1K79BY0U-_9M-ec2ujXuLOPwq2tQ9RGag@mail.gmail.com
In reply to Re: Online enabling of checksums  (Andres Freund <andres@anarazel.de>)
Responses Re: Online enabling of checksums  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers


On Sat, Feb 24, 2018 at 11:06 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2018-02-24 22:56:57 +0100, Magnus Hagander wrote:
> On Sat, Feb 24, 2018 at 10:49 PM, Andres Freund <andres@anarazel.de> wrote:
> > > We did consider doing it at a per-table basis as well. But this is also an
> > > overhead that has to be paid forever, whereas the risk of having to read
> > > the database files more than once (because it'd only have to read them on
> > > the second pass, not write anything) is a one-off operation. And for all
> > > those that have initialized with checksums in the first place don't have to
> > > pay any overhead at all in the current design.
> >
> > Why does it have to be paid forever?
> >
>
> The size of the pg_class row would be there forever. Granted, it's not that
> big an overhead given that there are already plenty of columns there. But
> the point being you can never remove that column, and it will be there for
> users who never even considered running without checksums. It's certainly
> not a large overhead, but it's also not zero.

But it can be removed in the next major version, if we decide it's a
good idea? We're not bound on compatibility for catalog layout.

Sure.

But we can also *add* it in the next major version, if we decide it's a good idea?


FWIW, there's some padding space available where we currently could
store two booleans without any space overhead. Also, if we decide that
the boolean columns (which don't matter much in comparison to the rest
of the data, particularly relname) are a problem, we can compress them
into a bitfield.
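
To make the bitfield idea concrete, here is a minimal sketch in C of packing several boolean flags into one byte. All names in it (rel_bools, rel_flags, the REL_FLAG_* bits and the helper functions) are hypothetical and chosen for illustration only; they are not the actual pg_class layout or any PostgreSQL API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; illustrative names, not real pg_class columns. */
#define REL_FLAG_HAS_INDEX      ((uint8_t) (1u << 0))
#define REL_FLAG_IS_SHARED      ((uint8_t) (1u << 1))
#define REL_FLAG_CHECKSUMS_DONE ((uint8_t) (1u << 2))

/* One byte per flag: the "separate boolean columns" layout. */
struct rel_bools
{
    bool has_index;
    bool is_shared;
    bool checksums_done;
};

/* All flags share a single byte: the "bitfield" layout. */
struct rel_flags
{
    uint8_t flags;
};

/* Turn one flag on. */
static inline void flag_set(struct rel_flags *r, uint8_t bit)
{
    r->flags |= bit;
}

/* Turn one flag off. */
static inline void flag_clear(struct rel_flags *r, uint8_t bit)
{
    r->flags &= (uint8_t) ~bit;
}

/* Report whether one flag is on. */
static inline bool flag_test(const struct rel_flags *r, uint8_t bit)
{
    return (r->flags & bit) != 0;
}

/* Small self-check showing the intended usage. */
static void rel_flags_example(void)
{
    struct rel_flags r = { 0 };

    flag_set(&r, REL_FLAG_CHECKSUMS_DONE);
    assert(flag_test(&r, REL_FLAG_CHECKSUMS_DONE));
    assert(!flag_test(&r, REL_FLAG_HAS_INDEX));

    flag_clear(&r, REL_FLAG_CHECKSUMS_DONE);
    assert(r.flags == 0);
}
```

In this sketch an additional per-relation flag costs one bit rather than one byte, which is the trade-off being suggested: slightly more work to read and write a flag in exchange for a smaller row.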

I don't think this is a valid reason for not supporting
interruptibility. You can make fair arguments about adding incremental
support incrementally and whatnot, but the catalog size argument doesn't
seem part of a valid argument.

Fair enough.
 


> > > I very strongly doubt it's a "very noticeable operational problem". People
> > > don't restart their databases very often... Let's say it takes 2-3 weeks to
> > > complete a run in a fairly large database. How many such large databases
> > > actually restart that frequently? I'm not sure I know of any. And the only
> > > effect of it is you have to start the process over (but read-only for the
> > > part you have already done). It's certainly not ideal, but I don't agree
> > > it's in any form a "very noticeable problem".
> >
> > I definitely know large databases that fail over more frequently than
> > that.
> >
>
> I would argue that they have bigger issues than enabling checksums... By
> far.

In one case it's intentional, to make sure the overall system copes. Not
that insane.

That I can understand. But in a scenario like that, you can also stop doing that for the period of time when you're rebuilding checksums, if re-reading the database over and over again is an issue.

Note, I'm not saying it wouldn't be nice to have the incremental functionality. I'm just saying it's not needed in a first version.

--
