Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

Поиск
Список
Период
Сортировка
От David Christensen
Тема Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL
Дата
Msg-id CAOxo6X+DYF7bRLhY7vCfAvQLv7ZQnNOAfroNqrg_dgwjBdbRfA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL  (David Christensen <david.christensen@crunchydata.com>)
Список pgsql-hackers
On Mon, Apr 25, 2022 at 9:54 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Mon, Apr 25, 2022 at 10:11:10AM -0500, David Christensen wrote:
> > On Mon, Apr 25, 2022 at 1:11 AM Michael Paquier <michael@paquier.xyz> wrote:
> >> I don't think that there is any need to rely on a new logic if there
> >> is already some code in place able to do the same work.  See
> >> verify_dir_is_empty_or_create() in pg_basebackup.c, as one example,
> >> that relies on pg_check_dir().  I think that you'd better rely at
> >> least on what pgcheckdir.c offers.
> >
> > Yeah, though I am tending towards what another user had suggested and
> > just accepting an existing directory rather than requiring it to be
> > empty, so thinking I might just skip this one. (Will review the file
> > you've pointed out to see if there is a relevant function though.)
>
> OK.  FWIW, pg_check_dir() is used in initdb and pg_basebackup because
> these care about the behavior to use when specifying a target path.
> You could reuse it, but use a different policy depending on its
> returned result for the needs of what you see fit in this case.

I have a new version of the patch (pending tests) that uses
pg_check_dir's return value to handle things appropriately, so at
least some code reuse now.  It did end up simplifying a lot.

> >> +                   PageSetLSN(page, record->ReadRecPtr);
> >> +                   /* if checksum field is non-zero then we have checksums enabled,
> >> +                    * so recalculate the checksum with new LSN (yes, this is a hack)
> >> +                    */
> >> Yeah, that looks like a hack, but putting in place a page on a cluster
> >> that has checksums enabled would be more annoying with
> >> zero_damaged_pages enabled if we don't do that, so that's fine by me
> >> as-is.  Perhaps you should mention that FPWs don't have their
> >> pd_checksum updated when written.
> >
> > Can make a mention; you thinking just a comment in the code is
> > sufficient, or want there to be user-visible docs as well?
>
> I was thinking about a comment, at least.

New patch version has significantly more comments.

> > This was to make the page as extracted equivalent to the effect of
> > applying the WAL record block on replay (the LSN and checksum both);
> > since recovery calls this function to mark the LSN where the page came
> > from this is why I did this as well.  (I do see perhaps a case for
> > --raw output that doesn't munge the page whatsoever, just made
> > comparisons easier.)
>
> Hm.  Okay.  The argument goes both ways, I guess, depending on what we
> want to do with those raw pages.  Still you should not need pd_lsn if
> the point is to be able to stick the pages back in place to attempt to
> get back as much data as possible when loading it back to the shared
> buffers?

Yeah, I can see that too; I think there's at least enough of an
argument for a flag to apply the fixups or just extract only the raw
page pre-modification.  Not sure which should be the "default"
behavior; either `--raw` or `--fixup-metadata` or something could
work. (Naming suggestions welcomed.)

David



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: DBT-5 Stored Procedure Development (2022)
Следующее
От: David Christensen
Дата:
Сообщение: Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL