Re: display offset along with block number in vacuum errors

From: Masahiko Sawada
Subject: Re: display offset along with block number in vacuum errors
Date:
Msg-id: CA+fd4k4HcbhPnCs7paRTw1K-AHin8y4xKomB9Ru0ATw0UeTy2w@mail.gmail.com
In response to: Re: display offset along with block number in vacuum errors  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: display offset along with block number in vacuum errors  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Tue, 18 Aug 2020 at 13:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Aug 17, 2020 at 11:38 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Sat, 15 Aug 2020 at 12:19, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > The reason why I have not included heap_page_prune related change in
> > > the patch is that I don't want to sprinkle this in every possible
> > > function (code path) called via vacuum especially if the probability
> > > of an error in that code path is low. But, I am fine if you and or
> > > others think that it is a good idea to update offset in
> > > heap_page_prune as well.
> >
> > I agree we should not sprinkle it in more places than necessary.
> >
> > Regarding heap_page_prune(), I'm a bit concerned that
> > heap_page_prune() is typically the first function to check tuple
> > visibility within the vacuum code. I've sometimes observed an error
> > with a message like "DETAIL: could not open file “pg_xact/00AB”: No
> > such file or directory" due to tuple header corruption. I suspect
> > that message was emitted while checking tuple visibility in
> > heap_page_prune(), so I guess the likelihood of an error in that code
> > path is not so low.
> >
>
> Fair point.
>
> > On the other hand, if we want to update the offset number in
> > heap_page_prune() we will need to expose some vacuum structs defined
> > as a static including LVRelStats.
> >
>
> I don't think we need to expose LVRelStats. We can just pass the
> address of vacrelstats->offset_num to achieve what we want. I have
> tried that and it works, see the
> v6-0002-additinal-error-context-information-in-heap_page_ patch
> attached. Do you see any problem with it?

Yes, you're right. I'm a bit concerned that the number of arguments
passed to heap_page_prune() might grow when we need to update other
values for errcontext, but I'm okay with the current patch.

Currently, we're in the SCAN_HEAP phase in heap_page_prune(), but
should it be VACUUM_HEAP instead?

Also, I've tested the patch with log_min_messages = 'info' and got the
following server logs:

2020-08-19 14:28:09.917 JST [72912] INFO:  launched 1 parallel vacuum
worker for index vacuuming (planned: 1)
2020-08-19 14:28:09.917 JST [72912] CONTEXT:  while scanning block 973
of relation "public.tbl"
2020-08-19 14:28:09.959 JST [72912] INFO:  scanned index "i1" to
remove 109872 row versions
2020-08-19 14:28:09.959 JST [72912] DETAIL:  CPU: user: 0.04 s,
system: 0.00 s, elapsed: 0.04 s
2020-08-19 14:28:09.959 JST [72912] CONTEXT:  while vacuuming index
"i1" of relation "public.tbl"
2020-08-19 14:28:09.967 JST [72936] INFO:  scanned index "i2" to
remove 109872 row versions by parallel vacuum worker
2020-08-19 14:28:09.967 JST [72936] DETAIL:  CPU: user: 0.03 s,
system: 0.00 s, elapsed: 0.04 s
2020-08-19 14:28:09.967 JST [72936] CONTEXT:  while vacuuming index
"i2" of relation "public.tbl"
2020-08-19 14:28:09.967 JST [72912] INFO:  scanned index "i2" to
remove 109872 row versions by parallel vacuum worker
2020-08-19 14:28:09.967 JST [72912] DETAIL:  CPU: user: 0.03 s,
system: 0.00 s, elapsed: 0.04 s
2020-08-19 14:28:09.967 JST [72912] CONTEXT:  while vacuuming index
"i2" of relation "public.tbl"
        parallel worker
        while scanning block 973 of relation "public.tbl"
2020-08-19 14:28:09.968 JST [72912] INFO:  "tbl": removed 109872 row
versions in 487 pages
2020-08-19 14:28:09.968 JST [72912] DETAIL:  CPU: user: 0.00 s,
system: 0.00 s, elapsed: 0.00 s
2020-08-19 14:28:09.968 JST [72912] CONTEXT:  while vacuuming block
973 of relation "public.tbl"
2020-08-19 14:28:09.968 JST [72912] INFO:  index "i1" now contains
110000 row versions in 578 pages
2020-08-19 14:28:09.968 JST [72912] DETAIL:  109872 index row versions
were removed.
        0 index pages have been deleted, 0 are currently reusable.
        CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
2020-08-19 14:28:09.968 JST [72912] CONTEXT:  while scanning block 973
of relation "public.tbl"
2020-08-19 14:28:09.968 JST [72912] INFO:  index "i2" now contains
110000 row versions in 578 pages
2020-08-19 14:28:09.968 JST [72912] DETAIL:  109872 index row versions
were removed.
        0 index pages have been deleted, 0 are currently reusable.
        CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
2020-08-19 14:28:09.968 JST [72912] CONTEXT:  while scanning block 973
of relation "public.tbl"
2020-08-19 14:28:09.969 JST [72912] INFO:  "tbl": found 110000
removable, 110000 nonremovable row versions in 974 out of 974 pages
2020-08-19 14:28:09.969 JST [72912] DETAIL:  0 dead row versions
cannot be removed yet, oldest xmin: 519
        There were 372 unused item identifiers.
        Skipped 0 pages due to buffer pins, 0 frozen pages.
        0 pages are entirely empty.
        CPU: user: 0.05 s, system: 0.00 s, elapsed: 0.06 s.
2020-08-19 14:28:09.969 JST [72912] CONTEXT:  while scanning block 973
of relation "public.tbl"

This is not directly related to the patch, but it looks like we can
improve the current errcontext settings. For instance, the messages
from lazy_vacuum_index() above report two different phases. I've
attached a patch that improves the current errcontext settings, which
can be applied before the patch adding the offset number.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments
