Re: error context for vacuum to include block number

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: error context for vacuum to include block number
Дата
Msg-id 20200217035731.GL31889@telsasoft.com
обсуждение исходный текст
Ответ на Re: error context for vacuum to include block number  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Ответы Re: error context for vacuum to include block number
Список pgsql-hackers
On Mon, Feb 17, 2020 at 10:47:47AM +0900, Masahiko Sawada wrote:
> Thank you for updating the patch!
> 
> 1.
> The above lines need a new line.

Done, thanks.

> 2.
> In lazy_vacuum_heap, we set the error context and then call
> pg_rusage_init whereas lazy_vacuum_index and lazy_cleanup_index does
> the opposite. And lazy_scan_heap also call pg_rusage first. I think
> lazy_vacuum_heap should follow them for consistency. That is, we can
> set error context after pages = 0.

Right. Maybe I did it the other way because the two uses of
PROGRESS_VACUUM_PHASE_VACUUM_HEAP were right next to each other.

> 3.
> We have 2 other phases: PROGRESS_VACUUM_PHASE_TRUNCATE and
> PROGRESS_VACUUM_PHASE_FINAL_CLEANUP. I think it's better to set the
> error context in lazy_truncate_heap as well. What do you think?
> 
> I'm not sure it's worth to set the error context for FINAL_CLENAUP but
> we should add the case of FINAL_CLENAUP to vacuum_error_callback as
> no-op or explain it as a comment even if we don't.

I don't have strong feelings either way.

I looked a bit at the truncation phase.  It also truncates the FSM and VM
forks, which could be misleading if the error was in one of those files and not
the main filenode.

I'd have to find a way to test it... 
...and was pleasantly surprised to see that earlier phases don't choke if I
(re)create a fake 4GB table like:

postgres=# CREATE TABLE trunc(i int);
CREATE TABLE
postgres=# \x\t
Expanded display is on.
Tuples only is on.
postgres=# SELECT relfilenode FROM pg_class WHERE oid='trunc'::regclass;
relfilenode | 59068

postgres=# \! bash -xc 'truncate -s 1G ./pgsql13.dat111/base/12689/59068{,.{1..3}}'
+ truncate -s 1G ./pgsql13.dat111/base/12689/59074 ./pgsql13.dat111/base/12689/59074.1
./pgsql13.dat111/base/12689/59074.2./pgsql13.dat111/base/12689/59074.3
 

postgres=# \timing 
Timing is on.
postgres=# SET client_min_messages=debug; SET statement_timeout='13s'; VACUUM VERBOSE trunc;
INFO:  vacuuming "public.trunc"
INFO:  "trunc": found 0 removable, 0 nonremovable row versions in 524288 out of 524288 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2098
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
524288 pages are entirely empty.
CPU: user: 5.00 s, system: 1.50 s, elapsed: 6.52 s.
ERROR:  canceling statement due to statement timeout
CONTEXT:  while truncating relation "public.trunc" to 0 blocks

The callback surrounding RelationTruncate() seems hard to hit unless you add
CHECK_FOR_INTERRUPTS(); the same was true for index cleanup.

The truncation uses a prefetch, which is more likely to hit any lowlevel error,
so I added callback there, too.

BTW, for the index cases, I didn't like repeating the namespace here, but WDYT ?
|CONTEXT:  while vacuuming index "public.t_i_idx3" of relation "t"

Thanks for rerere-reviewing.

-- 
Justin

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kasahara Tatsuhito
Дата:
Сообщение: Re: small improvement of the elapsed time for truncating heap in vacuum
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: small improvement of the elapsed time for truncating heap in vacuum