Re: error context for vacuum to include block number

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: error context for vacuum to include block number
Дата
Msg-id 20191211165425.4ewww2s5k5cafi4l@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: error context for vacuum to include block number  (Justin Pryzby <pryzby@telsasoft.com>)
Ответы Re: error context for vacuum to include block number  (Justin Pryzby <pryzby@telsasoft.com>)
Список pgsql-hackers
Hi,

Thanks for working on this!

On 2019-12-11 08:36:48 -0600, Justin Pryzby wrote:
> On Wed, Dec 11, 2019 at 09:15:07PM +0900, Michael Paquier wrote:
> > On Fri, Dec 06, 2019 at 10:23:25AM -0600, Justin Pryzby wrote:
> > > Find attached updated patch:
> > >  . Use structure to include relation name.
> > >  . Split into a separate patch rename of "StringInfoData buf".
> > > 
> > > 2019-11-27 20:04:53.640 CST [14244] ERROR:  canceling statement due to statement timeout
> > > 2019-11-27 20:04:53.640 CST [14244] CONTEXT:  block 2314 of relation t
> > > 2019-11-27 20:04:53.640 CST [14244] STATEMENT:  vacuum t;
> > > 
> > > I tried to use BufferGetTag() to avoid using a 2ndary structure, but fails if
> > > the buffer is not pinned.

The tag will not add all that informative details, because the
relfilenode isn't easily mappable to the table name or such.


> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
> index 043ebb4..9376989 100644
> --- a/src/backend/access/heap/vacuumlazy.c
> +++ b/src/backend/access/heap/vacuumlazy.c
> @@ -138,6 +138,12 @@ typedef struct LVRelStats
>      bool        lock_waiter_detected;
>  } LVRelStats;
>  
> +typedef struct
> +{
> +    char *relname;
> +    char *relnamespace;
> +    BlockNumber blkno;
> +} vacuum_error_callback_arg;

Hm, wonder if could be worthwhile to not use a separate struct here, but
instead extend one of the existing structs to contain the necessary
information. Or perhaps have one new struct with all the necessary
information. There's already quite a few places that do
get_namespace_name(), for example.



> @@ -524,6 +531,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
>          PROGRESS_VACUUM_MAX_DEAD_TUPLES
>      };
>      int64        initprog_val[3];
> +    ErrorContextCallback errcallback;
> +    vacuum_error_callback_arg cbarg;

Not a fan of "cbarg", too generic.

>      pg_rusage_init(&ru0);
>  
> @@ -635,6 +644,15 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
>      else
>          skipping_blocks = false;
>  
> +    /* Setup error traceback support for ereport() */
> +    errcallback.callback = vacuum_error_callback;
> +    cbarg.relname = relname;
> +    cbarg.relnamespace = get_namespace_name(RelationGetNamespace(onerel));
> +    cbarg.blkno = 0; /* Not known yet */
> +    errcallback.arg = (void *) &cbarg;
> +    errcallback.previous = error_context_stack;
> +    error_context_stack = &errcallback;
> +
>      for (blkno = 0; blkno < nblocks; blkno++)
>      {
>          Buffer        buf;
> @@ -658,6 +676,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
>  
>          pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
>  
> +        cbarg.blkno = blkno;
> +
>          if (blkno == next_unskippable_block)
>          {
>              /* Time to advance next_unskippable_block */
> @@ -817,7 +837,6 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
>  
>          buf = ReadBufferExtended(onerel, MAIN_FORKNUM, blkno,
>                                   RBM_NORMAL, vac_strategy);
> -
>          /* We need buffer cleanup lock so that we can prune HOT chains. */
>          if (!ConditionalLockBufferForCleanup(buf))
>          {
> @@ -1388,6 +1407,9 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
>              RecordPageWithFreeSpace(onerel, blkno, freespace);
>      }
>  
> +    /* Pop the error context stack */
> +    error_context_stack = errcallback.previous;
> +
>      /* report that everything is scanned and vacuumed */
>      pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
>  
> @@ -2354,3 +2376,15 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
>  
>      return all_visible;
>  }

I think this will misattribute errors that happen when in the:
        /*
         * If we are close to overrunning the available space for dead-tuple
         * TIDs, pause and do a cycle of vacuuming before we tackle this page.
         */
section of lazy_scan_heap(). That will

a) scan the index, during which we presumably don't want the same error
   context, as it'd be quite misleading: The block that was just scanned
   in the loop isn't actually likely to be the culprit for an index
   problem. And we'd not mention the fact that the problem is occurring
   in the index.
b) will report the wrong block, when in lazy_vacuum_heap().

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #16059: Tab-completion of filenames in COPY commands removes required quotes
Следующее
От: Andres Freund
Дата:
Сообщение: Re: global / super barriers (for checksums)