Re: Sync scan & regression tests

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Sync scan & regression tests
Дата
Msg-id 6f991389-ae22-d844-a9d8-9aceb7c01a9a@iki.fi
обсуждение исходный текст
Ответ на Re: Sync scan & regression tests  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Sync scan & regression tests  (David Rowley <dgrowleyml@gmail.com>)
Re: Sync scan & regression tests  (Heikki Linnakangas <hlinnaka@iki.fi>)
Список pgsql-hackers
(noticed this thread just now)

On 07/08/2023 03:55, Tom Lane wrote:
> Having said that ... I just noticed that chipmunk, which I'd been
> ignoring because it had been having configuration-related failures
> ever since it came back to life about three months ago, has gotten
> past those problems

Yes, I finally got around to fix the configuration...

> and is now failing with what looks suspiciously like syncscan
> problems:
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk&dt=2023-08-06%2008%3A20%3A21
> 
> This is possibly explained by the fact that it uses (per its
> extra_config)
>                    'shared_buffers = 10MB',
> although it's done that for a long time and portals.out hasn't changed
> since before chipmunk's latest success.  Perhaps some change in an
> earlier test script affected this?

I changed the config yesterday to 'shared_buffers = 20MB', before seeing 
this thread. If we expect the regression tests to work with 
'shared_buffers=10MB' - and I think that would be nice - I'll change it 
back. But let's see what happens with 'shared_buffers=20MB'. It will 
take a few days for the tests to complete.

> I think it'd be worth running to ground exactly what is causing these
> failures and when they started.

I bisected it to this commit:

commit 7ae0ab0ad9704b10400a611a9af56c63304c27a5
Author: David Rowley <drowley@postgresql.org>
Date:   Fri Feb 3 16:20:43 2023 +1300

     Reduce code duplication between heapgettup and heapgettup_pagemode

     The code to get the next block number was exactly the same between 
these
     two functions, so let's just put it into a helper function and call 
that
     from both locations.

     Author: Melanie Plageman
     Reviewed-by: Andres Freund, David Rowley
     Discussion: 
https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com

 From that description, that was supposed to be just code refactoring, 
with no change in behavior.

Looking the new heapgettup_advance_block() function and the code that it 
replaced, it's now skipping this ss_report_location() on the last call, 
when it has reached the end of the scan:

> 
>     /*
>      * Report our new scan position for synchronization purposes. We
>      * don't do that when moving backwards, however. That would just
>      * mess up any other forward-moving scanners.
>      *
>      * Note: we do this before checking for end of scan so that the
>      * final state of the position hint is back at the start of the
>      * rel.  That's not strictly necessary, but otherwise when you run
>      * the same query multiple times the starting position would shift
>      * a little bit backwards on every invocation, which is confusing.
>      * We don't guarantee any specific ordering in general, though.
>      */
>     if (scan->rs_base.rs_flags & SO_ALLOW_SYNC)
>         ss_report_location(scan->rs_base.rs_rd, block);

The comment explicitly says that we do that before checking for 
end-of-scan, but that commit moved it to after. I propose the attached 
to move it back to before the end-of-scan checks.

Melanie, David, any thoughts?

> But assuming that they are triggered by syncscan, my first thought
> about dealing with it is to disable syncscan within the affected
> tests.  However ... do we exercise the syncscan logic at all within
> the regression tests, normally?  Is there a coverage issue to be
> dealt with?
Good question..

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Prevent psql \watch from running queries that return no rows
Следующее
От: Richard Guo
Дата:
Сообщение: Re: Support run-time partition pruning for hash join