Re: Streaming read-ready sequential scan code

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: Streaming read-ready sequential scan code
Дата
Msg-id CAAKRu_Z4hjfHvqomhjr0McYPYbtw7iW-5-gAyFCbLOfzTCOrbQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Streaming read-ready sequential scan code  (Thomas Munro <thomas.munro@gmail.com>)
Ответы Re: Streaming read-ready sequential scan code  (Thomas Munro <thomas.munro@gmail.com>)
Список pgsql-hackers
On Fri, Apr 5, 2024 at 12:15 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> Yeah, I plead benchmarking myopia, sorry.  The fastpath as committed
> is only reached when distance goes 2->1, as pg_prewarm does.  Oops.
> With the attached minor rearrangement, it works fine.  I also poked
> some more at that memory prefetcher.  Here are the numbers I got on a
> desktop system (Intel i9-9900 @ 3.1GHz, Linux 6.1, turbo disabled,
> cpufreq governor=performance, 2MB huge pages, SB=8GB, consumer NMVe,
> GCC -O3).
>
> create table t (i int, filler text) with (fillfactor=10);
> insert into t
> select g, repeat('x', 900) from generate_series(1, 560000) g;
> vacuum freeze t;
> set max_parallel_workers_per_gather = 0;
>
> select count(*) from t;
>
> cold = must be read from actual disk (Linux drop_caches)
> warm = read from linux page cache
> hot = already in pg cache via pg_prewarm
>
>                                     cold   warm    hot
> master                            2479ms  886ms  200ms
> seqscan                           2498ms  716ms  211ms <-- regression
> seqscan + fastpath                2493ms  711ms  200ms <-- fixed, I think?
> seqscan + memprefetch             2499ms  716ms  182ms
> seqscan + fastpath + memprefetch  2505ms  710ms  170ms <-- \O/
>
> Cold has no difference.  That's just my disk demonstrating Linux RA at
> 128kB (default); random I/O is obviously a more interesting story.
> It's consistently a smidgen faster with Linux RA set to 2MB (as in
> blockdev --setra 4096 /dev/nvmeXXX), and I believe this effect
> probably also increases on fancier faster storage than what I have on
> hand:
>
>                                     cold
> master                            1775ms
> seqscan + fastpath + memprefetch  1700ms
>
> Warm is faster as expected (fewer system calls schlepping data
> kernel->userspace).
>
> The interesting column is hot.  The 200ms->211ms regression is due to
> the extra bookkeeping in the slow path.  The rejiggered fastpath code
> fixes it for me, or maybe sometimes shows an extra 1ms.  Phew.  Can
> you reproduce that?

I am able to reproduce the fast path solving the issue using Heikki's
example here [1] but in shared buffers (hot).

master:                           25 ms
stream read:                   29 ms
stream read + fast path: 25 ms

I haven't looked into or reviewed the memory prefetching part.

While reviewing 0002, I realized that I don't quite see how
read_stream_get_block() will be used in the fastpath -- which it
claims in its comments.
read_stream_next_buffer() is the only caller of
read_stream_look_ahead()->read_stream_get_block(), and if fast_path is
true, read_stream_next_buffer() always returns before calling
read_stream_look_ahead(). Maybe I am missing something. I see
fast_path uses read_stream_fill_blocknums() to invoke the callback.

Oh and why does READ_STREAM_DISABLE_FAST_PATH macro exist?

Otherwise 0002 looks good to me.

I haven't reviewed 0003 or 0004. I attached a new version (v11)
because I noticed an outdated comment in my seq scan streaming read
user patch (0001). The other patches in the set are untouched from
your versions besides adding author/reviewer info in commit message
for 0002.

- Melanie

[1] https://www.postgresql.org/message-id/3b0f3701-addd-4629-9257-cf28e1a6e6a1%40iki.fi

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: LogwrtResult contended spinlock
Следующее
От: "Regina Obe"
Дата:
Сообщение: RE: Can't compile PG 17 (master) from git under Msys2 autoconf