Re: index prefetching

From: Gregory Smith
Subject: Re: index prefetching
Date:
Msg-id: CAHLJuCXz62q1bpqzKNCNSzOg2LJ1c-UVpcCgxcxuNiaPykVp3w@mail.gmail.com
In reply to: index prefetching (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List: pgsql-hackers
On Thu, Jun 8, 2023 at 11:40 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> We already do prefetching for bitmap index scans, where the bitmap heap
> scan prefetches future pages based on effective_io_concurrency. I'm not
> sure why exactly was prefetching implemented only for bitmap scans

At the point Greg Stark was hacking on this, the underlying OS async I/O features were tricky to fit into PG's I/O model, and both of us did a lot of review work just to find working common ground that PG could plug into.  Linux's POSIX advisory calls (posix_fadvise) were completely different from Solaris's async model, the other OS used to validate that the feature worked, the hope being that designing against two APIs would turn out better than focusing on Linux alone.  Since that foundation was so brittle and limited, scope was narrowed to just the bitmap heap scan, which seemed to have the best return on time invested given the parts of async I/O that did and didn't scale as expected.
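
For reference, the Linux side of that common ground boils down to posix_fadvise(POSIX_FADV_WILLNEED) hints.  A minimal standalone sketch of that per-block advisory (illustrative only, not PostgreSQL source; the file name is made up):

/*
 * Minimal sketch, not PG source: the kind of per-block read-ahead hint
 * the prefetch path issues on Linux.  Note posix_fadvise() returns an
 * error number directly; it does not set errno.
 */
#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <stdio.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/* Ask the kernel to start reading block blkno of fd in the background. */
static int
prefetch_block(int fd, unsigned int blkno)
{
    return posix_fadvise(fd, (off_t) blkno * BLCKSZ, BLCKSZ,
                         POSIX_FADV_WILLNEED);
}

int
main(void)
{
    int         fd = open("some_relation_segment", O_RDONLY); /* made-up path */

    if (fd < 0)
    {
        perror("open");
        return 1;
    }
    /* effective_io_concurrency is essentially a cap on hints like these */
    for (unsigned int blkno = 0; blkno < 8; blkno++)
    {
        int         rc = prefetch_block(fd, blkno);

        if (rc != 0)
            fprintf(stderr, "posix_fadvise: error %d\n", rc);
    }
    return 0;
}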

As I remember it, the idea was to get the basic feature out the door and gather feedback about things like whether the effective_io_concurrency knob worked as expected before moving on to other types of prefetching.  Then that got lost in filesystem upheaval land, with so much drama around Solaris/ZFS and Oracle's btrfs work.  I think it's just that no one ever got back to it.

I have all of the workloads I use for testing automated in pgbench-tools now, and this change would be easy to fit into testing there, as I'm very heavy on block I/O tests.  To get PG to reach full read speed on newer storage I've had to do some strange tests, like index range scans that touch 25+ pages.  Here's that one as a pgbench script:

-- :range rows come back per query; it grows with -Dmultiplier.
\set range 67 * (:multiplier + 1)
-- pgbench_accounts holds 100000 rows per scale factor unit.
\set limit 100000 * :scale
-- Keep the scan from running off the end of the table.
\set limit :limit - :range
\set aid random(1, :limit)
SELECT aid,abalance FROM pgbench_accounts WHERE aid >= :aid ORDER BY aid LIMIT :range;

And then you use '-Dmultiplier=10' or such to crank it up.  With the database at 4X RAM, multiplier=25 with 16 clients is my starting point when I want to saturate storage.  Anything that lets me bring those numbers down would be valuable.
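
For concreteness, a full invocation against a standard pgbench database looks something like this (the script file name, run length, and database name are placeholders, not anything canonical):

pgbench -n -f index-range.sql -Dmultiplier=25 -c 16 -j 8 -T 300 pgbench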

--
Greg Smith  greg.smith@crunchydata.com
Director of Open Source Strategy
