Re: Some read stream improvements
From | Thomas Munro |
---|---|
Subject | Re: Some read stream improvements |
Date | |
Msg-id | CA+hUKG+MvgpRRdxq5GgB=TdDHhAyyimp75raGX7siBO6=VpLdA@mail.gmail.com |
In response to | Re: Some read stream improvements (Andres Freund <andres@anarazel.de>) |
Responses | Re: Some read stream improvements |
List | pgsql-hackers |
On Thu, Feb 27, 2025 at 11:20 PM Andres Freund <andres@anarazel.de> wrote:
> On 2025-02-27 11:19:55 +1300, Thomas Munro wrote:
> > On Wed, Feb 26, 2025 at 10:55 PM Andres Freund <andres@anarazel.de> wrote:
> > > I was working on expanding tests for AIO and as part of that wrote a test for
> > > temp tables -- our coverage is fairly awful, there were many times during AIO
> > > development where I knew I had trivially reachable temp table specific bugs
> > > but all tests passed.
> > >
> > > The test for that does trigger the problem described above and is fixed by the
> > > patches in this thread (which I included in the other thread):

Here is a subset of those patches again:

1. Per-backend buffer limit, take III. Now the check is in
read_stream_start_pending_read() so TOC == TOU.

Annoyingly, test cases like the one below still fail, despite
following the rules. The other streams eat all the buffers and then
one gets an allowance of zero, but uses its right to take one pin
anyway to make progress, and there isn't one. I wonder if we should
use temp_buffers - 100? Then leave the minimum GUC value at 100
still, so you have an easy way to test with 0, 1, ... additional
buffers?

2. It shouldn't give up issuing random advice immediately after a
jump, or it could stall on (say) the second 128kB of a 256kB
sequential chunk (ie the strace you showed on the BHS thread). It
only makes sense to assume kernel readahead takes over once you've
actually *read* sequentially. In practice this makes it a lot more
aggressive about advice (like the BHS code in master): it only gives
up if the whole look-ahead window is sequential.

3. Change the distance algorithm to care only about hits and misses,
not sequential heuristics. It made at least some sense before, but it
doesn't make sense for AIO, and even in synchronous mode it means that
you hit random jumps with insufficient look-ahead, so I don't think we
should keep it.
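As a rough illustration of item 3, here is a toy model of a distance rule that reacts only to hits and misses. It assumes the look-ahead distance doubles on a miss and drops by one on a hit; the names `next_distance` and `simulate` and the cap of 16 are made up for this sketch and are not taken from read_stream.c.

```python
def next_distance(distance, hit, max_distance=16):
    """Hypothetical rule: decrement on a cache hit, double on a miss."""
    if hit:
        # Never go below 1: the stream must always look at least one block ahead.
        return max(1, distance - 1)
    # Grow quickly on misses, but clamp to an assumed maximum.
    return min(max_distance, distance * 2)


def simulate(pattern, start=1, max_distance=16):
    """Return the distance after each access.

    pattern is a string of 'h' (hit) and 'm' (miss) characters.
    """
    distances = []
    distance = start
    for access in pattern:
        distance = next_distance(distance, access == "h", max_distance)
        distances.append(distance)
    return distances


if __name__ == "__main__":
    print(simulate("mmmm"))    # a run of misses ramps the distance up
    print(simulate("hmhmhm"))  # strict alternation never gets past 2
```

Under these assumptions a run of misses ramps up geometrically, while a strictly alternating pattern only ever bounces between 1 and 2, because each decrement undoes the previous doubling.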
I also realised that the sequential heuristics are confused by that
hidden trailing block thing, so in contrived pattern testing,
hit-miss-hit-miss... would be considered sequential, and even if you
fix that (the forwarding patches above fix that), an exact
hit-miss-hit-miss pattern also gets stuck between distances 1 and 2
(double, decrement, double, ...). It might be worth waiting a bit
longer before decrementing, IDK.

I'll rebase the others and post soon.

set io_combine_limit = 32;
set temp_buffers = 100;

create temp table t1 as select generate_series(1, 10000);
create temp table t2 as select generate_series(1, 10000);
create temp table t3 as select generate_series(1, 10000);
create temp table t4 as select generate_series(1, 10000);
create temp table t5 as select generate_series(1, 10000);

do
$$
declare
  c1 cursor for select * from t1;
  c2 cursor for select * from t2;
  c3 cursor for select * from t3;
  c4 cursor for select * from t4;
  c5 cursor for select * from t5;
  x record;
begin
  open c1;
  open c2;
  open c3;
  open c4;
  open c5;
  loop
    fetch next from c1 into x;
    exit when not found;
    fetch next from c2 into x;
    exit when not found;
    fetch next from c3 into x;
    exit when not found;
    fetch next from c4 into x;
    exit when not found;
    fetch next from c5 into x;
    exit when not found;
  end loop;
end;
$$;
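To see why five interleaved streams can still fail with temp_buffers = 100, here is a toy model of the pin accounting described in item 1. Everything in it is an assumption made for illustration: the function name `run_streams`, the idea that each stream greedily wants 32 pins (one io_combine_limit-sized read), and the `reserve` parameter, which is only one possible reading of the "temp_buffers - 100" suggestion. It does not reproduce the real bufmgr or read stream code.

```python
def run_streams(temp_buffers, n_streams, wanted_per_stream, reserve=0):
    """Toy model of per-backend pin limits for concurrent read streams.

    Returns the number of pins each stream ends up holding, or raises
    RuntimeError if a stream cannot even get the single pin it needs.
    """
    pinned = []
    # Assumed soft cap on look-ahead pins across all streams.
    limit = max(0, temp_buffers - reserve)
    for _ in range(n_streams):
        in_use = sum(pinned)
        allowance = max(0, limit - in_use)
        take = min(wanted_per_stream, allowance)
        if take == 0:
            # A stream may always take one pin to make progress,
            # but only if a buffer is physically free.
            if in_use >= temp_buffers:
                raise RuntimeError("no free local buffer for the forced pin")
            take = 1
        pinned.append(take)
    return pinned


if __name__ == "__main__":
    # With a reserve, every stream can still get its one forced pin.
    print(run_streams(200, 5, 32, reserve=100))
    try:
        run_streams(100, 5, 32)
    except RuntimeError as err:
        print("fifth stream fails:", err)
```

In this model the first four streams take 32, 32, 32 and 4 pins, so the fifth gets an allowance of zero and finds no free buffer for its forced pin. Holding some buffers back from the look-ahead allowance, as the `reserve` parameter does here, leaves room for that last pin.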