Re: Use streaming read API in ANALYZE

Поиск
Список
Период
Сортировка
От Nazir Bilal Yavuz
Тема Re: Use streaming read API in ANALYZE
Дата
Msg-id CAN55FZ31AoypGkAGH_kx4q7J1fVNi-1XNmvdME3Ye6KbCyJDNw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Use streaming read API in ANALYZE  (Nazir Bilal Yavuz <byavuz81@gmail.com>)
Ответы Re: Use streaming read API in ANALYZE  (Melanie Plageman <melanieplageman@gmail.com>)
Список pgsql-hackers
Hi,

On Mon, 29 Apr 2024 at 18:41, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> Hi,
>
> On Mon, 8 Apr 2024 at 04:21, Thomas Munro <thomas.munro@gmail.com> wrote:
> >
> > Pushed.  Thanks Bilal and reviewers!
>
> I wanted to discuss what will happen to this patch now that
> 27bc1772fc8 is reverted. I am continuing this thread but I can create
> another thread if you prefer so.

041b96802ef is discussed in the 'Table AM Interface Enhancements'
thread [1]. The main problems discussed about this commit is that the
read stream API is not pushed to the heap-specific code and, because
of that, the other AM implementations need to use read streams. To
push read stream API to the heap-specific code, it is pretty much
required to pass BufferAccessStrategy and BlockSamplerData to the
initscan().

I am sharing the alternative version of this patch. The first patch
just reverts 041b96802ef and the second patch is the alternative
version.

In this alternative version, the read stream API is not pushed to the
heap-specific code, but it is controlled by the heap-specific code.
The SO_USE_READ_STREAMS_IN_ANALYZE flag is introduced and set in the
heap-specific code if the scan type is 'ANALYZE'. This flag is used to
decide whether streaming API in ANALYZE will be used or not. If this
flag is set, this means heap AMs and read stream API will be used. If
it is not set, this means heap AMs will not be used and code falls
back to the version before read streams.

Pros of the alternative version:

* The existing AM implementations other than heap AM can continue to
use their AMs without any change.
* AM implementations other than heap do not need to use read streams.
* Upstream code uses the read stream API and benefits from that.

Cons of the alternative version:

* 6 if cases are added to the acquire_sample_rows() function and 3 of
them are in the while loop.
* Because of these changes, the code looks messy.

Any kind of feedback would be appreciated.

[1] https://www.postgresql.org/message-id/flat/CAPpHfdurb9ycV8udYqM%3Do0sPS66PJ4RCBM1g-bBpvzUfogY0EA%40mail.gmail.com

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Josef Šimánek
Дата:
Сообщение: Re: [PATCH] Add --syntax to postgres for SQL syntax checking
Следующее
От: Jacob Champion
Дата:
Сообщение: Re: pgsql: Fix overread in JSON parsing errors for incomplete byte sequence