Обсуждение: Re: pgsql: Use pre-fetching for ANALYZE
Hi,
On 2021-03-16 18:48:08 +0000, Stephen Frost wrote:
> Use pre-fetching for ANALYZE
>
> When we have posix_fadvise() available, we can improve the performance
> of an ANALYZE by quite a bit by using it to inform the kernel of the
> blocks that we're going to be asking for. Similar to bitmap index
> scans, the number of buffers pre-fetched is based off of the
> maintenance_io_concurrency setting (for the particular tablespace or,
> if not set, globally, via get_tablespace_maintenance_io_concurrency()).
I just looked at this as part of debugging a crash / hang in the AIO patch.
The code does:
block_accepted = table_scan_analyze_next_block(scan, targblock, vac_strategy);
#ifdef USE_PREFETCH
/*
* When pre-fetching, after we get a block, tell the kernel about the
* next one we will want, if there's any left.
*
* We want to do this even if the table_scan_analyze_next_block() call
* above decides against analyzing the block it picked.
*/
if (prefetch_maximum && prefetch_targblock != InvalidBlockNumber)
PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, prefetch_targblock);
#endif
I.e. we lock a buffer and *then* we prefetch another buffer. That seems like a
quite bad idea to me. Why are we doing IO while holding a content lock, if we
can avoid it?
Greetings,
Andres Freund
Hi, On 2022-06-02 19:30:16 -0700, Andres Freund wrote: > On 2021-03-16 18:48:08 +0000, Stephen Frost wrote: > > Use pre-fetching for ANALYZE > > > > When we have posix_fadvise() available, we can improve the performance > > of an ANALYZE by quite a bit by using it to inform the kernel of the > > blocks that we're going to be asking for. Similar to bitmap index > > scans, the number of buffers pre-fetched is based off of the > > maintenance_io_concurrency setting (for the particular tablespace or, > > if not set, globally, via get_tablespace_maintenance_io_concurrency()). > > I just looked at this as part of debugging a crash / hang in the AIO patch. > > The code does: > > block_accepted = table_scan_analyze_next_block(scan, targblock, vac_strategy); > > #ifdef USE_PREFETCH > > /* > * When pre-fetching, after we get a block, tell the kernel about the > * next one we will want, if there's any left. > * > * We want to do this even if the table_scan_analyze_next_block() call > * above decides against analyzing the block it picked. > */ > if (prefetch_maximum && prefetch_targblock != InvalidBlockNumber) > PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, prefetch_targblock); > #endif > > I.e. we lock a buffer and *then* we prefetch another buffer. That seems like a > quite bad idea to me. Why are we doing IO while holding a content lock, if we > can avoid it? It also seems decidedly not great from a layering POV to do the IO in analyze.c. There's no guarantee that the tableam maps blocks in a way that's compatible with PrefetchBuffer(). Yes, the bitmap heap scan code does something similar, but a) that is opt in by the AM, b) there's a comment saying it's quite crufty and should be fixed. Greetings, Andres Freund
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2022-06-02 19:30:16 -0700, Andres Freund wrote: > > On 2021-03-16 18:48:08 +0000, Stephen Frost wrote: > > > Use pre-fetching for ANALYZE > > > > > > When we have posix_fadvise() available, we can improve the performance > > > of an ANALYZE by quite a bit by using it to inform the kernel of the > > > blocks that we're going to be asking for. Similar to bitmap index > > > scans, the number of buffers pre-fetched is based off of the > > > maintenance_io_concurrency setting (for the particular tablespace or, > > > if not set, globally, via get_tablespace_maintenance_io_concurrency()). > > > > I just looked at this as part of debugging a crash / hang in the AIO patch. > > > > The code does: > > > > block_accepted = table_scan_analyze_next_block(scan, targblock, vac_strategy); > > > > #ifdef USE_PREFETCH > > > > /* > > * When pre-fetching, after we get a block, tell the kernel about the > > * next one we will want, if there's any left. > > * > > * We want to do this even if the table_scan_analyze_next_block() call > > * above decides against analyzing the block it picked. > > */ > > if (prefetch_maximum && prefetch_targblock != InvalidBlockNumber) > > PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, prefetch_targblock); > > #endif > > > > I.e. we lock a buffer and *then* we prefetch another buffer. That seems like a > > quite bad idea to me. Why are we doing IO while holding a content lock, if we > > can avoid it? At the end, we're doing a posix_fadvise() which is a kernel call but hopefully wouldn't do actual IO when we call it. Still, agreed that it'd be better to do that without holding locks and no objection to making such a change. > It also seems decidedly not great from a layering POV to do the IO in > analyze.c. There's no guarantee that the tableam maps blocks in a way that's > compatible with PrefetchBuffer(). Yes, the bitmap heap scan code does > something similar, but a) that is opt in by the AM, b) there's a comment > saying it's quite crufty and should be fixed. Certainly open to suggestions. Are you thinking it'd make sense to add a 'prefetch_block' method to TableAmRoutine? Or did you have another thought? Thanks! Stephen