Re: track needed attributes in plan nodes for executor use

From: Amit Langote
Subject: Re: track needed attributes in plan nodes for executor use
Date:
Message-ID: CA+HiwqHc1EJ1_LSu491fmXS9CqrFmGOKh2Z7udYBH19zBTLKLA@mail.gmail.com
In response to: Re: track needed attributes in plan nodes for executor use  (Japin Li <japinli@hotmail.com>)
List: pgsql-hackers

On Fri, Jul 11, 2025 at 6:58 PM Japin Li <japinli@hotmail.com> wrote:
> On Fri, 11 Jul 2025 at 17:16, Amit Langote <amitlangote09@gmail.com> wrote:
> > Hi,
> >
> > I’ve been experimenting with an optimization that reduces executor
> > overhead by avoiding unnecessary attribute deformation. Specifically,
> > if the executor knows which attributes are actually needed by a plan
> > node’s targetlist and qual, it can skip deforming unused columns
> > entirely.
> >
> > In a proof-of-concept patch, I initially computed the needed
> > attributes during ExecInitSeqScan by walking the plan’s qual and
> > targetlist to support deforming only what’s needed when evaluating
> > expressions in ExecSeqScan() or the variant thereof (I started with
> > SeqScan to keep the initial patch minimal). However, adding more work
> > to ExecInit* adds to executor startup cost, which we should generally
> > try to reduce. It also makes it harder to apply the optimization
> > uniformly across plan types.
> >
> > I’d now like to propose computing the needed attributes at planning
> > time instead. This can be done at the bottom of create_plan_recurse,
> > after the plan node has been constructed. A small helper like
> > record_needed_attrs(plan) can walk the node’s targetlist and qual
> > using pull_varattnos() and store the result in a new Bitmapset
> > *attr_used field in the Plan struct. System attributes returned by
> > pull_varattnos() can be filtered out during this step, since they're
> > either not relevant to deformation or not performance sensitive.
> >
> > This also lays the groundwork for a related executor-side optimization
> > that David Rowley suggested to me off-list. The idea is to remember,
> > in the TupleDesc, either the attribute number or the byte offset of
> > the first variable-length attribute. Then, if the minimum required
> > attribute (as provided by attr_used) lies before that, the executor
> > can safely jump directly to it using the cached offset, rather than
> > starting deformation from attno 0 as it currently does. That avoids
> > walking through fixed-length attributes that aren't needed --
> > specifically, skipping per-attribute alignment, null checking, and
> > offset tracking for unused columns -- which reduces CPU work and
> > avoids loading irrelevant tuple bytes into cache.
> >
> > With both patches in place, heap tuple deforming can skip over unused
> > attributes entirely. For example, on a 30-column table where the first
> > 15 columns are fixed-width, the query:
> >
> > select sum(a_1) from foo where a_10 = $1;
> >
> > which references only two fixed-width columns, ran nearly 2x faster
> > with the optimization in place (with heap pages prewarmed into
> > shared_buffers).
> >
> > In more complex plans, for example those involving a Sort or Join
> > between the scan and aggregation, the CPU cost of the intermediate
> > node may dominate, making deforming-related savings at the top less
> > visible in overall performance. Still, I don't think that's a reason
> > to avoid enabling this optimization more broadly across plan nodes.
> >
> > I'll post the PoC patches and performance measurements. Posting this
> > in advance to get feedback on the proposed direction and where best to
> > place attr_used.
> >
>
> That's interesting. If I understand correctly, this approach wouldn't work if
> the first attribute is variable-length, right?

That is correct.
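
As a rough illustration of why (an editorial sketch, not code from the PoC
patches): the jump relies on byte offsets that are constant across tuples,
which only holds for the fixed-width columns before the first variable-length
one. ToyTupleDesc, toy_desc_finalize() and deform_start() below are
hypothetical names, attr_used is modelled as a plain 32-bit mask rather than
a Bitmapset, and the sketch assumes the leading fixed-width columns contain
no NULLs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ATTRS 32

/* Simplified stand-in for a tuple descriptor; NOT PostgreSQL's TupleDesc. */
typedef struct ToyTupleDesc
{
    int natts;
    int attlen[MAX_ATTRS];     /* > 0: fixed width in bytes; -1: variable length */
    int attalign[MAX_ATTRS];   /* required alignment in bytes (power of two) */
    int first_varlen;          /* 0-based index of first varlen attr, or natts */
    int cached_off[MAX_ATTRS]; /* valid only for indexes < first_varlen */
} ToyTupleDesc;

/* Round off up to the next multiple of align (align must be a power of two). */
static int
align_up(int off, int align)
{
    return (off + align - 1) & ~(align - 1);
}

/* Cache offsets for the fixed-width prefix and remember where it ends. */
static void
toy_desc_finalize(ToyTupleDesc *desc)
{
    int off = 0;

    desc->first_varlen = desc->natts;
    for (int i = 0; i < desc->natts; i++)
    {
        if (desc->attlen[i] < 0)
        {
            /* Offsets from here on depend on each tuple's data. */
            desc->first_varlen = i;
            break;
        }
        off = align_up(off, desc->attalign[i]);
        desc->cached_off[i] = off;
        off += desc->attlen[i];
    }
}

/* Index of the lowest needed attribute in the mask, or -1 if none is set. */
static int
min_needed_attr(uint32_t attr_used)
{
    for (int i = 0; i < 32; i++)
    {
        if (attr_used & (UINT32_C(1) << i))
            return i;
    }
    return -1;
}

/*
 * Decide where deforming may start.  Returns true and sets the start
 * attribute and byte offset when the lowest needed attribute lies before
 * the first variable-length one; otherwise falls back to attribute 0,
 * offset 0.
 */
static bool
deform_start(const ToyTupleDesc *desc, uint32_t attr_used,
             int *start_attr, int *start_off)
{
    int first = min_needed_attr(attr_used);

    assert(first < desc->natts);

    *start_attr = 0;
    *start_off = 0;
    if (first < 0 || first >= desc->first_varlen)
        return false;

    *start_attr = first;
    *start_off = desc->cached_off[first];
    return true;
}
```

With attlen = {4, 8, 4, -1, 4} and matching alignments, a node that needs
only the third column (index 2) can start at byte offset 16. If the first
column is variable-length, first_varlen is 0 and deform_start() always falls
back to starting from the first attribute.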

--
Thanks, Amit Langote


