Discussion: BUG #19372: Scan operator maybe output unnecessary columns to the upper-layer operators
From: PG Bug reporting form
Date:
The following bug has been logged on the website:
Bug reference: 19372
Logged by: yao jia
Email address: yaojia_0809@163.com
PostgreSQL version: 18.0
Operating system: linux centos
Description:
The scan operator may output unnecessary columns to the upper-layer
operators. This seems like redundant work because the upper-layer operators
don't need this column data at all. This undoubtedly increases execution
overhead.
Is this behavior intentionally designed this way, or is it an unintended
side effect? Should the unnecessary column output be eliminated?
Here is a specific example: columns [jname, c, d, e] are not needed for the
grouping or the final output, but they appear in the Seq Scan's output.
postgres=# create table hash_hash_jade22 (hjid int,rjid int,jname varchar, c varchar, d varchar, e varchar);
CREATE TABLE
postgres=# explain verbose select min(hjid) from hash_hash_jade22 group by rjid;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 HashAggregate  (cost=17.35..19.35 rows=200 width=8)
   Output: min(hjid), rjid
   Group Key: hash_hash_jade22.rjid
   ->  Seq Scan on public.hash_hash_jade22  (cost=0.00..14.90 rows=490 width=8)
         Output: hjid, rjid, jname, c, d, e
On Wed, 7 Jan 2026 at 21:49, PG Bug reporting form <noreply@postgresql.org> wrote:
> The scan operator may output unnecessary columns to the upper-layer
> operators. This seems like redundant work because the upper-layer operators
> don't need this column data at all. This undoubtedly increases execution
> overhead.
> Is this behavior intentionally designed this way, or is it an unintended
> side effect? Should the unnecessary column output be eliminated?

It's working as intended. This is useful for HeapAm as TupleTableSlots can carry a pointer to the tuple on the page and deform only up to the last column that's required. If you want the Seq Scan to only have the columns required at the scan level, then that requires a projection operation at the scan level. In your query, only columns 1 and 2 will be deformed from the tuple and that'll happen during the HashAggregate. No tuple deformation is required before that in your query.

Where this behaviour isn't ideal is for table AMs such as column stores. Ideally, we'd leave this up to the table AM to state if it's useful or not, but we currently don't have that ability. I'm surprised nobody has asked for that yet.

David
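
To make the "deform only up to the last column that's required" point concrete, here is a rough toy model in Python. It is not PostgreSQL code: ToySlot, pack_row, seq_scan and min_hjid_by_rjid are made-up names, and the row format (a 1-byte length followed by the bytes of each column) is an assumption chosen only so that reaching column N means walking columns 1..N-1 first. The "scan" hands up a slot that merely holds the stored row, and the "aggregate" is the first place where any column gets decoded.

```python
from dataclasses import dataclass, field


def pack_row(values):
    """Pack strings as [1-byte length][bytes] per column, in column order."""
    out = bytearray()
    for v in values:
        raw = v.encode("utf-8")
        out.append(len(raw))
        out += raw
    return bytes(out)


@dataclass
class ToySlot:
    """Hypothetical stand-in for a tuple slot: holds the row, deforms lazily."""
    row: bytes
    values: list = field(default_factory=list)  # columns deformed so far
    offset: int = 0                              # where deforming stopped

    def get_some_attrs(self, natts):
        """Deform columns 1..natts, resuming after the last deformed column."""
        while len(self.values) < natts:
            length = self.row[self.offset]
            start = self.offset + 1
            self.values.append(self.row[start:start + length].decode("utf-8"))
            self.offset = start + length

    def attr(self, attnum):
        """Return column attnum (1-based), deforming only as far as needed."""
        self.get_some_attrs(attnum)
        return self.values[attnum - 1]


def seq_scan(rows):
    """The 'scan' passes up whole rows; no column is deformed here."""
    for row in rows:
        yield ToySlot(row)


def min_hjid_by_rjid(slots):
    """The 'aggregate' reads only columns 1 (hjid) and 2 (rjid)."""
    result, deformed = {}, 0
    for slot in slots:
        rjid, hjid = int(slot.attr(2)), int(slot.attr(1))
        result[rjid] = min(hjid, result.get(rjid, hjid))
        deformed += len(slot.values)  # columns actually decoded for this row
    return result, deformed


# Three six-column rows shaped like the table in the report.
rows = [pack_row([str(h), str(r), "name", "c", "d", "e"])
        for h, r in [(5, 1), (3, 1), (9, 2)]]
groups, deformed = min_hjid_by_rjid(seq_scan(rows))
print(groups, deformed)
```

In this toy, listing all six columns in the scan's output costs nothing by itself: only two columns per row are ever decoded, and the trailing four are never touched. A projection at the scan level would instead have to decode and copy hjid and rjid into a new row before the aggregate ever saw them.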
On Wed, 7 Jan 2026 at 23:39, 贾耀 <yaojia_0809@163.com> wrote:
> I debugged the statement's behavior using gdb; part of the stack is:
> #0 create_scan_plan
> #1 create_plan_recurse
> #2 create_projection_plan
> #3 create_plan_recurse
> #4 create_agg_plan
> ...
> I think the key is the flags argument in create_plan.
> In frame 4, create_agg_plan passes the flag CP_LABEL_TLIST, so use_physical_tlist returns true in create_projection_plan (frame 2), which then passes flags 0 down. As a result, use_physical_tlist also returns true in create_scan_plan (frame 0), and the tlist is built with build_physical_tlist rather than build_path_tlist. build_physical_tlist builds a tlist with all columns of the table, whereas build_path_tlist builds only the necessary columns from best_path's path target.
> If I force it to go through build_path_tlist, I get the simplified Seq Scan output (just hjid and rjid).
> So, is this a bug in the flags? Will you fix it?
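
The flag flow described in the message above can be sketched as a small toy model. This is only an illustration under stated assumptions, not the real planner code: the toy_* functions are hypothetical, the flag values are illustrative, CP_EXACT_TLIST and CP_SMALL_TLIST are flag names recalled from create_plan.c rather than taken from this thread, and the real use_physical_tlist checks a number of further conditions that are left out here.

```python
from enum import IntFlag


class CP(IntFlag):
    """Flag names modelled on create_plan.c; the values are illustrative."""
    NONE = 0
    EXACT_TLIST = 1   # assumption: not mentioned in the thread
    SMALL_TLIST = 2   # assumption: not mentioned in the thread
    LABEL_TLIST = 4   # the flag passed down in frame 4


TABLE_COLUMNS = ["hjid", "rjid", "jname", "c", "d", "e"]
PATH_TARGET = ["hjid", "rjid"]  # what the aggregate actually needs


def toy_use_physical_tlist(flags):
    """Simplified: allowed unless an exact or small tlist is demanded."""
    return not (flags & (CP.EXACT_TLIST | CP.SMALL_TLIST))


def toy_create_scan_plan(flags):
    """Frame 0: pick between the physical tlist and the path's own tlist."""
    if toy_use_physical_tlist(flags):
        return list(TABLE_COLUMNS)  # like build_physical_tlist: all columns
    return list(PATH_TARGET)        # like build_path_tlist: needed columns


def toy_create_projection_plan(flags):
    """Frame 2: if a physical tlist is acceptable, recurse with flags 0."""
    if toy_use_physical_tlist(flags):
        return toy_create_scan_plan(CP.NONE)
    # Simplified: otherwise the projection itself emits the narrow tlist.
    return list(PATH_TARGET)


def toy_create_agg_plan():
    """Frame 4: the aggregate asks its input for a labelled tlist."""
    return toy_create_projection_plan(CP.LABEL_TLIST)


print(toy_create_agg_plan())
print(toy_create_scan_plan(CP.SMALL_TLIST))
```

With CP_LABEL_TLIST coming from the aggregate, the toy scan ends up with all six columns, which matches the EXPLAIN VERBOSE output in the report. Passing CP.SMALL_TLIST directly to the toy scan is just one hypothetical way to reach the narrow branch; it is not necessarily what the local hack described above does.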
As I've already mentioned, the current behaviour is intentional. So there are no changes to be made.
By all means, try your flag hack locally and see if performance is better or worse.
David