Re: Disk-based hash aggregate's cost model
From | Tomas Vondra
---|---
Subject | Re: Disk-based hash aggregate's cost model
Date |
Msg-id | 20200904190137.scrucxyjkxyc2bmk@development
In response to | Re: Disk-based hash aggregate's cost model (Jeff Davis <pgsql@j-davis.com>)
Responses | Re: Disk-based hash aggregate's cost model
List | pgsql-hackers
On Fri, Sep 04, 2020 at 11:31:36AM -0700, Jeff Davis wrote:
>On Fri, 2020-09-04 at 14:56 +0200, Tomas Vondra wrote:
>> Those charts show that the CP_SMALL_TLIST resulted in smaller temp
>> files (per EXPLAIN ANALYZE the difference is ~25%) and also lower
>> query durations (also in the ~25% range).
>
>I was able to reproduce the problem, thank you.
>
>Only two attributes are needed, so the CP_SMALL_TLIST projected schema
>only needs a single-byte null bitmap.
>
>But if just setting the attributes to NULL rather than projecting them,
>the null bitmap size is based on all 16 attributes, bumping the bitmap
>size to two bytes.
>
>MAXALIGN(23 + 1) = 24
>MAXALIGN(23 + 2) = 32
>
>I think that explains it. It's not ideal, but projection has a cost as
>well, so I don't think we necessarily need to do something here.
>
>If we are motivated to improve this in v14, we could potentially have a
>different schema for spilled tuples, and perform real projection at
>spill time. But I don't know if that's worth the extra complexity.

Thanks for the investigation and explanation.

Wouldn't it be enough to just use a slot with a smaller tuple descriptor?
All we'd need to do is create the descriptor in ExecInitAgg after calling
find_hash_columns, use it for rslot/wslot, and then "map" the attributes
in hashagg_spill_tuple (which already almost does that, so the extra cost
should be ~0) and again when reading the spilled tuples back. So I'm not
quite buying the argument that this would make a measurable difference ...

That being said, I won't insist on fixing this in v13 - at least we know
what the issue is and we can fix it later. The costing seems like a more
serious open item. OTOH I don't think this example is particularly
extreme, and I wouldn't be surprised if we saw even worse examples in
practice - tables tend to be quite wide, and aggregation of just a few
columns seems likely.
regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services