Re: accounting for memory used for BufFile during hash joins
From:        Tomas Vondra
Subject:     Re: accounting for memory used for BufFile during hash joins
Date:        
Msg-id:      20190507150931.g74bhqqzb6y6ivxj@development
In reply to: Re: accounting for memory used for BufFile during hash joins (Tom Lane <tgl@sss.pgh.pa.us>)
List:        pgsql-hackers
On Tue, May 07, 2019 at 10:42:36AM -0400, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Mon, May 06, 2019 at 11:18:28PM -0400, Tom Lane wrote:
>>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>>> Do we actually check how many duplicates are there during planning?
>
>>> Certainly that's part of the planner's cost estimates ... but it's
>>> only as good as the planner's statistical knowledge.
>
>> I'm looking at the code, and the only place where I see code dealing with
>> MCVs (probably the best place for info about duplicate values) is
>> estimate_hash_bucketsize in final_cost_hashjoin.
>
>What I'm thinking of is this bit in final_cost_hashjoin:
>
>    /*
>     * If the bucket holding the inner MCV would exceed work_mem, we don't
>     * want to hash unless there is really no other alternative, so apply
>     * disable_cost.  (The executor normally copes with excessive memory usage
>     * by splitting batches, but obviously it cannot separate equal values
>     * that way, so it will be unable to drive the batch size below work_mem
>     * when this is true.)
>     */
>    if (relation_byte_size(clamp_row_est(inner_path_rows * innermcvfreq),
>                           inner_path->pathtarget->width) >
>        (work_mem * 1024L))
>        startup_cost += disable_cost;
>
>It's certainly likely that that logic needs improvement in view of this
>discussion --- I was just pushing back on the claim that we weren't
>considering the issue at all.
>

Ah, this code is new in 11, and I was looking at code from 10 for some
reason. I don't think we can do much better than this, except perhaps
falling back to (1/ndistinct) when there's no MCV available.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services