Re: Default setting for enable_hashagg_disk (hash_mem)
From: Justin Pryzby
Subject: Re: Default setting for enable_hashagg_disk (hash_mem)
Date:
Msg-id: 20200703145620.GK4107@telsasoft.com
In response to: Re: Default setting for enable_hashagg_disk (hash_mem) (Bruce Momjian <bruce@momjian.us>)
List: pgsql-hackers
On Fri, Jul 03, 2020 at 10:08:08AM -0400, Bruce Momjian wrote:
> On Thu, Jul 2, 2020 at 08:35:40PM -0700, Peter Geoghegan wrote:
> > But the problem isn't really the hashaggs-that-spill patch itself.
> > Rather, the problem is the way that work_mem is supposed to behave in
> > general, and the impact that that has on hash aggregate now that it
> > has finally been brought into line with every other kind of executor
> > node. There just isn't much reason to think that we should give the
> > same amount of memory to a groupagg + sort as a hash aggregate. The
> > patch more or less broke an existing behavior that is itself
> > officially broken. That is, the problem that we're trying to fix here
> > is only a problem to the extent that the previous scheme isn't really
> > operating as intended (because grouping estimates are inherently very
> > hard). A revert doesn't seem like it helps anyone.
> >
> > I accept that the idea of inventing hash_mem to fix this problem now
> > is unorthodox. In a certain sense it solves problems beyond the
> > problems that we're theoretically obligated to solve now. But any
> > "more conservative" approach that I can think of seems like a big
> > mess.
>
> Well, the bottom line is that we are designing features during beta.
> People are supposed to be testing PG 13 behavior during beta, including
> optimizer behavior. We don't even have a user report yet of a
> regression compared to PG 12, or one that can't be fixed by increasing
> work_mem.
>
> If we add a new behavior to PG 13, we then have the pre-PG 13 behavior,
> the pre-patch behavior, and the post-patch behavior. How are people
> supposed to test all of that? Add to that that some don't even feel we
> need a new behavior, which is delaying any patch from being applied.

If we default hash_mem=-1, the post-patch behavior by default would be
the same as the pre-patch behavior.
Actually, another reason it should be -1 is simply to reduce the minimum,
essential number of GUCs everyone has to change or review on a new install
of a dedicated or nontrivial instance: shared_buffers, max_wal_size,
checkpoint_timeout, effective_cache_size, work_mem.

I don't think anybody has said it before, but it now occurs to me that one
advantage of making hash_mem a multiplier (I'm thinking of
hash_mem_scale_factor) rather than an absolute value is that one wouldn't
need to remember to increase hash_mem every time they increase work_mem.

Otherwise, this is something of a foot-gun: hash_mem would default to
16MB, and people experiencing poor performance would increase work_mem to
256MB, as they've been doing for decades, and see no effect. Or someone
would increase work_mem from 4MB to 256MB, which exceeds the hash_mem
default of 16MB, so then (if Peter has his way) hash_mem is ignored.

Due to these behaviors, I'll retract my previous preference:
| "I feel it should same as work_mem, as it's written, and not a multiplier."

I think the better ideas are:
 - hash_mem=-1
 - hash_mem_scale_factor=1 ?

Maybe as a separate patch we'd set default hash_mem_scale_factor=4,
possibly only in master and not v13.

-- 
Justin
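To make the foot-gun concrete, here is a small Python sketch of the two
proposals as I read them. The function names and the exact fallback/ignore
semantics are my assumptions for illustration, not committed PostgreSQL
behavior:

```python
# Sketch of the two proposals discussed above (assumed semantics, not
# actual PostgreSQL code). All sizes in kB, matching work_mem's unit.

def effective_hash_mem_absolute(work_mem_kb, hash_mem_kb):
    """hash_mem as an absolute GUC; -1 falls back to work_mem,
    preserving the pre-patch default behavior."""
    if hash_mem_kb == -1:
        return work_mem_kb
    return hash_mem_kb

def effective_hash_mem_scaled(work_mem_kb, scale_factor):
    """hash_mem_scale_factor: hash memory tracks work_mem, so raising
    work_mem automatically raises the hash node budget too."""
    return int(work_mem_kb * scale_factor)

# The foot-gun with an absolute default of 16MB: raising work_mem from
# 4MB to 256MB leaves hash nodes pinned at 16MB (or, under the variant
# where a smaller hash_mem is ignored, silently changes which setting
# wins). With a multiplier, the same change scales both.
mb = 1024
print(effective_hash_mem_absolute(256 * mb, 16 * mb))   # stuck at 16MB
print(effective_hash_mem_scaled(256 * mb, 1))           # tracks work_mem
print(effective_hash_mem_scaled(256 * mb, 4))           # 4x work_mem
```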