Re: Use generation context to speed up tuplesorts
| From | Ronan Dunklau |
|---|---|
| Subject | Re: Use generation context to speed up tuplesorts |
| Date | |
| Msg-id | 3082578.5fSG56mABF@aivenronan |
| In reply to | Re: Use generation context to speed up tuplesorts (David Rowley <dgrowleyml@gmail.com>) |
| Responses | Re: Use generation context to speed up tuplesorts |
| List | pgsql-hackers |
On Friday, 31 December 2021, 22:26:37 CET, David Rowley wrote:
> I've attached some benchmark results that I took recently. The
> spreadsheet contains results from 3 versions. master, master + 0001 -
> 0002, then master + 0001 - 0003. The 0003 patch makes the code a bit
> more conservative about the chunk sizes it allocates and also tries to
> allocate the tuple array according to the number of tuples we expect
> to be able to sort in a single batch for when the sort is not
> estimated to fit inside work_mem.

(Sorry for trying to merge back the discussion on the two sides of the thread)

In https://www.postgresql.org/message-id/4776839.iZASKD2KPV%40aivenronan, I expressed the idea of being able to tune glibc's malloc behaviour.

I implemented that (patch 0001) to provide a new hook which is called on backend startup, and anytime we set work_mem. This hook is #defined depending on the malloc implementation: currently a default, no-op implementation is provided, as well as a glibc malloc implementation.

The glibc malloc implementation relies on a new GUC, glibc_malloc_max_trim_threshold. When set to its default value of -1, we don't tune malloc at all, exactly as in HEAD. If a different value is provided, we set M_MMAP_THRESHOLD to half this value, and M_TRIM_THRESHOLD to this value, capped by work_mem / 2 and work_mem respectively.

The net result is that we can then keep more unused memory at the top of the heap, and use mmap less frequently, if the DBA chooses to. Another possible use case would be, on the contrary, to limit the allocated memory in idle backends to a minimum.

The reasoning behind this is that glibc malloc's default way of handling those two thresholds is to adapt to the size of the last freed mmaped block.

I've run the same "up to 32 columns" benchmark as you did, with this new patch applied on top of both HEAD and your v2 patchset incorporating planner estimates for the block sizes. Those are called "aset" and "generation" in the attached spreadsheet. For each, I've run it with glibc_malloc_max_trim_threshold set to -1, 1MB, 4MB and 64MB. In each case I've measured two things:
 - query latency, as reported by pgbench
 - total memory allocated by malloc at backend exit after running each query three times. This represents the "idle" memory consumption, and thus what we waste in malloc instead of releasing back to the system. This measurement has been performed using the very small module presented in patch 0002. Please note that I in no way propose that we include this module, it was just a convenient way for me to measure memory footprint.

My conclusion is that the impressive gains you see from using the generation context with bigger blocks mostly come from the fact that we allocate bigger blocks, and that this moves the mmap thresholds accordingly. I wonder how much of a difference it would make on other malloc implementations: I'm afraid the optimisation presented here would in fact be specific to glibc's malloc, since we have almost the same gains with both allocators when tuning malloc to keep more memory. I still think both approaches are useful, and would be necessary.

Since this affects all memory allocations, I need to come up with other meaningful scenarios to benchmark.

-- 
Ronan Dunklau
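
For illustration, the threshold rule described in the message (M_MMAP_THRESHOLD at half the GUC value, M_TRIM_THRESHOLD at the GUC value, capped by work_mem / 2 and work_mem respectively) could be sketched in C roughly as below. This is only a minimal sketch and not the code from patch 0001: the names MallocThresholds, compute_malloc_thresholds and apply_malloc_thresholds are hypothetical, and it assumes both settings are expressed in kB, which the message does not state. The only library call used is glibc's mallopt().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#ifdef __GLIBC__
#include <malloc.h>     /* mallopt(), M_MMAP_THRESHOLD, M_TRIM_THRESHOLD */
#endif

/* Hypothetical container: thresholds in bytes, as mallopt() expects. */
typedef struct MallocThresholds
{
    int mmap_threshold;
    int trim_threshold;
} MallocThresholds;

/* Convert kB to bytes, clamping to INT_MAX since mallopt() takes an int. */
static int
kb_to_bytes_clamped(long kb)
{
    if (kb > INT_MAX / 1024)
        return INT_MAX;
    return (int) (kb * 1024);
}

/*
 * Compute both thresholds from the GUC value and work_mem.
 * Assumption: both arguments are in kB.
 * Returns false when the GUC is negative (-1 is the default), meaning
 * "do not tune malloc at all".
 */
bool
compute_malloc_thresholds(long max_trim_kb, long work_mem_kb,
                          MallocThresholds *out)
{
    long trim_kb;
    long mmap_kb;

    assert(out != NULL);
    if (max_trim_kb < 0)
        return false;

    /* M_TRIM_THRESHOLD: the GUC value, capped by work_mem. */
    trim_kb = (max_trim_kb < work_mem_kb) ? max_trim_kb : work_mem_kb;
    /* M_MMAP_THRESHOLD: half the GUC value, capped by work_mem / 2. */
    mmap_kb = (max_trim_kb / 2 < work_mem_kb / 2) ? max_trim_kb / 2
                                                  : work_mem_kb / 2;

    out->trim_threshold = kb_to_bytes_clamped(trim_kb);
    out->mmap_threshold = kb_to_bytes_clamped(mmap_kb);
    return true;
}

/*
 * Apply the thresholds. On non-glibc platforms this is a no-op, mirroring
 * the default implementation mentioned in the message. Returns false only
 * if mallopt() rejects a value (it returns 1 on success, 0 on error).
 */
bool
apply_malloc_thresholds(long max_trim_kb, long work_mem_kb)
{
    MallocThresholds t;

    if (!compute_malloc_thresholds(max_trim_kb, work_mem_kb, &t))
        return true;            /* nothing to tune, keep glibc defaults */
#ifdef __GLIBC__
    /* Setting either parameter turns off glibc's dynamic adjustment. */
    return mallopt(M_MMAP_THRESHOLD, t.mmap_threshold) == 1 &&
           mallopt(M_TRIM_THRESHOLD, t.trim_threshold) == 1;
#else
    return true;
#endif
}
```

Under these assumptions, a 64MB setting with work_mem at 4MB would be capped to a 4MB trim threshold and a 2MB mmap threshold. Note that glibc enforces its own upper limit on M_MMAP_THRESHOLD, so mallopt() may reject large values.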