Re: Proposal: Limitations of palloc inside checkpointer
From | Xuneng Zhou
Subject | Re: Proposal: Limitations of palloc inside checkpointer
Date |
Msg-id | CABPTF7UBRAFHbd5iM=QYLDeSVSwJxqE=XYJKxH8D58x8+B79mg@mail.gmail.com
In reply to | Proposal: Limitations of palloc inside checkpointer (Ekaterina Sokolova <e.sokolova@postgrespro.ru>)
List | pgsql-hackers
Hi all,

Sorry, I forgot to Cc the list on my previous message. Resending here so it's on the thread:

On Wed, Jun 4, 2025 at 11:07 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi Alexander,
>
> Thanks again for the feedback!
>
> 1) Batch-processing CompactCheckpointerRequestQueue() and AbsorbSyncRequests()?
>
> After some thought, I realized my previous take was incomplete; sorry
> for the confusion. Heikki suggested capping num_requests at 10 million
> [1]. With that limit, the largest hash table is ~500 MB and the
> skip_slot[] array is ~10 MB in CompactCheckpointerRequestQueue, and the
> maximum size of the request array in AbsorbSyncRequests is well under
> 400 MB, so we never exceed 1 GB. Even without batching, compaction stays
> under the cap. Batching in AbsorbSyncRequests may still help by
> amortizing memory allocation, but it adds extra lock/unlock overhead.
> I am not sure that overhead is worth it under the cap.
>
> Of course, all of this depends on having a cap in place. Picking the
> right cap size can be tricky (see point 2). If we decide not to
> enforce a cap now or in future versions, then batching both
> CompactCheckpointerRequestQueue (maybe?) and AbsorbSyncRequests becomes
> essential. We also need to consider the batch size; Heikki suggested
> 10k for AbsorbSyncRequests, but I'm not sure whether that suits typical
> or extreme workloads.
>
> > Right, but another point is to avoid lengthy holding of
> > CheckpointerCommLock. What do you think about that?
>
> I am not clear on this. Could you elaborate on it?
>
> [1] https://www.postgresql.org/message-id/c1993b75-a5bc-42fd-bbf1-6f06a1b37107%40iki.fi
>
> 2) Back-branch fixes with MAX_CHECKPOINT_REQUESTS?
>
> This is simple and effective, but it can be hard to get the value
> right, and I think we should give it more thought. For very
> large-scale use cases, like hundreds of GB of shared_buffers,
> 10 million seems small if the checkpointer is not able to absorb the
> changes before the queue fills up. In this case, making compaction
> more efficient, as in 3), would be helpful. However, if we do this
> for back branches as well, the solution is no longer that simple.
>
> 3) Fill gaps by pulling from the tail instead of rewriting the whole queue?
>
> I misunderstood at first; this is a generally helpful optimization.
> I'll integrate it into the current patch.
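To make the memory argument above concrete, here is a back-of-envelope estimate of the allocations under a 10-million-entry cap. The per-entry byte sizes are illustrative assumptions (the real figures depend on the actual CheckpointerRequest struct layout and dynahash overhead), chosen to match the ballpark numbers quoted in the thread:

```python
# Rough memory estimate for the checkpointer request queue under a
# hypothetical MAX_CHECKPOINT_REQUESTS cap of 10 million. Per-entry
# sizes are assumptions for illustration, not exact PostgreSQL sizes.

MiB = 1024 * 1024
cap = 10_000_000            # hypothetical MAX_CHECKPOINT_REQUESTS

request_size = 32           # assumed bytes per queued request
hash_entry_size = 48        # assumed bytes per dynahash entry
skip_slot_size = 1          # one bool per queue slot

request_array = cap * request_size   # copied in AbsorbSyncRequests
hash_table = cap * hash_entry_size   # built in CompactCheckpointerRequestQueue
skip_slots = cap * skip_slot_size    # skip_slot[] array

print(f"request array: {request_array / MiB:.0f} MiB")   # ~305 MiB
print(f"hash table:    {hash_table / MiB:.0f} MiB")      # ~458 MiB
print(f"skip_slot[]:   {skip_slots / MiB:.0f} MiB")      # ~10 MiB

# Even the largest single allocation stays well under the 1 GB
# palloc limit (MaxAllocSize), which is the point of the cap.
assert max(request_array, hash_table, skip_slots) < 1024 * MiB
```

With these assumed sizes, every allocation is comfortably below 1 GB, matching the "~500 MB hash table, ~10 MB skip_slot[], under 400 MB request array" figures from the discussion.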
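On the batching trade-off in point 1: the idea is to drain the shared queue in fixed-size batches, holding the lock only while copying one batch out, so backends can keep enqueueing between batches. The sketch below is a simplified standalone model of that pattern (the function name, the 10k batch size, and the plain-list queue are illustrative; the real code works on a shared-memory array under CheckpointerCommLock):

```python
import threading

# Sketch of batched absorption: instead of copying the entire request
# queue in one pass under the lock, drain it in fixed-size batches and
# release the lock between batches. This amortizes allocation at the
# cost of extra lock acquisitions, which is the trade-off discussed.

BATCH_SIZE = 10_000   # batch size suggested in the thread

def absorb_sync_requests(queue, lock, process):
    """Drain `queue` in batches, holding `lock` only while copying."""
    while True:
        with lock:                   # short hold: copy one batch out
            batch = queue[:BATCH_SIZE]
            del queue[:BATCH_SIZE]
        if not batch:
            break
        for req in batch:            # heavy work done without the lock
            process(req)

# Usage: 25k queued requests are drained in three non-empty batches.
queue = list(range(25_000))
absorbed = []
absorb_sync_requests(queue, threading.Lock(), absorbed.append)
assert len(absorbed) == 25_000 and not queue
```

The design choice here is exactly the one questioned in the email: each batch boundary is an extra lock/unlock cycle, so whether batching pays off depends on how expensive the per-batch work is relative to the lock traffic.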
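For point 3, the "pull from the tail" idea can be sketched as follows: rather than rewriting the whole queue while skipping duplicate slots, move the last live entries into the gaps, so only the holes are touched. Entry order is not preserved, which is acceptable for interchangeable sync requests. The function name and list representation are illustrative, not PostgreSQL code:

```python
# Sketch of gap-filling compaction: for each slot flagged in skip_slot,
# pull a surviving entry from the tail into the gap instead of
# rewriting the entire queue. Illustration only, not PostgreSQL code.

def compact_pull_from_tail(queue, skip_slot):
    """Drop entries whose skip_slot flag is set, filling gaps from the tail."""
    i, tail = 0, len(queue) - 1
    while i <= tail:
        if skip_slot[i]:
            # Walk the tail back to the nearest surviving entry.
            while tail >= i and skip_slot[tail]:
                tail -= 1
            if tail < i:
                break                 # no live entries left to move
            queue[i] = queue[tail]    # fill the gap from the tail
            tail -= 1
        i += 1
    del queue[tail + 1:]              # truncate to the surviving entries
    return queue

q = ['a', 'b', 'c', 'd', 'e']
skip = [False, True, False, True, False]
compact_pull_from_tail(q, skip)
assert sorted(q) == ['a', 'c', 'e']
```

Compared with the full-rewrite approach, this touches only as many slots as there are gaps, which is what makes it attractive when compaction has to run often on a large queue.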