Re: [PoC] Improve dead tuple storage for lazy vacuum
От | Masahiko Sawada |
---|---|
Тема | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Дата | |
Msg-id | CAD21AoDD5VXYMyWdrSjtN01HdYq3Ja-yzTP=rD9315-5bdLzpA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PoC] Improve dead tuple storage for lazy vacuum (John Naylor <john.naylor@enterprisedb.com>) |
Ответы |
Re: [PoC] Improve dead tuple storage for lazy vacuum
|
Список | pgsql-hackers |
On Wed, Oct 5, 2022 at 6:40 PM John Naylor <john.naylor@enterprisedb.com> wrote: > > > On Wed, Oct 5, 2022 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Wed, Sep 28, 2022 at 12:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Fri, Sep 23, 2022 at 12:11 AM John Naylor > > > <john.naylor@enterprisedb.com> wrote: > > > Yeah, node31 and node256 are bloated. We probably could use slab for > > > node256 independently. It's worth trying a benchmark to see how it > > > affects the performance and the tree size. > > This wasn't the focus of your current email, but while experimenting with v6 I had another thought about local allocation:If we use the default slab block size of 8192 bytes, then only 3 chunks of size 2088 can fit, right? If so, sinceaset and DSA also waste at least a few hundred bytes, we could store a useless 256-byte slot array within node256. Thatway, node128 and node256 share the same start of pointers/values array, so there would be one less branch for gettingthat address. In v6, rt_node_get_values and rt_node_get_children are not inlined (asde: gcc uses a jump table for5 kinds but not for 4), but possibly should be, and the smaller the better. It would be good for performance but I'm a bit concerned that it's highly optimized to the design of aset and DSA. Since size 2088 will be currently classed as 2616 in DSA, DSA wastes 528 bytes. However, if we introduce a new class of 2304 (=2048 + 256) bytes we cannot store a useless 256-byte and the assumption will be broken. > > > Regarding DSA support, IIUC we need to use dsa_pointer in inner nodes > > to point to its child nodes, instead of C pointers (ig, backend-local > > address). I'm thinking of a straightforward approach as the first > > step; inner nodes have a union of rt_node* and dsa_pointer and we > > choose either one based on whether the radix tree is shared or not. We > > allocate and free the shared memory for individual nodes by > > dsa_allocate() and dsa_free(), respectively. Therefore we need to get > > a C pointer from dsa_pointer by using dsa_get_address() while > > descending the tree. I'm a bit concerned that calling > > dsa_get_address() for every descent could be performance overhead but > > I'm going to measure it anyway. > > Are dsa pointers aligned the same as pointers to locally allocated memory? Meaning, is the offset portion always a multipleof 4 (or 8)? I think so. > It seems that way from a glance, but I can't say for sure. If the lower 2 bits of a DSA pointer are never set, we can tagthem the same way as a regular pointer. That same technique could help hide the latency of converting the pointer, bythe same way it would hide the latency of loading parts of a node into CPU registers. > > One concern is, handling both local and dsa cases in the same code requires more (predictable) branches and reduces codedensity. That might be a reason in favor of templating to handle each case in its own translation unit. Right. We also need to support locking for shared radix tree, which would require more branches. Regards, -- Masahiko Sawada PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
В списке pgsql-hackers по дате отправления:
Следующее
От: Bharath RupireddyДата:
Сообщение: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication