Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans
| From | Nathan Bossart |
|---|---|
| Subject | Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans |
| Date | |
| Msg-id | 20220922042104.GB464247@nathanxps13 |
| In response to | Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans (Peter Geoghegan <pg@bowt.ie>) |
| Responses | Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans |
| List | pgsql-hackers |
On Wed, Sep 21, 2022 at 02:41:28PM -0700, Peter Geoghegan wrote:
> On Wed, Sep 21, 2022 at 2:11 PM Peter Geoghegan <pg@bowt.ie> wrote:
>> > Presumably a
>> > generic WAL record compression mechanism could be reused for other large
>> > records, too. That could be much easier than devising a deduplication
>> > strategy for every record type.
>>
>> It's quite possible that that's a good idea, but that should probably
>> work as an additive thing. That's something that I think of as a
>> "clever technique", whereas I'm focussed on just not being naive in
>> how we represent this one specific WAL record type.
>
> BTW, if you wanted to pursue something like this, that would work with
> many different types of WAL record, ISTM that a "medium level" (not
> low level) approach might be the best place to start. In particular,
> the way that page offset numbers are represented in many WAL records
> is quite space inefficient. A domain-specific approach built with
> some understanding of how page offset numbers tend to look in practice
> seems promising.

I wouldn't mind giving this a try.

> The representation of page offset numbers in PRUNE and VACUUM heapam
> WAL records (and in index WAL records) always just stores an array of
> 2 byte OffsetNumber elements. It probably wouldn't be all that
> difficult to come up with a simple scheme for compressing an array of
> OffsetNumbers in WAL records. It certainly doesn't seem like it would
> be all that difficult to get it down to 1 byte per offset number in
> most cases (even greater improvements seem doable).
>
> That could also be used for the xl_heap_freeze_page record type --
> though only after this patch is committed. The patch makes the WAL
> record use a simple array of page offset numbers, just like in
> PRUNE/VACUUM records. That's another reason why the approach
> implemented by the patch seems like "the natural approach" to me. It's
> much closer to how heapam PRUNE records work (we have a variable
> number of arrays of page offset numbers in both cases).

Yeah, it seems likely that we could pack offsets in single bytes in many
cases. A more sophisticated approach could even choose how many bits to
use per offset based on the maximum in the array. Furthermore, we might
be able to make use of SIMD instructions to mitigate any performance
penalty.

I'm tempted to start by just using single-byte offsets when possible
since that should be relatively simple while still yielding a decent
improvement for many workloads. What do you think?

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
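
To make the single-byte idea concrete, here is a minimal standalone sketch. It is not taken from the patch or from the PostgreSQL tree. The `OffsetNumber` typedef is a stand-in for the real 2-byte type, and `pack_offsets`, `unpack_offsets`, `bits_needed`, and the `OFFSETS_*` tags are hypothetical names invented for illustration. It assumes the number of offsets is stored elsewhere in the record, and it spends one extra tag byte to say which width was used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for PostgreSQL's 2-byte OffsetNumber (assumption for this sketch). */
typedef uint16_t OffsetNumber;

/* Hypothetical format tags; not part of any real WAL record layout. */
#define OFFSETS_1BYTE 0x01
#define OFFSETS_2BYTE 0x02

/*
 * Pack noffsets offsets into dst, using one byte per offset when every
 * value fits in a byte and falling back to the raw 2-byte array otherwise.
 * dst must have room for 1 + noffsets * sizeof(OffsetNumber) bytes.
 * Returns the number of bytes written, including the leading tag byte.
 */
size_t
pack_offsets(const OffsetNumber *offsets, size_t noffsets, uint8_t *dst)
{
    OffsetNumber max = 0;
    size_t       i;

    for (i = 0; i < noffsets; i++)
        if (offsets[i] > max)
            max = offsets[i];

    if (max <= UINT8_MAX)
    {
        dst[0] = OFFSETS_1BYTE;
        for (i = 0; i < noffsets; i++)
            dst[1 + i] = (uint8_t) offsets[i];
        return 1 + noffsets;
    }

    dst[0] = OFFSETS_2BYTE;
    memcpy(dst + 1, offsets, noffsets * sizeof(OffsetNumber));
    return 1 + noffsets * sizeof(OffsetNumber);
}

/*
 * Inverse of pack_offsets. The caller supplies noffsets, on the assumption
 * that the count is already stored elsewhere in the record.
 */
void
unpack_offsets(const uint8_t *src, size_t noffsets, OffsetNumber *offsets)
{
    size_t i;

    assert(src[0] == OFFSETS_1BYTE || src[0] == OFFSETS_2BYTE);

    if (src[0] == OFFSETS_1BYTE)
    {
        for (i = 0; i < noffsets; i++)
            offsets[i] = src[1 + i];
    }
    else
        memcpy(offsets, src + 1, noffsets * sizeof(OffsetNumber));
}

/*
 * Number of bits needed to represent max. A bit-packed variant could use
 * this to choose a per-array width instead of only 8 or 16 bits.
 */
int
bits_needed(OffsetNumber max)
{
    int bits = 1;

    while ((max >>= 1) != 0)
        bits++;
    return bits;
}
```

With this layout, an array of four offsets that all fit in a byte takes 5 bytes instead of 8, while an array containing a larger offset costs one byte more than the plain 2-byte array. Whether a tag byte per array is acceptable, or whether the width should be signalled some other way (for example through existing record flags), is an open design question that this sketch does not try to answer.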