Re: Compress prune/freeze records with Delta Frame of Reference algorithm
| From | Tomas Vondra |
|---|---|
| Subject | Re: Compress prune/freeze records with Delta Frame of Reference algorithm |
| Date | |
| Msg-id | 5a2f3df2-a736-4ada-8aa3-aa6e20b2e067@vondra.me |
| In response to | Re: Compress prune/freeze records with Delta Frame of Reference algorithm (Evgeny Voropaev <evgeny.voropaev@tantorlabs.com>) |
| Responses | Re: Compress prune/freeze records with Delta Frame of Reference algorithm |
| List | pgsql-hackers |
On 3/24/26 15:28, Evgeny Voropaev wrote:
> Hello Andres,
>
>> I'm unconvinced that this is a serious problem - typically the overhead of WAL volume due to pruning / freezing is due to the full page images emitted, not the raw size of the records. Once an FPI is emitted, this doesn't matter.
>>
>> What gains have you measured in somewhat realistic workloads?
>
> So far, we have had no tests in any real production environment. Moreover, the load in the new test (recovery/t/052_prune_dfor_compression.pl) is quite contrived. However, it demonstrates a compression ratio of more than 5, and it is measured for an overall size of all prune/freeze records with no filtering.
>
> Further development is the implementation of compression of unsorted sequences. This is going to allow PostgreSQL to compress also the 'frozen' and the 'redirected' offset sequences, which should result in a greater compression ratio.
>
> But I agree with you, Andres, we need practical results to estimate a profit. I wish we would test it on some real load soon.
>
> Also I hope, independently of its usage in prune/freeze records, the DFoR itself might be used for compression sequences in other places of PG.
>

IMHO Andres is right. A ~170kB patch really should present some numbers quantifying the expected benefit. It doesn't need to be a real workload from production, but something plausible enough. Even some basic back-of-the-envelope calculations might be enough to show the promise.

Without this, the cost/benefit is so unclear most senior contributors will probably review something else. You need to make the case why this is worth it.

I only quickly skimmed the patches, for exactly this reason. I'm a bit confused why this needs to add the whole libtap thing in 0001, instead of just testing this through the SQL interface (same as test_aio etc.).

Also, I find it somewhat unlikely we'd import a GPLv3 library like this, even if it's just a testing framework. Even ignoring the question of having a different license for some of the code, it'd mean maintenance burden (maybe libtap is stable/mature, no idea). I don't see why this would be better than "write a SQL callable test module".

regards

--
Tomas Vondra