Re: [PoC] Non-volatile WAL buffer
From | Tomas Vondra |
---|---|
Subject | Re: [PoC] Non-volatile WAL buffer |
Date | |
Msg-id | 256f6556-b517-e81e-0b5d-df60b2fcbdef@enterprisedb.com |
In response to | Re: [PoC] Non-volatile WAL buffer (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses | Re: [PoC] Non-volatile WAL buffer (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
List | pgsql-hackers |
On 11/26/20 9:59 PM, Heikki Linnakangas wrote:
> On 26/11/2020 21:27, Tomas Vondra wrote:
>> Hi,
>>
>> Here's the "simple patch" that I'm currently experimenting with. It
>> essentially replaces open/close/write/fsync with pmem calls
>> (map/unmap/memcpy/persist variants), and it's by no means committable.
>> But it works well enough for experiments / measurements, etc.
>>
>> The numbers (5-minute pgbench runs on scale 500) look like this:
>>
>>          master/btt    master/dax           ntt        simple
>>    -----------------------------------------------------------
>>      1         5469          7402          7977          6746
>>     16        48222         80869        107025         82343
>>     32        73974        158189        214718        158348
>>     64        85921        154540        225715        164248
>>     96       150602        221159        237008        217253
>>
>> A chart illustrating these results is attached. The four columns are
>> showing unpatched master with WAL on a pmem device, in BTT or DAX modes,
>> "ntt" is the patch submitted to this thread, and "simple" is the patch
>> I've hacked together.
>>
>> As expected, the BTT case performs poorly (compared to the rest).
>>
>> The "master/dax" and "simple" perform about the same. There are some
>> differences, but those may be attributed to noise. The NTT patch does
>> outperform these cases by ~20-40% in some cases.
>>
>> The question is why. I recall suggestions this is due to page faults
>> when writing data into the WAL, but I did experiment with various
>> settings that I think should prevent that (e.g. disabling WAL reuse
>> and/or disabling zeroing the segments) but that made no measurable
>> difference.
>
> The page faults are only a problem when mmap() is used *without* DAX.
>
> Takashi tried a patch earlier to mmap() WAL segments and insert WAL to
> them directly. See 0002-Use-WAL-segments-as-WAL-buffers.patch at
> https://www.postgresql.org/message-id/000001d5dff4%24995ed180%24cc1c7480%24%40hco.ntt.co.jp_1.
> Could you test that patch too, please? Using your nomenclature, that
> patch skips wal_buffers and does:
>
>   clients -> wal segments (PMEM DAX)
>
> He got good results with that with DAX, but otherwise it performed
> worse. And then we discussed why that might be, and the page fault
> hypothesis was brought up.
>

D'oh, I hadn't noticed there's a patch doing that. This thread has so
many different patches - which is good, but a bit confusing.

> I think 0002-Use-WAL-segments-as-WAL-buffers.patch is the most promising
> approach here. But because it's slower without DAX, we need to keep the
> current code for non-DAX systems. Unfortunately it means that we need to
> maintain both implementations, selectable with a GUC or some DAX
> detection magic. The question then is whether the code complexity is
> worth the performance gain on DAX-enabled systems.
>

Sure, I can give it a spin. The question is whether it applies to
current master, or whether some sort of rebase is needed. I'll try.

> Andres was not excited about mmapping the WAL segments because of
> performance reasons. I'm not sure how much of his critique applies if we
> keep supporting both methods and only use mmap() if so configured.
>

Yeah. I don't think we can just discard the current approach; there are
so many OS variants that even if Linux is happy, one of the other
critters won't be.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
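
For readers who want to check the "~20-40%" remark against the quoted pgbench table, the relative gain of the "ntt" column over "master/dax" and "simple" can be recomputed per row. The sketch below is not part of the original mail. The throughput figures are copied from the table; the names `RESULTS` and `relative_gain`, and the reading of the first column as the pgbench client count, are assumptions made for illustration.

```python
# Throughput figures copied from the table quoted in the mail.
# Key: first table column (assumed here to be the pgbench client count).
# Values: (master/btt, master/dax, ntt, simple).
RESULTS = {
    1: (5469, 7402, 7977, 6746),
    16: (48222, 80869, 107025, 82343),
    32: (73974, 158189, 214718, 158348),
    64: (85921, 154540, 225715, 164248),
    96: (150602, 221159, 237008, 217253),
}

COLUMNS = ("master/btt", "master/dax", "ntt", "simple")


def relative_gain(clients, column, baseline):
    """Return how much `column` exceeds `baseline` for one row, as a fraction."""
    row = dict(zip(COLUMNS, RESULTS[clients]))
    return row[column] / row[baseline] - 1.0


for clients in RESULTS:
    vs_dax = relative_gain(clients, "ntt", "master/dax")
    vs_simple = relative_gain(clients, "ntt", "simple")
    print(f"{clients:3d}  ntt vs master/dax: {vs_dax:+.1%}   ntt vs simple: {vs_simple:+.1%}")
```

On these numbers the gap is widest in the 16 to 64 rows and shrinks to single digits at 96, which matches the "in some cases" wording in the mail.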
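
The two write paths discussed in this mail (open/write/fsync versus map/memcpy/persist into the segment itself) can be contrasted with a small sketch. This is not code from any of the patches in the thread. It uses an ordinary temporary file and Python's standard `mmap` module as a stand-in for a DAX-mapped pmem device, with `mmap.flush()` standing in for a pmem persist call. The segment size and all function names are hypothetical, and the `os` calls assume a Unix-like system.

```python
import mmap
import os
import tempfile

# Hypothetical size, chosen small for the demo; not a real WAL segment size.
SEGMENT_SIZE = 1024 * 1024


def create_segment(path, size=SEGMENT_SIZE):
    """Pre-allocate a zero-filled file that stands in for a WAL segment."""
    with open(path, "wb") as f:
        f.truncate(size)


def write_via_syscalls(path, offset, record):
    """Conventional path: open, seek, write, fsync, close."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, record)  # short records only; no partial-write handling
        os.fsync(fd)
    finally:
        os.close(fd)


def write_via_mmap(path, offset, record):
    """Mapped path: map the file, copy bytes into the mapping, flush, unmap."""
    fd = os.open(path, os.O_RDWR)
    try:
        with mmap.mmap(fd, 0) as mm:  # length 0 maps the whole file
            mm[offset:offset + len(record)] = record  # memcpy-like store
            mm.flush()  # msync here; a pmem build would persist instead
    finally:
        os.close(fd)


def read_back(path, offset, length):
    """Read bytes back through the regular file API to check both paths."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def demo():
    """Write one record through each path and return the combined bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "segment")
        create_segment(path)
        write_via_syscalls(path, 0, b"record-1")
        write_via_mmap(path, 8, b"record-2")
        return read_back(path, 0, 16)
```

On a regular file without DAX, the first store to each mapped page goes through the page cache and can incur a page fault, which is the effect the page fault hypothesis in the thread refers to. This sketch only shows the shape of the two code paths and says nothing about their relative performance.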