Early hint bit setting
From | Ants Aasma |
---|---|
Subject | Early hint bit setting |
Date | |
Msg-id | CA+CSw_twkPMHv3KoC3Kc_1e+Wt7Vcdix8bBDUnyMft+QxDPimw@mail.gmail.com |
Responses | Re: Early hint bit setting (Merlin Moncure <mmoncure@gmail.com>); Re: Early hint bit setting (Jim Nasby <jim@nasby.net>) |
List | pgsql-hackers |
I was thinking about the earliest point at which we could set hint bits. This would be just after the commit has been made visible. When the transaction completes and the commit confirmation is sent to the client, the backend will usually go to sleep on the network socket, waiting for further commands. Because most clients wait for the commit confirmation before proceeding, we have at least one network RTT before this backend is expected to respond again.

The idea is to keep a small backend-local ring buffer of pages that have been modified. When a transaction has just committed, we do a non-blocking read on the socket. If nothing is available, we take the opportunity to go and set hint bits in the recently modified buffers.

Hurting latency for single-threaded workloads that use lots of transactions is bad, so it would be a bad idea to do anything that could take a long time while waiting for the next command. Because early hinting is a performance optimisation, we can safely skip it if it becomes bothersome. Anything that causes IO can take too long, so we only set the hint bits when the page is still in shared buffers, which avoids reading the page in. Furthermore, we only hint the tuples that the recently completed transaction modified, to avoid IO from CLOG (we could hint other tuples if their xid happens to be in the SLRU, but it probably won't be very useful).

Hint bits are set sooner or later. Setting them earlier is a throughput win for any workload because we avoid generating extra load. We avoid doing any IO and we might save some, so for IO this is a pure win. The CPU work of hinting needs to be done sooner or later, so that's a tie, except for extremely bursty write-heavy loads with lots of transactions. Memory loads could in principle hurt other backends. Refilling the whole last-level cache of a modern processor takes a few hundred microseconds at peak speed. If the WAL is on fast storage (BBWC, SSD), there's a pretty good chance that the page being hinted is still in the CPU cache, which avoids the memory bandwidth overhead.

Abstraction-wise, I think we need to set up a mechanism to run very short maintenance jobs from backends that are waiting for new commands. SocketBackend could check whether there's anything to do and, if so, call pq_getbyte_if_available before proceeding to do it.

Setting hint bits early would help workloads with small, synchronously writing transactions. Async commits could also benefit from proactive hint bit setting, but this would require some global cooperation and isn't as clear a win. One idea would be to copy the local ring buffer entries to a global one, tagged with the LSN, when the transaction has been made visible. When someone flushes xlog, they also check whether the flush enables some background hinting and set the corresponding flag for any backend with spare cycles to pick up.

Comments?

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
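
The idle-time hinting loop proposed in the message could be sketched roughly as follows. This is only an illustration under stated assumptions, not PostgreSQL code: `ModifiedPageRing`, `ring_remember`, `client_input_pending`, `hint_while_idle`, `demo_hint_page` and `RING_SIZE` are all hypothetical names, a plain POSIX `poll()` with a zero timeout stands in for the non-blocking socket read (a real backend would presumably go through `pq_getbyte_if_available`), and the callback stands in for "set hint bits if the page is still in shared buffers".

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 4  /* hypothetical capacity, kept tiny for illustration */

/* Backend-local ring of recently modified page numbers (hypothetical type). */
typedef struct {
    uint32_t pages[RING_SIZE];
    size_t   head;   /* next slot to write */
    size_t   count;  /* number of valid entries, at most RING_SIZE */
} ModifiedPageRing;

/* Remember a modified page; when the ring is full the oldest entry is overwritten. */
static void ring_remember(ModifiedPageRing *ring, uint32_t page)
{
    ring->pages[ring->head] = page;
    ring->head = (ring->head + 1) % RING_SIZE;
    if (ring->count < RING_SIZE)
        ring->count++;
}

/* Non-blocking check of the client socket. Anything other than 0 from poll()
 * (data ready, hangup, or error) is treated as "do not start maintenance". */
static bool client_input_pending(int sock_fd)
{
    struct pollfd pfd = { .fd = sock_fd, .events = POLLIN, .revents = 0 };
    return poll(&pfd, 1, 0) != 0;
}

/* Callback that hints one page; returns true if hint bits were actually set,
 * false if the page was skipped (for example, no longer in shared buffers). */
typedef bool (*HintPageFn)(uint32_t page, void *arg);

/* Run after commit: hint remembered pages, oldest first, while the client is
 * silent. If a command shows up, the remaining entries are simply discarded,
 * because early hinting is optional. Returns the number of pages hinted. */
static size_t hint_while_idle(ModifiedPageRing *ring, int sock_fd,
                              HintPageFn hint_page, void *arg)
{
    size_t hinted = 0;

    assert(ring->count <= RING_SIZE);
    while (ring->count > 0) {
        if (client_input_pending(sock_fd)) {
            ring->count = 0;  /* skip the rest rather than delay the client */
            break;
        }
        size_t tail = (ring->head + RING_SIZE - ring->count) % RING_SIZE;
        uint32_t page = ring->pages[tail];
        ring->count--;
        if (hint_page(page, arg))
            hinted++;
    }
    return hinted;
}

/* Demo callback (assumption for illustration only): counts visits and pretends
 * that only even-numbered pages are still resident in shared buffers. */
static bool demo_hint_page(uint32_t page, void *arg)
{
    size_t *visited = arg;
    (*visited)++;
    return page % 2 == 0;
}
```

Re-checking the socket before every page keeps the worst-case added latency to roughly one page's worth of hinting work, which matches the goal of never making the client wait on an optional optimisation.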