Re: checkpointer continuous flushing - V18
From: Fabien COELHO
Subject: Re: checkpointer continuous flushing - V18
Date:
Msg-id: alpine.DEB.2.10.1602210853540.3927@sto
In response to: Re: checkpointer continuous flushing - V18 (Fabien COELHO <coelho@cri.ensmp.fr>)
List: pgsql-hackers
Hallo Andres,

>>> [...] I do think that this whole writeback logic really does make
>>> sense *per table space*,
>>
>> Leads to less regular IO, because if your tablespaces are evenly sized
>> (somewhat common) you'll sometimes end up issuing sync_file_range's
>> shortly after each other. For latency outside checkpoints it's
>> important to control the total amount of dirty buffers, and that's
>> obviously independent of tablespaces.
>
> I do not understand/buy this argument.
>
> The underlying IO queue is per device, and table spaces should be per
> device as well (otherwise what's the point?), so you should want to
> coalesce and "writeback" pages per device as well. sync_file_range
> calls issued on distinct devices should proceed more or less
> independently, and should not interfere with one another.
>
> If you use just one context, then the more table spaces you have, the
> smaller the performance gain, because there is less and less
> aggregation, thus fewer sequential writes per device.
>
> So for me there should really be one context per tablespace. That
> would suggest a hashtable or some other structure to keep and retrieve
> them, which would not be that bad, and I think that it is what is
> needed.

Note: I think that an easy way to do that in the "checkpoint sort" patch
is simply to keep a WritebackContext in the CkptTsStatus structure,
which is per table space in the checkpointer. For the bgwriter and
backends it can wait: there is little "writeback" coalescing there
because the IO should be pretty random, so it does not matter much.

-- 
Fabien.