Re: Dynamic Shared Memory stuff
From: Heikki Linnakangas
Subject: Re: Dynamic Shared Memory stuff
Date:
Msg-id: 52A0A600.5080805@vmware.com
In reply to: Re: Dynamic Shared Memory stuff (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Dynamic Shared Memory stuff
           Re: Dynamic Shared Memory stuff
List: pgsql-hackers
On 11/20/2013 09:58 PM, Robert Haas wrote:
> On Wed, Nov 20, 2013 at 8:32 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> How many allocations? What size will they typically have, minimum and
>> maximum?
>
> The facility is intended to be general, so the answer could vary
> widely by application. The testing that I have done so far suggests
> that for message-passing, relatively small queue sizes (a few kB,
> perhaps 1 MB at the outside) should be sufficient. However,
> applications such as parallel sort could require vast amounts of
> shared memory. Consider a machine with 1TB of memory performing a
> 512GB internal sort. You're going to need 512GB of shared memory for
> that.

Hmm. Those two use cases are quite different. For message-passing, you
want a lot of small queues, but for parallel sort, you want one huge
allocation. I wonder if we shouldn't even try a one-size-fits-all
solution.

For message-passing, there isn't much need to even use dynamic shared
memory. You could just assign one fixed-sized, single-reader
multiple-writer queue for each backend.

For parallel sort, you'll want to utilize all the available memory and
all CPUs for one huge sort. So all you really need is a single huge
shared memory segment. If one process is already using that 512GB
segment to do a sort, you do *not* want to allocate a second 512GB
segment. You'll want to wait for the first operation to finish first.
Or maybe you'll want to have 3-4 somewhat smaller segments in use at
the same time, but not more than that.

>> * As discussed in the "Something fishy happening on frogmouth" thread, I
>> don't like the fact that the dynamic shared memory segments will be
>> permanently leaked if you kill -9 postmaster and destroy the data directory.
>
> Your test elicited different behavior for the dsm code vs. the main
> shared memory segment because it involved running a new postmaster
> with a different data directory but the same port number on the same
> machine, and expecting that that new - and completely unrelated -
> postmaster would clean up the resources left behind by the old,
> now-destroyed cluster. I tend to view that as a defect in your test
> case more than anything else, but as I suggested previously, we could
> potentially change the code to use something like 1000000 + (port *
> 100) with a forward search for the control segment identifier, instead
> of using a state file, mimicking the behavior of the main shared
> memory segment. I'm not sure we ever reached consensus on whether
> that was overall better than what we have now.

I really think we need to do something about it. To use your earlier
example of parallel sort, it's not acceptable to permanently leak a
512 GB segment on a system with 1 TB of RAM.

One idea is to create the shared memory object with shm_open, and wait
until all the worker processes that need it have attached to it. Then,
shm_unlink() it, before using it for anything. That way the segment
will be automatically released once all the processes close() it, or
die. In particular, kill -9 will release it. (This is a variant of my
earlier idea to create a small number of anonymous shared memory file
descriptors in postmaster startup with shm_open(), and pass them down
to child processes with fork()). I think you could use that approach
with SysV shared memory as well, by destroying the segment with
shmctl(IPC_RMID) immediately after all processes have attached to it.

- Heikki
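A minimal sketch of the create → attach → unlink lifecycle described
above, assuming POSIX shm_open()/mmap(). The segment name and the
wait_for_all_workers_attached() barrier are hypothetical placeholders
for illustration, not PostgreSQL APIs:

/*
 * Sketch: create a POSIX shared memory segment, let all workers attach,
 * then unlink it so the kernel reclaims it automatically when the last
 * process unmaps it or dies (including after kill -9).
 */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SEG_NAME "/pg_dsm_example"      /* hypothetical name */
#define SEG_SIZE (64 * 1024)            /* hypothetical size */

extern void wait_for_all_workers_attached(void);  /* hypothetical barrier */

void *
create_and_unlink_segment(void)
{
    /* Creator: make the segment and size it. */
    int fd = shm_open(SEG_NAME, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, SEG_SIZE) < 0)
    {
        close(fd);
        shm_unlink(SEG_NAME);
        return NULL;
    }

    void *addr = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping keeps the segment alive */
    if (addr == MAP_FAILED)
    {
        shm_unlink(SEG_NAME);
        return NULL;
    }

    /*
     * Workers meanwhile shm_open(SEG_NAME, O_RDWR, 0) and mmap() it the
     * same way.  Once every worker has attached...
     */
    wait_for_all_workers_attached();

    /*
     * ...remove the name before using the memory for anything.  Existing
     * mappings stay valid; the segment is freed once the last process
     * unmaps it or exits, however it exits.
     */
    shm_unlink(SEG_NAME);

    return addr;
}

The SysV variant would be analogous: shmget() the segment, have every
process shmat() it, then call shmctl(shmid, IPC_RMID, NULL); the segment
is destroyed once the last process detaches.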