Discussion: fork/exec for backend
I am still wondering why postmaster does a fork/exec instead of just forking when it receives a new connection.

Fork on modern Unices (Linux and, I think, *BSD) costs almost nothing (in time and memory) thanks to COW (copy-on-write). Exec is expensive as it breaks COW.

I know this is not the time (we have to wait until after 6.3), but shouldn't this be on the TODO list?

best regards,
--
---------------------------------------------
Göran Thyni, sysadm, JMS Bildbasen, Kiruna
> I am still wondering why postmaster fork/exec instead of
> just forking when receiving a new connection.
>
> Fork on modern unices (linux and (I think) *BSD) cost
> almost nothing (in time and memory) thanks to COW (copy-on-write).
> Exec is expensive as it breaks COW.
>
> I know this is not the time (have to wait 'til after 6.3),
> but shouldn't this be on the ToDo-list.

It was on my personal TODO. It is on the main one now:

* remove fork()/exec() of backend and make it just fork()

I had hoped to do this for 6.3 as it will save 0.01 seconds on startup, but no luck.

--
Bruce Momjian
maillist@candle.pha.pa.us
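To make the two startup paths in that TODO item concrete, here is a minimal Python sketch, not code from the PostgreSQL tree. The helper names `spawn_fork_only`, `spawn_fork_exec` and `time_spawns` are hypothetical, and the exec'd program is simply the running Python interpreter (`sys.executable`) standing in for the backend binary. An interpreter is far heavier to start than a C backend, so the absolute timings say nothing about the 0.01 seconds quoted above; the sketch only shows where the extra exec step sits.

```python
import os
import sys
import time


def spawn_fork_only(work):
    """Fork a child that runs work() inside the inherited process image."""
    pid = os.fork()
    if pid == 0:
        # Child: same program, parent's pages shared copy-on-write.
        try:
            work()
        finally:
            os._exit(0)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


def spawn_fork_exec(argv):
    """Fork a child that replaces its image with the program in argv."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execv(argv[0], argv)  # does not return on success
        finally:
            os._exit(127)  # reached only if the exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


def time_spawns(spawn, arg, rounds=10):
    """Average wall-clock seconds per spawn-and-wait cycle."""
    start = time.perf_counter()
    for _ in range(rounds):
        spawn(arg)
    return (time.perf_counter() - start) / rounds


if __name__ == "__main__":
    print("fork only :", time_spawns(spawn_fork_only, lambda: None))
    print("fork+exec :",
          time_spawns(spawn_fork_exec, [sys.executable, "-c", "pass"]))
```

Both helpers wait for the child, so each timing covers one complete spawn-and-exit cycle.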
On 24 Jan 1998, Goran Thyni wrote:

> Fork on modern unices (linux and (I think) *BSD) cost
> almost nothing (in time and memory) thanks to COW (copy-on-write).
> Exec is expensive as it breaks COW.

Not so. Modern Unixes will share executable address space between processes. So if you fork and exec 10 identical programs, they will share most of their address space.

If you want to speed this up, link postgresql statically. This makes exec() cost almost nothing too. postgresql becomes its own best shared library.

Again, this only applies to "modern" systems, but FreeBSD definitely has this behaviour.

Tom
> On 24 Jan 1998, Goran Thyni wrote:
>
> > Fork on modern unices (linux and (I think) *BSD) cost
> > almost nothing (in time and memory) thanks to COW (copy-on-write).
> > Exec is expensive as it breaks COW.
>
> Not so. Modern Unixes will share executable address space between
> processes. So if you fork and exec 10 identical programs, they will share
> most address space.

1. Code is probably not shared between postmaster and postgres processes.
2. Some inits may be done once (by postmaster) and not repeated by every child.
3. (and most important) With no exec, COW is in action, meaning: data pages are shared until changed.

COW is the key to how Linux can fork faster than most Unices start a new thread. :-)

> Again, this only applies to "modern" systems, but FreeBSD definitely has
> this behaviour.

I don't know if *BSD has COW, but I should think so.

best regards,
--
---------------------------------------------
Göran Thyni, sysadm, JMS Bildbasen, Kiruna
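Points 2 and 3 above can be illustrated with a small Python sketch. It is only a model of the semantics: `config`, `init_once` and `child_view_after_fork` are hypothetical names, with `config` standing in for whatever state a parent would set up once before accepting connections. CPython's reference counting also writes to many pages, so this shows what a forked child can see and change, not how many pages actually stay shared.

```python
import os

# Stand-in for state the parent initialises once, before any fork.
config = {"loaded": False, "hits": 0}


def init_once():
    """One-time initialisation done by the parent only."""
    config["loaded"] = True


def child_view_after_fork():
    """Fork, let the child report what it inherited, then check the parent.

    Returns (child_saw_init, parent_hits_after_child_wrote).
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(r)
        seen = config["loaded"]   # inherited, no re-initialisation needed
        config["hits"] += 1       # changes only the child's private copy
        os.write(w, b"1" if seen else b"0")
        os._exit(0)
    os.close(w)
    data = os.read(r, 1)
    os.close(r)
    os.waitpid(pid, 0)
    return data == b"1", config["hits"]
```

Calling `init_once()` and then `child_view_after_fork()` in the parent shows the child seeing the initialised state while the parent's `hits` counter stays untouched by the child's write.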
On 24 Jan 1998, Goran Thyni wrote:

> On 24 Jan 1998, Goran Thyni wrote:
>
> > Fork on modern unices (linux and (I think) *BSD) cost
> > almost nothing (in time and memory) thanks to COW (copy-on-write).
> > Exec is expensive as it breaks COW.
>
> Not so. Modern Unixes will share executable address space between
> processes. So if you fork and exec 10 identical programs, they will share
> most address space.
>
> 1. Code is probably not shared between postmaster and postgres
> processes.

A backend is execed for every connection. All the backends will share code space.

> 2. Some inits may be done once (by postmaster) and not repeated
> by every child.

Not relevant. I'm only concerned with the children.

> 3. (and most important)
> With no exec COW is in action, meaning:
> data pages are shared until changed.
>
> COW is the key to how Linux can fork faster than most unices
> start a new thread. :-)

COW is old news. Perhaps you can find some old SCO systems that don't do COW :)

> Again, this only applies to "modern" systems, but FreeBSD definitely has
> this behaviour.
>
> I don't know if *BSD has COW, but I should think so.

I'm not speaking just about COW, but about being able to share code between separately execed processes.

Tom
> On 24 Jan 1998, Goran Thyni wrote:
>
> > Fork on modern unices (linux and (I think) *BSD) cost
> > almost nothing (in time and memory) thanks to COW (copy-on-write).
> > Exec is expensive as it breaks COW.
>
> Not so. Modern Unixes will share executable address space between
> processes. So if you fork and exec 10 identical programs, they will share
> most address space.
>
> 1. Code is probably not shared between postmaster and postgres
> processes.

I think it is shared. postmaster is a symlink to postgres, so by the time it gets to the kernel exec routines, both processes are mapped to the same inode number.

> 2. Some inits may be done once (by postmaster) and not repeated
> by every child.

Maybe.

> 3. (and most important)
> With no exec COW is in action, meaning:
> data pages are shared until changed.

This would also prevent us from attaching to shared memory because it would already be in the address space.

> COW is the key to how Linux can fork faster than most unices
> start a new thread. :-)
>
> Again, this only applies to "modern" systems, but FreeBSD definitely has
> this behaviour.
>
> I don't know if *BSD has COW, but I should think so.

All modern Unixes have COW.

--
Bruce Momjian
maillist@candle.pha.pa.us
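The symlink argument can be checked from user space. The following Python sketch builds a throwaway directory with a dummy file named `postgres` and a symlink `postmaster` pointing at it (names chosen to mirror the message; this is not a real installation, and `same_file_via_symlink` is a hypothetical helper), then compares device and inode numbers. It only shows that both names resolve to one file; whether a given kernel shares text pages between the processes is a separate matter that the sketch does not test.

```python
import os
import tempfile


def same_file_via_symlink():
    """Return True if a symlink and its target resolve to one inode."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "postgres")
        link = os.path.join(d, "postmaster")
        with open(target, "wb") as f:
            f.write(b"\0")              # dummy stand-in for the binary
        os.symlink("postgres", link)    # relative link, as in an install dir
        a = os.stat(target)
        b = os.stat(link)               # os.stat follows symlinks
        return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

`os.path.samefile()` performs the same device-and-inode comparison and could be used instead.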
> On 24 Jan 1998, Goran Thyni wrote:
>
> > Fork on modern unices (linux and (I think) *BSD) cost
> > almost nothing (in time and memory) thanks to COW (copy-on-write).
> > Exec is expensive as it breaks COW.
>
> Not so. Modern Unixes will share executable address space between
> processes. So if you fork and exec 10 identical programs, they will share
> most address space.
>
> If you want to speed this up, link postgresql static. This makes exec()
> cost almost nothing too. postgresql becomes its own best shared library.
>
> Again, this only applies to "modern" systems, but FreeBSD definitely has
> this behaviour.

This is very OS-specific. SunOS-style shared libraries do have a noticeable overhead for each function call. In fact, even though these are part of the BSD44 source, BSDI does not use them. Because of the SunOS shared library overhead, it uses a cruder shared library jump table, similar to SVr3 shared libraries.

I think FreeBSD and Linux use SunOS-style shared libraries, often called dynamic shared libraries because you can change the function while the binary is running if you are really careful.

--
Bruce Momjian
maillist@candle.pha.pa.us
On Sat, 24 Jan 1998, Bruce Momjian wrote:

> > On 24 Jan 1998, Goran Thyni wrote:
> >
> > > Fork on modern unices (linux and (I think) *BSD) cost
> > > almost nothing (in time and memory) thanks to COW (copy-on-write).
> > > Exec is expensive as it breaks COW.
> >
> > Not so. Modern Unixes will share executable address space between
> > processes. So if you fork and exec 10 identical programs, they will share
> > most address space.
> >
> > If you want to speed this up, link postgresql static. This makes exec()
> > cost almost nothing too. postgresql becomes its own best shared library.
> >
> > Again, this only applies to "modern" systems, but FreeBSD definitely has
> > this behaviour.
>
> This is very OS-specific. SunOS-style shared libraries do have a
> noticeable overhead for each function call. In fact, even though these
> are part of BSD44 source, BSDI does not use them, and uses a more crude
> shared library jump table, similar to SVr3 shared libraries because of
> the SunOS shared library overhead.

Regardless of the method used, dynamic executables need to undergo a link step during exec(). Linking statically reduces that.

> I think FreeBSD and Linux use SunOS style shared libraries, often called
> dynamic shared libraries because you can change the function while the
> binary is running if you are really careful.

Linux uses ELF shared libraries. I don't know how those work. I don't find FreeBSD to have a high call overhead for dynamic libs at all. Static executables just start faster, that's all.

Tom
> Regardless of the method used, the dynamic executables need to undergo a
> link step during exec(). Linking static reduces that.

True, but the SVr3/BSDI shared libraries set up the jump table in the binary at binary link time. The binary still has to map in the shared library at exec time, but that is a single mapping per shared library, not a mapping per function or per function call.

--
Bruce Momjian
maillist@candle.pha.pa.us
> I'm not speaking just about COW, but about being able to share code
> between separately execed processes.

This we already have (as long as the OS is sane). I was thinking about reducing child startup time by not breaking COW.

It does not seem to be a big issue in normal usage, but there are several situations where one will do a connect/simple query/disconnect sequence, for instance in simple CGI queries, where it could be a noticeable speedup.

best regards,
--
---------------------------------------------
Göran Thyni, sysadm, JMS Bildbasen, Kiruna
> This would also prevent us from attaching to shared memory because it
> would already be in the address space.

With no exec we could use mmap instead of shm*. We would have to time them first to see which one is faster. I think the mmap API is cleaner.

regards,
--
---------------------------------------------
Göran Thyni, sysadm, JMS Bildbasen, Kiruna
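A minimal Python sketch of the mmap idea, under the assumption that the platform supports anonymous shared mappings: the parent creates the mapping before forking, and because there is no exec, the child inherits it and needs no attach step. `shared_counter_demo` is a hypothetical name, and nothing here measures mmap against the shm* calls, which is the comparison the message says still has to be made.

```python
import mmap
import os
import struct


def shared_counter_demo():
    """Write an int in a forked child, read it back in the parent."""
    # fileno -1 requests an anonymous mapping; on Unix the default
    # flags are MAP_SHARED, so the pages stay shared across fork().
    buf = mmap.mmap(-1, mmap.PAGESIZE)
    struct.pack_into("i", buf, 0, 0)
    pid = os.fork()
    if pid == 0:
        # Child: the mapping is already in its address space.
        struct.pack_into("i", buf, 0, 42)
        os._exit(0)
    os.waitpid(pid, 0)
    (value,) = struct.unpack_from("i", buf, 0)
    buf.close()
    return value
```

Unlike ordinary process memory, which becomes private to the writer under copy-on-write, the child's write to the shared mapping is visible to the parent after the child exits.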