Interesting. Here is the patch I just applied:
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/postmaster/pgstat.c.diff?r1=1.116&r2=1.117
My only guess is that select() modifies the timeout structure on
return, though I didn't think it did.
Googling shows that Linux does modify the structure (see the bottom of this thread):
http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/a53c7c4a71cb48e5/5f0bbcc9fe0230a2?lnk=st&q=select+timeout+modify&rnum=9#5f0bbcc9fe0230a2
so I will fix the code accordingly by resetting the timeout before each
select() call. Patch attached and applied.
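
For anyone following along, here is a minimal standalone sketch of the
pattern, not the pgstat.c code itself. The helper name wait_readable() and
its millisecond argument are made up for illustration. The point is that the
struct timeval is filled in inside the loop, immediately before select(). On
Linux, select() overwrites the structure with the time left, so a timeout
that was set only once ends up as zero after the first expiry and every
later select() returns at once, which would explain the 100% CPU.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): wait until fd is readable
 * or timeout_ms milliseconds have passed.  Returns select()'s result:
 * > 0 if readable, 0 on timeout, < 0 on error.
 */
static int
wait_readable(int fd, long timeout_ms)
{
    for (;;)
    {
        fd_set          rfds;
        struct timeval  timeout;
        int             rc;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        /*
         * Set the timeout on every pass.  Some systems (Linux among them)
         * modify it during select(), so a value set once before the loop
         * cannot be trusted on later iterations.
         */
        timeout.tv_sec = timeout_ms / 1000;
        timeout.tv_usec = (timeout_ms % 1000) * 1000;

        rc = select(fd + 1, &rfds, NULL, NULL, &timeout);
        if (rc < 0 && errno == EINTR)
            continue;           /* interrupted by a signal, just retry */
        return rc;
    }
}
```

Calling wait_readable() on the read end of an empty pipe should return 0
once the timeout has elapsed, and a positive value once a byte has been
written to the other end.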
---------------------------------------------------------------------------
Joe Conway wrote:
> I just noticed that the stats buffer process is consuming 100% cpu as
> soon as a backend is started, and continues after that backend is ended:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 15150 postgres 25 0 27004 948 508 S 99.9 0.0 0:30.97 postmaster
>
>
> # ps -ef |grep 15150
> postgres 15150 15143 78 11:29 pts/3 00:00:38 postgres: stats buffer process
> postgres 15151 15150  0 11:29 pts/3 00:00:00 postgres: stats collector process
>
>
> (gdb) bt
> #0 0x000000383b8c2633 in __select_nocancel () from /lib64/libc.so.6
> #1 0x000000000055e896 in PgstatBufferMain (argc=Variable "argc" is not available.
> ) at pgstat.c:1921
> #2 0x000000000055f73b in pgstat_start () at pgstat.c:614
> #3 0x0000000000562fda in reaper (postgres_signal_arg=Variable "postgres_signal_arg" is not available.
> ) at postmaster.c:2175
> #4 <signal handler called>
> #5 0x000000383b8c2633 in __select_nocancel () from /lib64/libc.so.6
> #6 0x0000000000560d0f in ServerLoop () at postmaster.c:1180
> #7 0x0000000000562443 in PostmasterMain (argc=7, argv=0x88df20) at postmaster.c:943
> #8 0x00000000005217fe in main (argc=7, argv=0x88df20) at main.c:263
>
> I noticed a recent discussion on the stats collector -- is this related
> to a recent change?
>
> Joe
>
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.117
diff -c -c -r1.117 pgstat.c
*** src/backend/postmaster/pgstat.c 3 Jan 2006 16:42:17 -0000 1.117
--- src/backend/postmaster/pgstat.c 3 Jan 2006 19:52:14 -0000
***************
*** 1871,1884 ****
      msgbuffer = (char *) palloc(PGSTAT_RECVBUFFERSZ);

      /*
-      * Wait for some work to do; but not for more than 10 seconds. (This
-      * determines how quickly we will shut down after an ungraceful
-      * postmaster termination; so it needn't be very fast.)
-      */
-     timeout.tv_sec = 10;
-     timeout.tv_usec = 0;
-
-     /*
       * Loop forever
       */
      for (;;)
--- 1871,1876 ----
***************
*** 1918,1923 ****
--- 1910,1924 ----
                  maxfd = writePipe;
          }

+         /*
+          * Wait for some work to do; but not for more than 10 seconds. (This
+          * determines how quickly we will shut down after an ungraceful
+          * postmaster termination; so it needn't be very fast.) struct timeout
+          * is modified by some operating systems.
+          */
+         timeout.tv_sec = 10;
+         timeout.tv_usec = 0;
+
          if (select(maxfd + 1, &rfds, &wfds, NULL, &timeout) < 0)
          {
              if (errno == EINTR)