Andres Freund <andres@anarazel.de> writes:
> On December 25, 2015 7:10:23 PM GMT+01:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Seems like what you've got here is a kernel bug.
> I wouldn't go as far as calling it a kernel bug. We're still doing 300k tps. And we're triggering the performance degradation by adding another socket (IIRC) to the poll(2) call.
Hmm. And all those FDs point to the same pipe. I wonder if we're looking at contention for some pipe-related data structure inside the kernel.
regards, tom lane
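
For readers following along, the pattern under discussion (a single poll(2) call that waits on a client socket and on the read end of a pipe) can be sketched roughly as below. This is only an illustrative sketch, not PostgreSQL's actual latch code: the names wait_pipe_or_socket and demo_shared_pipe are hypothetical, and the pipe here is created locally rather than inherited across fork().

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL function): poll a socket and the
 * read end of a pipe in one poll(2) call.
 * Returns a bitmask: bit 0 set if the socket is readable, bit 1 set if the
 * pipe is readable; returns -1 if poll() fails.
 */
int wait_pipe_or_socket(int pipe_rd, int sock, int timeout_ms)
{
    struct pollfd pfds[2];
    int ready = 0;

    pfds[0].fd = sock;
    pfds[0].events = POLLIN;
    pfds[0].revents = 0;

    /* The extra FD: every waiter that shares this pipe polls the same
     * kernel pipe object. */
    pfds[1].fd = pipe_rd;
    pfds[1].events = POLLIN;
    pfds[1].revents = 0;

    if (poll(pfds, 2, timeout_ms) < 0)
        return -1;

    if (pfds[0].revents & POLLIN)
        ready |= 1;
    if (pfds[1].revents & POLLIN)
        ready |= 2;
    return ready;
}

/*
 * Hypothetical demo: returns 0 if the pipe is reported readable only after
 * a byte has been written to it, 1 if the results differ, -1 on setup error.
 */
int demo_shared_pipe(void)
{
    int pipefd[2];
    int sv[2];
    int before, after;

    if (pipe(pipefd) != 0)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    /* Nothing written yet: a zero-timeout poll should report nothing. */
    before = wait_pipe_or_socket(pipefd[0], sv[0], 0);

    /* Wake the waiter by writing one byte into the pipe. */
    if (write(pipefd[1], "x", 1) != 1)
        before = -1;

    after = wait_pipe_or_socket(pipefd[0], sv[0], 0);

    close(pipefd[0]);
    close(pipefd[1]);
    close(sv[0]);
    close(sv[1]);

    return (before == 0 && after == 2) ? 0 : 1;
}
```

Calling demo_shared_pipe() from a small main() exercises both cases. To approximate the scenario in this thread, one would presumably have to create the pipe before forking many processes, so that they all poll the same inherited pipe; that part is an assumption and is not shown here.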
I ran bt on the backends and found them in the following state:
#0 0x00007f77b0e5bb60 in __poll_nocancel () from /lib64/libc.so.6
#1 0x00000000006a7cd0 in WaitLatchOrSocket (latch=0x7f779e2e96c4, wakeEvents=wakeEvents@entry=19, sock=9, timeout=timeout@entry=0) at pg_latch.c:333
#2 0x0000000000612c7d in secure_read (port=0x17e6af0, ptr=0xcc94a0 <PqRecvBuffer>, len=8192) at be-secure.c:147
#3 0x000000000061be36 in pq_recvbuf () at pqcomm.c:915
#4 pq_getbyte () at pqcomm.c:958
#5 0x0000000000728ad5 in SocketBackend (inBuf=0x7ffd8b6b1460) at postgres.c:345