Re: BUG #16007: Regarding patch for BUG #3995: pqSocketCheck doesn't return

From: Kiran Khatke
Subject: Re: BUG #16007: Regarding patch for BUG #3995: pqSocketCheck doesn't return
Msg-id: CAGKgC8Vv3MyFmyOFTC68_H-JTHcBfqCpSD4GhJrG9i9BOSXd7A@mail.gmail.com
In reply to: Re: BUG #16007: Regarding patch for BUG #3995: pqSocketCheck doesn't return  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-bugs

Hello Tom, 

Thanks for the support.

Below are the backtraces of the two threads that use libpq; both are stuck in poll().

We had not enabled server logging earlier, so we are not sure what was happening on the server side.

The issue is rarely reproducible, so we have not yet been able to check it with server logging enabled.

Thread 1:

#0  0x2ea7f184 in *__GI___poll (fds=<value optimized out>, nfds=1, timeout=<value optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87

#1  0x2b600238 in pqSocketCheck (conn=0x110999b8, forRead=1, forWrite=0, end_time=-1) at fe-misc.c:1043

#2  0x2b600404 in pqWaitTimed (forRead=<value optimized out>, forWrite=1, conn=0x110999b8, finish_time=0) at fe-misc.c:917

#3  0x2b5ff884 in PQgetResult (conn=0x110999b8) at fe-exec.c:1223

#4  0x100c2fa4 in dbConnObj::execStatement_nowait (this=0x110910e8,

    sqlStatement=0x313aae84 "INSERT INTO event (event_id,severity,flags,timestamp,managed_obj_id,managed_obj,groups,params) VALUES (184,5,0,'2019-06-26T10:26:38.133353-07:00',6,'ServicesNode.1025','TRClient','Name=\"031663-SCSN-FO"...) at src/dbmgr/dbConnObj.c:169

#5  0x10099c80 in dbConnectionMgr::insertSQL (this=0x11090640, objID=DBO_EVENT, type=DB_LOGGING,

    serialObj=0x1156273c "9,11,1171457,1,0,13,1171458,3,184,11,1171459,1,5,11,1171460,1,0,43,1171461,32,2019-06-26T10:26:38.133353-07:00,11,1171462,1,6,28,1171463,17,ServicesNode.1025,18,1171464,8,TRClient,84,1171465,73,Name=\""..., retSeqErr=true) at src/dbmgr/dbConnectionMgr.c:1489

 

Thread 2: (main processing thread)

#5  0x1006f4b8 in _pga_stop_db () at src/dbmgr/pg_admin.c:7643

#6  0x1006f618 in pga_stop () at src/dbmgr/pg_admin.c:168

#7  0x10f0c330 in _dbm_sigabrt (signo=6, si=0x7f766d58, context=0x7f766dd8) at src/dbmgr/dbm_main.c:1567

#8  <signal handler called>

#9  0x2ea7f184 in *__GI___poll (fds=<value optimized out>, nfds=1, timeout=<value optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87

#10 0x2b600238 in pqSocketCheck (conn=0x11088158, forRead=1, forWrite=0, end_time=-1) at fe-misc.c:1043

#11 0x2b600404 in pqWaitTimed (forRead=<value optimized out>, forWrite=4, conn=0x11088158, finish_time=1) at fe-misc.c:917

#12 0x2b5ff884 in PQgetResult (conn=0x11088158) at fe-exec.c:1223

#13 0x2b5ffb48 in PQexecFinish (conn=0x11088158) at fe-exec.c:1452

#14 0x100c2930 in dbConnObj::execStatement (this=0x11091048, sqlStatement=0x3100bec4 "UPDATE MGMT_SERVER SET LAST_SUCCESSFUL_CONNECTION='1561569998' ", checkAlreadyExists=false, freeResult=true,

    retSeqErr=false) at src/dbmgr/dbConnObj.c:243

#15 0x10099498 in dbConnectionMgr::updateSQL (this=0x11090640, objID=DBO_MGMT_SERVER_TIME, type=DB_CONFIGURATION, serialObj=0x1156249c "1,19,16385,10,1561569998", consObj=@0x7f7673f8, cons=0x2e8957e4 "",

    consSerial=0x2e8957e4 "") at src/dbmgr/dbConnectionMgr.c:1647
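
Both traces show pqSocketCheck() being entered with end_time=-1, which appears to mean that poll() is waiting with no deadline. As a minimal sketch of the alternative, a bounded wait, the helper below polls a descriptor with a finite timeout. The names wait_readable and demo_wait are hypothetical and are not part of libpq; the demonstration uses a local socket pair rather than a database connection. An application could, as an assumption, apply the same idea to the descriptor returned by PQsocket() when it uses libpq's asynchronous calls (PQsendQuery, PQconsumeInput, PQisBusy) instead of the blocking PQexec path.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a libpq function): wait until fd has a pending
 * event (data, hangup, or error) or until timeout_ms elapses.
 * Returns 1 if an event is pending, 0 on timeout, -1 on error.
 * A negative timeout_ms blocks with no deadline.
 */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the call; note that each retry
     * restarts the full timeout in this simple sketch. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/*
 * Demonstration on a local socket pair: with no data pending the wait
 * times out; after one byte is written the descriptor is readable.
 * Returns 0 on success, -1 if the socket pair could not be set up.
 */
int demo_wait(int *before, int *after)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    *before = wait_readable(sv[0], 50);   /* nothing written yet */

    if (write(sv[1], "x", 1) != 1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    *after = wait_readable(sv[0], 50);    /* one byte is now pending */

    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

With a finite timeout the caller regains control when the server never answers and can then decide whether to cancel, reconnect, or log the stall.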

Regards,
Kiran 

On Mon, Sep 16, 2019 at 6:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Kiran Khatke <kirankhatke23may@gmail.com> writes:
> One of the threads of the DBMGR daemon is waiting for the result of the
> poll() function.
> poll() was called by pqSocketCheck(). So pqSocketCheck() didn't return;
> it hung in poll().
> Below is the backtrace.

Well, it's waiting for the query to finish, or so it thinks.  Did you
look at what the server thinks the session is doing?

Your reference to multiple threads is a red flag to me.  Very often
we see people whose programs try to use the same PGconn object from
multiple threads.  That doesn't work --- and libpq does not have any
internal mutexes that would prevent the object's state from getting
messed up by concurrent operations.  So a plausible theory is that
this PGconn was used concurrently, and now this particular thread
is stuck because the object's state is corrupt (ie, it shows the
query as busy but the server doesn't think so).
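
The rule described above, that only one thread at a time may operate on a given connection object, can be sketched as follows. This is an illustration under stated assumptions, not libpq code: struct guarded_conn is a hypothetical stand-in for an application wrapper around a PGconn, and the counters stand in for the connection's internal state. In a real program the critical section would cover sending the query and reading all of its results. Giving each thread its own connection avoids the lock entirely.

```c
#include <pthread.h>

#define MAX_WORKERS 16
#define CALLS_PER_WORKER 10000

/* Hypothetical wrapper: a mutex plus stand-in state for a handle that
 * has no internal locking of its own. */
struct guarded_conn {
    pthread_mutex_t lock;
    long in_flight;   /* stand-in for "query is busy" state */
    long completed;   /* number of finished operations */
};

void guarded_conn_init(struct guarded_conn *gc)
{
    pthread_mutex_init(&gc->lock, NULL);
    gc->in_flight = 0;
    gc->completed = 0;
}

/* One whole operation runs under the lock, so no other thread can
 * observe or modify the handle while it is in progress. */
void guarded_exec(struct guarded_conn *gc)
{
    pthread_mutex_lock(&gc->lock);
    gc->in_flight++;      /* a real program would send the query here */
    gc->completed++;      /* ... and read every result here */
    gc->in_flight--;
    pthread_mutex_unlock(&gc->lock);
}

static void *worker(void *arg)
{
    struct guarded_conn *gc = arg;
    int i;

    for (i = 0; i < CALLS_PER_WORKER; i++)
        guarded_exec(gc);
    return NULL;
}

/* Start nthreads workers sharing one guarded handle and return the number
 * of completed operations, or -1 if the threads could not all be started. */
long run_workers(int nthreads)
{
    struct guarded_conn gc;
    pthread_t tids[MAX_WORKERS];
    int started;
    int i;
    long completed;

    if (nthreads < 1 || nthreads > MAX_WORKERS)
        return -1;

    guarded_conn_init(&gc);

    for (started = 0; started < nthreads; started++) {
        if (pthread_create(&tids[started], NULL, worker, &gc) != 0)
            break;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    completed = gc.completed;
    pthread_mutex_destroy(&gc.lock);
    return started == nthreads ? completed : -1;
}
```

Without the lock, the increments from different threads could interleave and leave the counters inconsistent, which is the same kind of corruption suspected for the shared connection.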

It might be worth enabling log_statement = all on the server side
and then watching the server log to see what seems to be happening
from that end.

                        regards, tom lane
