Re: Postgres gets stuck

From Craig A. James
Subject Re: Postgres gets stuck
Date
Msg-id 44635DFE.1060509@modgraph-usa.com
In reply to Re: Postgres gets stuck  (Chris <dmagick@gmail.com>)
Responses Re: Postgres gets stuck  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Chris wrote:
>
>> This is a deadly bug, because our web site goes dead when this
>> happens, ...
>
> Sounds like a deadlock issue.
> ...
> stats_command_string = true
> and restart postgresql.
> then you'll be able to:
> select * from pg_stat_activity;
> to see what queries postgres is running and that might give you some clues.

Thanks, good advice.  You're absolutely right, it's stuck on a mutex.  After doing what you suggest, I discovered that
the query in progress is a user-written function (mine).  When I log in as root and use "gdb -p <pid>" to attach to the
process, here's what I find.  Notice the second function in the stack (frame #1), a mutex lock:

(gdb) bt
#0  0x0087f7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0096cbfe in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
#2  0x008ff67b in _L_mutex_lock_3220 () from /lib/tls/libc.so.6
#3  0x4f5fc1b4 in ?? ()
#4  0x00dc5e64 in std::string::_Rep::_S_empty_rep_storage () from /usr/local/pgsql/lib/libchmoogle.so
#5  0x009ffcf0 in ?? () from /usr/lib/libz.so.1
#6  0xbfe71c04 in ?? ()
#7  0xbfe71e50 in ?? ()
#8  0xbfe71b78 in ?? ()
#9  0x009f7019 in zcfree () from /usr/lib/libz.so.1
#10 0x009f7019 in zcfree () from /usr/lib/libz.so.1
#11 0x009f8b7c in inflateEnd () from /usr/lib/libz.so.1
#12 0x00c670a2 in ~basic_unzip_streambuf (this=0xbfe71be0) at zipstreamimpl.h:332
#13 0x00c60b61 in OpenBabel::OBConversion::Read (this=0x1, pOb=0xbfd923b8, pin=0xffffffea) at istream:115
#14 0x00c60fd8 in OpenBabel::OBConversion::ReadString (this=0x8672b50, pOb=0xbfd923b8) at obconversion.cpp:780
#15 0x00c19d69 in chmoogle_ichem_mol_alloc () at stl_construct.h:120
#16 0x00c1a203 in chmoogle_ichem_normalize_parent () at stl_construct.h:120
#17 0x00c1b172 in chmoogle_normalize_parent_sdf () at vector.tcc:243
#18 0x0810ae4d in ExecMakeFunctionResult ()
#19 0x0810de2e in ExecProject ()
#20 0x08115972 in ExecResult ()
#21 0x08109e01 in ExecProcNode ()
#22 0x00000020 in ?? ()
#23 0xbed4b340 in ?? ()
#24 0xbf92d9a0 in ?? ()
#25 0xbed4b0c0 in ?? ()
#26 0x00000000 in ?? ()
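
A minimal sketch for picking the lock-related frames out of a backtrace like the one above. The helper names (parse_frames, lock_frames) and the keyword list are hypothetical, chosen for illustration; the sample lines are copied from the gdb output in this message. It only matches on function names, so it says nothing about who holds the mutex.

```python
import re

# Matches gdb "bt" lines such as:
#   #1  0x0096cbfe in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
# The address and the "from <library>" part are both optional.
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+(?:0x[0-9a-f]+\s+in\s+)?(?P<func>\S+)\s*\(.*?\)"
    r"(?:\s+from\s+(?P<lib>\S+))?"
)


def parse_frames(bt_text):
    """Return a list of (frame_number, function, library_or_None) tuples."""
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((int(m.group("num")), m.group("func"), m.group("lib")))
    return frames


def lock_frames(frames, needles=("mutex", "lock")):
    """Keep frames whose function name contains one of the (assumed) keywords."""
    return [f for f in frames if any(n in f[1].lower() for n in needles)]


# A few frames copied from the backtrace in this message.
SAMPLE = """\
#0  0x0087f7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0096cbfe in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
#2  0x008ff67b in _L_mutex_lock_3220 () from /lib/tls/libc.so.6
#3  0x4f5fc1b4 in ?? ()
#11 0x009f8b7c in inflateEnd () from /usr/lib/libz.so.1
"""

for num, func, lib in lock_frames(parse_frames(SAMPLE)):
    print(f"frame #{num}: {func} ({lib})")
```

Run against the sample, this reports frames #1 and #2, both in /lib/tls/libc.so.6.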

It looks to me like my code is trying to read the input parameter (a fairly long string, maybe 2K) from a buffer that
was gzip'ed by Postgres for the trip between the client and server.  My suspicion is that it's an incompatibility
between malloc() libraries.  libz (gzip compression) is calling something called zcfree, which then appears to be
intercepted by something that's (probably statically) linked into my library.  And somewhere along the way, a mutex gets
set, and then ... it's stuck forever.
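
For illustration only, here is one general way a single thread can end up waiting on a mutex forever: it tries to take a non-reentrant lock that it already holds. This is a sketch of that mechanism using Python's threading module, not a diagnosis of the hang above; whether anything like it happens inside the allocator here is an assumption. The function name reacquire is hypothetical, and a timeout stands in for "forever" so the example terminates.

```python
import threading


def reacquire(lock, timeout=0.1):
    """Take `lock`, then try to take it again from the same thread.

    Returns True if the second acquire succeeded, False if it timed out.
    Without the timeout, a failed second acquire would block indefinitely.
    """
    lock.acquire()
    try:
        got_it = lock.acquire(timeout=timeout)
        if got_it:
            lock.release()  # undo the second acquire
        return got_it
    finally:
        lock.release()  # undo the first acquire


# A plain lock cannot be re-taken by its owner; a reentrant lock can.
plain = reacquire(threading.Lock())
reentrant = reacquire(threading.RLock())
print("plain lock re-acquired:", plain)
print("reentrant lock re-acquired:", reentrant)
```

The plain lock reports False (the second acquire only returns because of the timeout), while the reentrant lock reports True.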

ps(1) shows that this thread had been running for about 7 hours, and the job status showed that this function had been
successfully called about 1 million times before this mutex lock occurred.

Any ideas?

Thanks,
Craig
