Postgres 7.4.7 hang in async_notify

Поиск
Список
Период
Сортировка
От pgsql-bugs@counterstorm.com
Тема Postgres 7.4.7 hang in async_notify
Дата
Msg-id 200506021747.j52Hl2aj027031@startide.sysdetect.com
обсуждение исходный текст
Ответы Re: Postgres 7.4.7 hang in async_notify  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Postgres 7.4.7 hang in async_notify  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
We saw the problem with async_notify again (See thread with subject
"Postgres 7.4.6 hang in async_notify" in first message to this list
dated "Mon, 25 Apr 2005 15:42:35 -0400") in a production setting.
Since our last instance, we converted to compiling postgres with
debugging, so we have a stack trace.  Looking at it, the problem
appears at first blush like it might be pretty obvious: an ill-timed
signal which arrives during a malloc while malloc has some
data-structure locked, and one of the extensive operations that
Async_NotifyHandler did probably involved getting the same lock.  That
doesn't explain everything, since the stack trace looks like it was
munged a little bit and it is not completely obvious to me how we got
from tag_hash to posix_memalign.

For the record, while this postgres should be (of two) generating
notifies out of triggers, we do not believe it should be listening for
any, and indeed examination of pg_listener suggests it does not.  If
it matters, the notify should probably have been on the "wwww_aaaaa"
relname.

This was on a 64bit Opteron multiprocessor running Fedora Core 3.  As
you see from the stack trace, we have SSL enabled for the connection
from the postgres client.

We have a coredump, binary, and associated build tree if anyone wants
it.

                                        -Seth Robertson
----------------------------------------------------------------------
sysd=> select * from pg_locks;
 relation | database | transaction |  pid  |      mode       | granted
----------+----------+-------------+-------+-----------------+---------
          |          |     1452163 |  5043 | ExclusiveLock   | t
          |          |     1408041 | 29980 | ExclusiveLock   | t
    16759 |    17212 |             |  5043 | AccessShareLock | t
----------------------------------------------------------------------

5043 is the above psql.  29980 is the postgres stuck in async_notify:

----------------------------------------------------------------------
sysd=> select * from pg_listener ;
           relname            | listenerpid | notification
------------------------------+-------------+--------------
 kludge_28041                 |       28055 |            0
 antura_sysd_jobs             |       25754 |            0
 antura_sysd_files            |       25754 |            0
 wwww_ssssssss_eeeeeeee       |       30219 |            0
 sensor_rrrrrrrr_policy       |       30219 |            0
 kludge_30217                 |       30219 |            0
 antura_sysd_job_cron_actions |       25754 |            0
 wwww_aaaaa                   |       30219 |            0
 Pending                      |       28055 |            0
(9 rows)
----------------------------------------------------------------------

----------------------------------------------------------------------
29980 ?        S    158:06 postgres: sysd sysd 127.0.0.1 async_notify
----------------------------------------------------------------------

----------------------------------------------------------------------
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/tls/libthread_db.so.1".

Attaching to program: /usr/counterstorm/postgres/bin/postgres, process 29980
Reading symbols from /lib64/libssl.so.4...done.
Loaded symbols for /lib64/libssl.so.4
Reading symbols from /lib64/libcrypto.so.4...done.
Loaded symbols for /lib64/libcrypto.so.4
Reading symbols from /usr/lib64/libz.so.1...done.
Loaded symbols for /usr/lib64/libz.so.1
Reading symbols from /usr/lib64/libreadline.so.4...done.
Loaded symbols for /usr/lib64/libreadline.so.4
Reading symbols from /lib64/libtermcap.so.2...done.
Loaded symbols for /lib64/libtermcap.so.2
Reading symbols from /lib64/libcrypt.so.1...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libresolv.so.2...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libnsl.so.1...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/tls/libm.so.6...done.
Loaded symbols for /lib64/tls/libm.so.6
Reading symbols from /lib64/tls/libc.so.6...done.
Loaded symbols for /lib64/tls/libc.so.6
Reading symbols from /usr/lib64/libgssapi_krb5.so.2...done.
Loaded symbols for /usr/lib64/libgssapi_krb5.so.2
Reading symbols from /usr/lib64/libkrb5.so.3...done.
Loaded symbols for /usr/lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /usr/lib64/libk5crypto.so.3...done.
Loaded symbols for /usr/lib64/libk5crypto.so.3
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /usr/counterstorm/postgres/lib/plpgsql.so...done.
Loaded symbols for /usr/counterstorm/postgres/lib/plpgsql.so
Reading symbols from /usr/counterstorm/postgres/lib/array_sz.so...done.
Loaded symbols for /usr/counterstorm/postgres/lib/array_sz.so
Reading symbols from /usr/counterstorm/postgres/lib/hostinfo.so...done.
Loaded symbols for /usr/counterstorm/postgres/lib/hostinfo.so
Reading symbols from /usr/counterstorm/postgres/lib/pending.so...done.
Loaded symbols for /usr/counterstorm/postgres/lib/pending.so
0x00000036dacd09ab in __lll_mutex_lock_wait () from /lib64/tls/libc.so.6
(gdb) where
#0  0x00000036dacd09ab in __lll_mutex_lock_wait () from /lib64/tls/libc.so.6
#1  0x00000000185d6840 in ?? ()
#2  0x000000000000000d in ?? ()
#3  0x00000036dac6d181 in posix_memalign () from /lib64/tls/libc.so.6
#4  0x0000000000000001 in ?? ()
#5  0x00000000006cd449 in tag_hash (key=0x36dae2e620, keysize=0) at hashfn.c:40
#6  0x00000000006cd076 in hash_search (hashp=0x2aa221da00, keyPtr=0x7fbfffc6a0, action=20, foundPtr=0x6ccf5c
"\205Àu×ë'A#t$\034ë\230fff\220ff\220H\213-ñ6\"")at dynahash.c:647 
#7  0x00000000005ce55e in mdclose (reln=0x1e0) at md.c:384
#8  0x00000000005d0eec in smgrclose (which=-6624, reln=0x2aa221da00) at smgr.c:292
#9  0x00000000006b9dcc in RelationIdInvalidateRelationCacheByRelationId (relationId=2002738) at relcache.c:1748
#10 0x00000000006b5628 in LocalExecuteInvalidationMessage (msg=0x7fbfffc6e0) at inval.c:440
#11 0x00000000005c0388 in ReceiveSharedInvalidMessages (invalFunction=0x6b5550 <LocalExecuteInvalidationMessage>,
resetFunction=0x6b5670<InvalidateSystemCaches>) at sinval.c:125 
#12 0x00000000004684d6 in StartTransactionCommand () at xact.c:446
#13 0x00000000004d3a84 in Async_NotifyHandler (postgres_signal_arg=-622664160) at async.c:814
#14 <signal handler called>
#15 0x00000036dac69094 in _int_malloc () from /lib64/tls/libc.so.6
#16 0x00000036dac6ab12 in malloc () from /lib64/tls/libc.so.6
#17 0x00000036dcb4d05b in CRYPTO_malloc () from /lib64/libcrypto.so.4
#18 0x00000036dcb99dc1 in EVP_MD_CTX_copy_ex () from /lib64/libcrypto.so.4
#19 0x00000036dcb585cc in HMAC_Final () from /lib64/libcrypto.so.4
#20 0x00000036db91fe10 in tls1_mac () from /lib64/libssl.so.4
#21 0x00000036db91bcbf in ssl3_read_bytes () from /lib64/libssl.so.4
#22 0x00000036db919a9f in ssl3_renegotiate_check () from /lib64/libssl.so.4
#23 0x000000000053492c in secure_read (port=0x929e80, ptr=0x8a7740, len=8192) at be-secure.c:261
#24 0x000000000053ba47 in pq_getbyte () at pqcomm.c:685
#25 0x00000000005d759d in PostgresMain (argc=9992288, argv=0x52, username=0x7fbfffd660 "") at postgres.c:275
#26 0x00000000005911e4 in PostmasterMain (argc=9470864, argv=0x924a80) at postmaster.c:2561
#27 0x000000000053e0ef in main (argc=4, argv=0x7fbffff6b8) at main.c:214
----------------------------------------------------------------------

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: More graceful handling of invalid log_*_stats conf in 7.4
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Postgres 7.4.7 hang in async_notify