Re: Strange hanging bug in a simple milter

From Vesa-Matti J Kari
Subject Re: Strange hanging bug in a simple milter
Date
Msg-id alpine.LRH.2.00.1309091517170.7114@ruuvi.it.helsinki.fi
In response to Re: Strange hanging bug in a simple milter  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Strange hanging bug in a simple milter  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Hello,

On Mon, 9 Sep 2013, Heikki Linnakangas wrote:

> I managed to set that up and got it running.

Many thanks for taking the time.

> But it works fine for me, does not hang.

Okay. Have you tried increasing the iterations for the smtp sender
scripts? And could you please specify what your test environment is like
(i.e. OS and the related library versions)?

> I'd suggest poking around with gdb, to see where it hangs.

I have actually done that, but it only shows the main listener thread from
the libmilter library:

(gdb) bt
#0  0x00007fe64bdd0313 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fe64c4f7b46 in mi_listener () from /usr/lib/libmilter.so.1.0.1
#2  0x00007fe64c4f8707 in smfi_main () from /usr/lib/libmilter.so.1.0.1
#3  0x0000000000402c8f in main (argc=15, argv=0x7fffa6560e68) at authmilter.c:699

Hmmm. The man page mentioned no threads, but Google was helpful and
suggested "info threads" so here goes:

(I hope alpine will not wrap these long lines)

(gdb) info threads
  Id   Target Id         Frame
  9    Thread 0x7fe64700c700 (LWP 14362) "authmilter" 0x00007fe64c0b69f7 in do_sigwait () from /lib/x86_64-linux-gnu/libpthread.so.0
  8    Thread 0x7fe64680b700 (LWP 14363) "authmilter" 0x00007fe64bdd0313 in poll () from /lib/x86_64-linux-gnu/libc.so.6
  7    Thread 0x7fe645809700 (LWP 14365) "authmilter" 0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  6    Thread 0x7fe645008700 (LWP 22404) "authmilter" 0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  5    Thread 0x7fe64600a700 (LWP 27263) "authmilter" 0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  4    Thread 0x7fe644807700 (LWP 27264) "authmilter" 0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  3    Thread 0x7fe62ffff700 (LWP 27283) "authmilter" 0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  2    Thread 0x7fe62f7fe700 (LWP 27284) "authmilter" 0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
* 1    Thread 0x7fe64c8fd740 (LWP 14361) "authmilter" 0x00007fe64bdd0313 in poll () from /lib/x86_64-linux-gnu/libc.so.6

It looks like a deadlock situation of some kind...

(gdb) thread 2
[Switching to thread 2 (Thread 0x7fe62f7fe700 (LWP 27284))]
#0  0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0  0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007fe64c0b1065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007fe64c0b0eba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007fe64c2df200 in ?? () from /usr/lib/libpq.so.5
#4  0x00007fe64b78a5f5 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#5  0x00007fe64b77a915 in RSA_new_method () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#6  0x00007fe64b77d64d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#7  0x00007fe64b7b9bf2 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#8  0x00007fe64b7bc6d1 in ASN1_item_ex_d2i () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#9  0x00007fe64b7bd0c4 in ASN1_item_d2i () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#10 0x00007fe64b77ea2f in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#11 0x00007fe64b7b461a in X509_PUBKEY_get () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#12 0x00007fe64b7d119a in X509_get_pubkey_parameters () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#13 0x00007fe64b7d1398 in X509_verify_cert () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#14 0x00007fe64bac52f8 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#15 0x00007fe64baa2ef3 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#16 0x00007fe64baa7222 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#17 0x00007fe64c2dffcb in ?? () from /usr/lib/libpq.so.5
#18 0x00007fe64c2d0c5e in PQconnectPoll () from /usr/lib/libpq.so.5
#19 0x00007fe64c2d1e3e in ?? () from /usr/lib/libpq.so.5
#20 0x00007fe64c2d26f8 in PQsetdbLogin () from /usr/lib/libpq.so.5
#21 0x0000000000401ba5 in authmilt_connect (ctx=0xe81b60, hostname=0x7fe6200008c0 "localhost", hostaddr=0x7fe62f7fdce0) at authmilter.c:212
#22 0x00007fe64c4f69dc in ?? () from /usr/lib/libmilter.so.1.0.1
#23 0x00007fe64c4f5f5f in mi_engine () from /usr/lib/libmilter.so.1.0.1
#24 0x00007fe64c4fada6 in ?? () from /usr/lib/libmilter.so.1.0.1
#25 0x00007fe64c0aee9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#26 0x00007fe64bddbccd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#27 0x0000000000000000 in ?? ()

(gdb) thread 3
[Switching to thread 3 (Thread 0x7fe62ffff700 (LWP 27283))]
#0  0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0  0x00007fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007fe64c0b1065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007fe64c0b0eba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007fe64c2df200 in ?? () from /usr/lib/libpq.so.5
#4  0x00007fe64b78a5f5 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#5  0x00007fe64b77a915 in RSA_new_method () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#6  0x00007fe64b77d64d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#7  0x00007fe64b7b9bf2 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#8  0x00007fe64b7bc6d1 in ASN1_item_ex_d2i () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#9  0x00007fe64b7bd0c4 in ASN1_item_d2i () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#10 0x00007fe64b77ea2f in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#11 0x00007fe64b7b461a in X509_PUBKEY_get () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#12 0x00007fe64b7d119a in X509_get_pubkey_parameters () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#13 0x00007fe64b7d1398 in X509_verify_cert () from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
#14 0x00007fe64bac52f8 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#15 0x00007fe64baa2ef3 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#16 0x00007fe64baa7222 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
#17 0x00007fe64c2dffcb in ?? () from /usr/lib/libpq.so.5
#18 0x00007fe64c2d0c5e in PQconnectPoll () from /usr/lib/libpq.so.5
#19 0x00007fe64c2d1e3e in ?? () from /usr/lib/libpq.so.5
#20 0x00007fe64c2d26f8 in PQsetdbLogin () from /usr/lib/libpq.so.5
#21 0x0000000000401ba5 in authmilt_connect (ctx=0xe818e0, hostname=0x7fe6280008c0 "localhost", hostaddr=0x7fe62fffece0) at authmilter.c:212
#22 0x00007fe64c4f69dc in ?? () from /usr/lib/libmilter.so.1.0.1
#23 0x00007fe64c4f5f5f in mi_engine () from /usr/lib/libmilter.so.1.0.1
#24 0x00007fe64c4fada6 in ?? () from /usr/lib/libmilter.so.1.0.1
#25 0x00007fe64c0aee9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#26 0x00007fe64bddbccd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#27 0x0000000000000000 in ?? ()

If I interpret this correctly, threads #2 and #3 are waiting for the same
lock but they make no progress.

> Also, run "select * from pg_stat_activity" from a psql session to see what's
> happening inside the database.

It shows no activity whatsoever except the pg_stat_activity query itself.

> log_connections=on and log_disconnections=on would also be a good
> idea.

Okay. Found these when the hanging occurred:

2013-09-09 15:14:28 EEST LOG:  connection received: host=127.0.0.1 port=51519
2013-09-09 15:14:28 EEST LOG:  connection received: host=127.0.0.1 port=51520

> PS. You'll need to escape the strings in the queries, to avoid SQL injection.

Yes, thanks for the tip. I believed {auth_authen} would be safe coming
from Sendmail, but you're right that it is better to be safe than sorry.
I did not intend to "release" this code yet, but was forced to do so
because I cannot figure out where the bug is.

Regards,
vmk
-- 
************************************************************************
Tietotekniikkakeskus / Helsingin yliopisto
IT department / University of Helsinki
************************************************************************


