Discussion: BUG #17299: Exit code 3 when open connections concurrently (PQisthreadsafe() == 1)

BUG #17299: Exit code 3 when open connections concurrently (PQisthreadsafe() == 1)

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17299
Logged by:          Clemens
Email address:      clemens@sussol.net
PostgreSQL version: 14.0
Operating system:   Windows
Description:

Hi,

We ran into a problem when using diesel-rs to connect to Postgres. The
connection manager we use opens connections concurrently on startup. On
Windows 10, client startup occasionally fails (thread problem?). The issue
can be reproduced with a minimal binary that just tries to open 10
connections concurrently (see link *). It's a Rust app, but as far as I can
see the pq-sys Rust lib it uses makes direct C calls to libpq. The issue is
Windows-only; Linux and Mac work fine.
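
For illustration only, the concurrent startup pattern described above might be sketched as follows. This is not the linked reproducer: `open_concurrently` and its `connect` parameter are hypothetical names, and `connect` stands in for whatever actually calls libpq (for example PQconnectdb through pq-sys).

```rust
use std::sync::Arc;
use std::thread;

/// Runs `connect` on `n` threads at once and collects each thread's result.
/// `connect` is a hypothetical stand-in for a real libpq call such as
/// PQconnectdb; it receives the index of the connection being opened.
fn open_concurrently<F>(n: usize, connect: F) -> Vec<Result<(), String>>
where
    F: Fn(usize) -> Result<(), String> + Send + Sync + 'static,
{
    // Share one closure between all worker threads.
    let connect = Arc::new(connect);

    // Start every thread before joining any of them, so the calls overlap.
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let connect = Arc::clone(&connect);
            thread::spawn(move || connect(i))
        })
        .collect();

    // A Rust panic in a worker is reported as an error for that slot.
    handles
        .into_iter()
        .map(|h| h.join().unwrap_or_else(|_| Err("thread panicked".to_string())))
        .collect()
}
```

Note that a harness like this only observes ordinary errors and Rust panics. An abort() inside a C library, as in the stack trace below, ends the whole process, so it would never show up as an `Err` in the returned vector.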

We are using the default PG Windows version from the website. Same result
with v14.0.1, v13.5.1 and v12.9.1. 

I attached the crash stack trace (**).
For reference, I also attached a link to the original issue (***).

Please let me know if you need more information.

*)
https://github.com/clemens-msupply/pg-startup-crash/blob/main/src/main.rs

**)
: Call Site
00 00007ffa`51e1cb68     : 00000000`00000003 00000281`eeb36be0
00000281`eeb21170 00000000`00000000 : ntdll!NtTerminateProcess+0x14
01 00007ffa`5172d62a     : 00000000`00000003 000000ec`7c7fe7e0
00000000`00000000 00000281`eeb36be0 : ntdll!RtlExitUserProcess+0xb8
02 00007ffa`4f99a2e5     : 00000281`eeb75f00 00000000`00000000
00000000`00000008 00000000`00000008 :
KERNEL32!ExitProcessImplementation+0xa
03 00007ffa`4f99a955     : 00000000`00000001 00000281`eeb36be0
00000000`00000008 00000000`06040002 : msvcrt!_crtExitProcess+0x15
04 00007ffa`4f98f2fd     : 00000281`eeb36fe0 00000281`eeb36be0
00000281`00000000 00000281`eeb21170 : msvcrt!doexit+0x171
05 00000000`68284fe3     : 00000000`00000000 00000000`00000000
00000000`00000000 00000000`00000000 : msvcrt!abort+0x8d
06 00000000`6828190c     : 00000001`80029288 00000001`80027680
00000000`00000000 000000ec`00000000 : libintl_9!libintl_dcigettext+0x643
07 00000001`80008313     : 0000212a`1a4222ff 000000ec`7c7fe988
00000281`eeb53be0 00007ffa`51e1b9c2 : libintl_9!libintl_dcgettext+0x1c
08 00000001`8000387e     : 00000281`eeb21170 00000281`eeb48dc0
00000281`eeb21170 00000281`eeb3fdd0 : LIBPQ!PQpingParams+0x2933
09 00000001`8000603c     : 00000000`00000000 00000281`eeb21170
00000281`eeb01ad0 00000281`eeafc450 : LIBPQ!PQconnectPoll+0x63e
0a 00000001`80003168     : 00000281`eeb21170 00000281`eeb01ad0
00000281`eeafc450 00000000`00000000 : LIBPQ!PQpingParams+0x65c
0b 00000001`800048db     : 00000000`00000000 00000000`00000021
00000281`eeafc450 00000281`eeafc450 : LIBPQ!PQconnectStart+0x48
0c 00007ff6`8d214fe4     : 00000281`eeaf0000 00000000`00000000
00000000`00000000 00000000`00000000 : LIBPQ!PQconnectdb+0xb
0d 00007ff6`8d21789a     : 00000000`00000000 00000000`00000000
00000000`00000000 00000000`00000000 :
pg_debug!pg_debug::main::closure$0+0xc4 [C:\github\pg-debug\src\main.rs @
30] 
0e 00007ff6`8d2191f1     : 00000000`00000002 00007ff6`8d239990
00000000`00000020 00000281`eeafea50 :
pg_debug!std::sys_common::backtrace::__rust_begin_short_backtrace<pg_debug::main::closure$0,tuple$<>
>+0x2a
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\std\src\sys_common\backtrace.rs
@ 128] 
0f 00007ff6`8d21b681     : 000000ec`7c7ff6e8 00007ff6`8d211e16
00000000`00000040 00000000`00000050 :
pg_debug!std::thread::impl$0::spawn_unchecked::closure$0::closure$0<pg_debug::main::closure$0,tuple$<>
>+0x31
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\std\src\thread\mod.rs
@ 482] 
10 00007ff6`8d217de5     : 00000000`00000000 00000000`00000000
00000000`00000018 00000000`40000060 :

pg_debug!core::panic::unwind_safe::impl$23::call_once<tuple$<>,std::thread::impl$0::spawn_unchecked::closure$0::closure$0>+0x31
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\core\src\panic\unwind_safe.rs
@ 272] 
11 00007ff6`8d217ec3     : 000000ec`7c7ff9b0 00007ff6`8d21d7d4
00000000`00000000 00000000`00000008 :

pg_debug!std::panicking::try::do_call<core::panic::unwind_safe::AssertUnwindSafe<std::thread::impl$0::spawn_unchecked::closure$0::closure$0>,tuple$<>
>+0x55
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\std\src\panicking.rs
@ 405] 
12 00007ff6`8d217d18     : 00000000`00000000 00000000`00000000
00000000`00000000 00000000`00000000 :

pg_debug!std::panicking::try::do_catch<core::panic::unwind_safe::AssertUnwindSafe<std::thread::impl$0::spawn_unchecked::closure$0::closure$0>,tuple$<>
>+0xd3
13 00007ff6`8d211031     : 00000000`00000000 00000000`00000000
00000281`eeafa720 00000281`eeafa720 :

pg_debug!std::panicking::try<tuple$<>,core::panic::unwind_safe::AssertUnwindSafe<std::thread::impl$0::spawn_unchecked::closure$0::closure$0>
>+0xf8
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\std\src\panicking.rs
@ 367] 
14 00007ff6`8d219055     : 00000000`00000000 00000000`00000000
00000000`00000000 00000000`00000000 :

pg_debug!std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<std::thread::impl$0::spawn_unchecked::closure$0::closure$0>,tuple$<>
>+0x31
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\std\src\panic.rs @
129] 
15 00007ff6`8d21943e     : 00000000`00001000 00000000`00000104
00000000`00006000 000000ec`7c7f8000 :
pg_debug!std::thread::impl$0::spawn_unchecked::closure$0<pg_debug::main::closure$0,tuple$<>
>+0x125
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\std\src\thread\mod.rs
@ 480] 
16 00007ff6`8d2273fc     : 00000000`00000000 00000281`eeb01ad0
00000000`00000000 00000000`00000000 :
pg_debug!core::ops::function::FnOnce::call_once<std::thread::impl$0::spawn_unchecked::closure$0,tuple$<>
>+0xe
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\core\src\ops\function.rs
@ 227] 
17 (Inline Function)     : --------`-------- --------`--------
--------`-------- --------`-------- :
pg_debug!alloc::boxed::impl$44::call_once+0xb
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\alloc\src\boxed.rs
@ 1636] 
18 (Inline Function)     : --------`-------- --------`--------
--------`-------- --------`-------- :
pg_debug!alloc::boxed::impl$44::call_once+0x16
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\library\alloc\src\boxed.rs
@ 1636] 
19 00007ffa`51727974     : 00000000`00000000 00000000`00000000
00000000`00000000 00000000`00000000 :
pg_debug!std::sys::windows::thread::impl$0::new::thread_start+0x4c
[/rustc/09c42c45858d5f3aedfa670698275303a3d19afa\/library\std\src\sys\windows\thread.rs
@ 58] 
1a 00007ffa`51e0a2f1     : 00000000`00000000 00000000`00000000
00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
1b 00000000`00000000     : 00000000`00000000 00000000`00000000
00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21

***)
https://github.com/diesel-rs/diesel/discussions/2947


Re: BUG #17299: Exit code 3 when open connections concurrently (PQisthreadsafe() == 1)

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> We ran into a problem that came up when using diesel-rs to connect to
> postgres. The used connection manager spins up connections concurrently on
> startup. On Windows 10, the app startup of the client fails occasionally
> (thread problem?). The issue can be reproduced in a minimal bin which just
> tries to open 10 connections concurrently (see link *).

I don't know why there'd be such a limit - some weird Windows limitation,
perhaps?  Anyway, the immediate failure seems to be in libintl's gettext
support:

> 04 00007ffa`4f98f2fd     : 00000281`eeb36fe0 00000281`eeb36be0
> 00000281`00000000 00000281`eeb21170 : msvcrt!doexit+0x171
> 05 00000000`68284fe3     : 00000000`00000000 00000000`00000000
> 00000000`00000000 00000000`00000000 : msvcrt!abort+0x8d
> 06 00000000`6828190c     : 00000001`80029288 00000001`80027680
> 00000000`00000000 000000ec`00000000 : libintl_9!libintl_dcigettext+0x643
> 07 00000001`80008313     : 0000212a`1a4222ff 000000ec`7c7fe988
> 00000281`eeb53be0 00007ffa`51e1b9c2 : libintl_9!libintl_dcgettext+0x1c
> 08 00000001`8000387e     : 00000281`eeb21170 00000281`eeb48dc0
> 00000281`eeb21170 00000281`eeb3fdd0 : LIBPQ!PQpingParams+0x2933
> 09 00000001`8000603c     : 00000000`00000000 00000281`eeb21170
> 00000281`eeb01ad0 00000281`eeafc450 : LIBPQ!PQconnectPoll+0x63e
> 0a 00000001`80003168     : 00000281`eeb21170 00000281`eeb01ad0

Maybe you'd be able to get a usable error message if you run the
app under some other locale --- I'd try "C" locale for starters.

gettext() really is not supposed to ever crash like that (at worst,
it's supposed to return the original string if it fails to localize it).
So I think you have grounds for a bug report to the libintl maintainers,
independently of what exactly is causing libpq to want to get a translated
message.
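
As a rough sketch of the contract described above (return a translation when one is available, otherwise hand back the original string), assuming a simple in-memory catalog: the `translate` function and the `HashMap` catalog are hypothetical and are not how libintl is implemented.

```rust
use std::collections::HashMap;

/// Hypothetical catalog lookup illustrating the gettext-style contract:
/// return the translation when one exists, otherwise return the original
/// message unchanged. It never fails and never aborts.
fn translate<'a>(catalog: &'a HashMap<&'a str, &'a str>, msgid: &'a str) -> &'a str {
    catalog.get(msgid).copied().unwrap_or(msgid)
}
```

Under that contract a caller such as libpq can always use the returned string, which is why an abort() from inside the lookup is unexpected.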

            regards, tom lane



Re: BUG #17299: Exit code 3 when open connections concurrently (PQisthreadsafe() == 1)

From
Clemens Zeidler
Date:
Thanks Tom,

With "C" locale you meant running:

 > LANGUAGE=C ./my_app

right? Unfortunately, this didn't give any better errors when crashing...
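
The `VAR=value command` form above is POSIX shell syntax. When the app is started from somewhere else (cmd.exe, a test harness), one way to get the same effect is to set the variables on the child process. A minimal sketch, assuming a hypothetical helper named `command_with_c_locale`: LANGUAGE is the variable used above, while adding LC_ALL is an extra assumption and not something this thread confirms is needed.

```rust
use std::process::Command;

/// Builds a command that starts `program` with message-locale variables
/// forced to "C". LANGUAGE comes from the message above; LC_ALL is an
/// additional assumption. The caller still has to spawn the command.
fn command_with_c_locale(program: &str) -> Command {
    let mut cmd = Command::new(program);
    // These variables are applied to the child only, not to this process.
    cmd.env("LANGUAGE", "C").env("LC_ALL", "C");
    cmd
}
```

Calling `command_with_c_locale("./my_app").status()` would then run the app with those variables in its environment.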

I had a look at the memory of the crashed process and the second arg for 
libintl_dcigettext points to a string in memory: 'connection to server 
at "%s" (%s), port %s failed:'

which is the same as what my app prints when it is not crashing, e.g. 
'Failed to initialize connection 1: connection to server at "localhost" 
(::1), port 5432 failed: fe_sendauth: no password supplied'

So it seems we are on the right track (?).

Is this of any further help, or should I continue looking into libintl? 
That's part of glib, is it?

Regards

     Clemens


On 26/11/21 9:57 am, Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
>> We ran into a problem that came up when using diesel-rs to connect to
>> postgres. The used connection manager spins up connections concurrently on
>> startup. On Windows 10, the app startup of the client fails occasionally
>> (thread problem?). The issue can be reproduced in a minimal bin which just
>> tries to open 10 connections concurrently (see link *).
> I don't know why there'd be such a limit - some weird Windows limitation,
> perhaps?  Anyway, the immediate failure seems to be in libintl's gettext
> support:
>
>> 04 00007ffa`4f98f2fd     : 00000281`eeb36fe0 00000281`eeb36be0
>> 00000281`00000000 00000281`eeb21170 : msvcrt!doexit+0x171
>> 05 00000000`68284fe3     : 00000000`00000000 00000000`00000000
>> 00000000`00000000 00000000`00000000 : msvcrt!abort+0x8d
>> 06 00000000`6828190c     : 00000001`80029288 00000001`80027680
>> 00000000`00000000 000000ec`00000000 : libintl_9!libintl_dcigettext+0x643
>> 07 00000001`80008313     : 0000212a`1a4222ff 000000ec`7c7fe988
>> 00000281`eeb53be0 00007ffa`51e1b9c2 : libintl_9!libintl_dcgettext+0x1c
>> 08 00000001`8000387e     : 00000281`eeb21170 00000281`eeb48dc0
>> 00000281`eeb21170 00000281`eeb3fdd0 : LIBPQ!PQpingParams+0x2933
>> 09 00000001`8000603c     : 00000000`00000000 00000281`eeb21170
>> 00000281`eeb01ad0 00000281`eeafc450 : LIBPQ!PQconnectPoll+0x63e
>> 0a 00000001`80003168     : 00000281`eeb21170 00000281`eeb01ad0
> Maybe you'd be able to get a usable error message if you run the
> app under some other locale --- I'd try "C" locale for starters.
>
> gettext() really is not supposed to ever crash like that (at worst,
> it's supposed to return the original string if it fails to localize it).
> So I think you have grounds for a bug report to the libintl maintainers,
> independently of what exactly is causing libpq to want to get a translated
> message.
>
>             regards, tom lane