Discussion: ECPG still having thread problems on Linux

ECPG still having thread problems on Linux

From
Philip Yarra
Date:
Hi all, it looks like Lee's ECPG (and libpq) thread-safety patches
have been applied, and configure --with-threads is also added. I
have been doing some testing.

On FreeBSD 4.8, the attached sample app runs without a problem.

However, I still encounter a threading problem on Linux (RedHat 7.3).

I have done the following:
1) cvs update
2) ./configure --with-threads && make && su -c "make install"
3) compiled cn.pgc as follows:
      a) ecpg -t cn.pgc
      b) gcc -I/usr/local/pgsql/include -L/usr/local/pgsql/lib \
             -lecpg -lpgtypes -pthread cn.c
4) ./a.out - one thread runs to completion (inserts 5 records),
      the other hangs (manages one insert, then blocks forever)

Using gdb, I attached to the thread that has locked up, and the backtrace
looks like this:

(gdb) backtrace
#0  0x420e0187 in poll () from /lib/i686/libc.so.6
#1  0x4007d8cc in pqSocketPoll () from /usr/local/pgsql/lib/libpq.so.3
#2  0x4007d7ed in pqSocketCheck () from /usr/local/pgsql/lib/libpq.so.3
#3  0x4007d71f in pqWaitTimed () from /usr/local/pgsql/lib/libpq.so.3
#4  0x4007d6f5 in pqWait () from /usr/local/pgsql/lib/libpq.so.3
#5  0x4007bb53 in PQgetResult () from /usr/local/pgsql/lib/libpq.so.3
#6  0x4007bcbb in PQexec () from /usr/local/pgsql/lib/libpq.so.3
#7  0x40026d81 in ECPGexecute () from /usr/local/pgsql/lib/libecpg.so.4
#8  0x4002724c in ECPGdo () from /usr/local/pgsql/lib/libecpg.so.4
#9  0x08048927 in ins2 ()
#10 0x40043faf in pthread_start_thread () from /lib/i686/libpthread.so.0

Can anyone shed some light on why the behaviour differs between these two
platforms?

Also, perhaps someone out there with access to a different Linux setup
(maybe a more recent build than RedHat 7.3, or a different distro) could try
this app themselves to help verify if this is something that's stuffed on
that release. I think I can rule out this problem being a quirk of my
particular setup, as 3 different machines (all running RH7.3) give identical
results.

Build env:
Linux 2.4.18-3
gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113)

Regards, Philip Yarra.



ECPG still having thread problems on Linux

From
Lee Kindness
Date:
Philip, both your SELECTs are using the same database connection (and
it's undefined which one it is) without any locking. You need to add
"AT clauses" to specify an explicit connection. See attached diff.

However, I've not tried it... I'll try and get some time!

L.

Philip Yarra writes:
 > Hi all, it looks like Lee's ECPG (and libpq) thread-safety patches
 > have been applied, and configure --with-threads is also added. I
 > have been doing some testing.
 >
 > On FreeBSD 4.8, the attached sample app runs without a problem.
 >
 > However, I still encounter a threading problem on Linux (RedHat 7.3).
 >
 > I have done the following:
 > 1) cvs update
 > 2) ./configure --with-threads && make && su -c "make install"
 > 3) compiled cn.pgc as follows:
 >         a) ecpg -t cn.pgc
 >         b) gcc -I/usr/local/pgsql/include -L/usr/local/pgsql/lib \
 >                 -lecpg -lpgtypes -pthread cn.c
 > 4) ./a.out - one thread runs to completion (inserts 5 records),
 >         the other hangs (manages one insert, then blocks forever)

*** cn.pgc    2003-06-25 10:29:55.000000000 +0100
--- cn.pgc.new    2003-06-25 10:29:45.000000000 +0100
***************
*** 36,46 ****
      EXEC SQL END DECLARE SECTION;
      EXEC SQL WHENEVER sqlerror sqlprint;
      EXEC SQL CONNECT TO :cs AS test1;
!     EXEC SQL SET AUTOCOMMIT TO ON;
      for (i = 0; i < 5; i++)
      {
          printf("thread1 inserting\n");
!         EXEC SQL INSERT INTO foo VALUES(:bar);
          printf("==>thread1 insert done\n");
      }
      EXEC SQL DISCONNECT test1;
--- 36,46 ----
      EXEC SQL END DECLARE SECTION;
      EXEC SQL WHENEVER sqlerror sqlprint;
      EXEC SQL CONNECT TO :cs AS test1;
!     EXEC SQL AT test1 SET AUTOCOMMIT TO ON;
      for (i = 0; i < 5; i++)
      {
          printf("thread1 inserting\n");
!         EXEC SQL AT test1 INSERT INTO foo VALUES(:bar);
          printf("==>thread1 insert done\n");
      }
      EXEC SQL DISCONNECT test1;
***************
*** 57,67 ****
      EXEC SQL END DECLARE SECTION;
      EXEC SQL WHENEVER sqlerror sqlprint;
      EXEC SQL CONNECT TO :cs AS test2;
!     EXEC SQL SET AUTOCOMMIT TO ON;
      for (i = 0; i < 5; i++)
      {
          printf("thread2 inserting\n");
!         EXEC SQL INSERT INTO foo VALUES(:bar);
          printf("==>thread2 insert done\n");
      }
      EXEC SQL DISCONNECT test2;
--- 57,67 ----
      EXEC SQL END DECLARE SECTION;
      EXEC SQL WHENEVER sqlerror sqlprint;
      EXEC SQL CONNECT TO :cs AS test2;
!     EXEC SQL AT test2 SET AUTOCOMMIT TO ON;
      for (i = 0; i < 5; i++)
      {
          printf("thread2 inserting\n");
!         EXEC SQL AT test2 INSERT INTO foo VALUES(:bar);
          printf("==>thread2 insert done\n");
      }
      EXEC SQL DISCONNECT test2;

Re: ECPG still having thread problems on Linux

From
Philip Yarra
Date:
On Wed, 25 Jun 2003 07:35 pm, Lee Kindness wrote:
> Philip, both your SELECTs are using the same database connection (and
> it's undefined which one it is) without any locking. You need to add
> "AT clauses" to specify an explicit connection. See attached diff.

Ah, that'd be it. I spent some time debugging last night, and I'd realised the
problem lay in the fact that the preproc was outputting NULL as the
connection name, but was unsure why. Your changes allowed both threads to
complete their inserts, which is great news for us!

I'll add that "AT" clause to my list of updates for the documentation - it
might be important. It's kinda.... absent... from the manual.

I might also add a section on using pthreads with ECPG, since people porting
from Informix or Sybase might require such info up front.

> However, I've not tried it... I'll try and get some time!

That'd be great if you could... there appears to still be a problem occurring
at "EXEC SQL DISCONNECT con_name". I'll look into it tonight if I can.

All this does kinda raise the interesting question of why it worked at all on
FreeBSD... probably different scheduling and blind luck, I suppose.

Thanks for the response - I'm a happy man. By 7.4, we should be able to start
porting our apps to Postgres in earnest.

Regards, Philip.


ECPG thread success (kind of) on Linux

From
Philip Yarra
Date:
On Thu, 26 Jun 2003 11:19 am, Philip Yarra wrote:

> there appears to still be a problem
> occurring at "EXEC SQL DISCONNECT con_name". I'll look into it tonight if I
> can.

I did some more poking around last night, and believe I have found the issue:
RedHat Linux 7.3 (the only distro I have access to currently) ships with a
fairly challenged pthreads implementation. The default mutex type (which you
get from PTHREAD_MUTEX_INITIALIZER) is, according to the man page,
PTHREAD_MUTEX_FAST_NP which is not a recursive mutex. If a thread owns a
mutex and attempts to lock the mutex again, it will hang.

By replacing PTHREAD_MUTEX_INITIALIZER with PTHREAD_MUTEX_RECURSIVE_NP for the
two mutexes that are used recursively (debug_mutex and connections_mutex) I
got my sample app to work flawlessly on Linux RedHat 7.3

Sadly, the _NP suffix is used to indicate non-portable, so of course my
FreeBSD box steadfastly refused to compile it. Darn.

The correct way to do this appears to be:

pthread_mutexattr_t mattr;
pthread_mutexattr_init(&mattr);
pthread_mutexattr_settype(&mattr, PTHREAD_MUTEX_RECURSIVE);
pthread_mutex_init(&connections_mutex, &mattr);

(will verify this against FreeBSD when I get home, and Tru64 man page
indicates support for this too, so I'll test that later). It won't work on
RedHat Linux 7.3... I guess something like:

#ifdef DODGY_PTHREADS
#define PTHREAD_MUTEX_RECURSIVE PTHREAD_MUTEX_RECURSIVE_NP
#endif

might do it... if we could detect the problem during configure. How is this
sort of detection handled in other cases (such as long long, etc)?

The other solution I can think of is to eradicate the two recursive locks I
found.

One is simple: ECPGlog calls ECPGdebug, which share debug_mutex - it ought to
be okay to use different mutexes for each of these functions (there's a risk
someone might call ECPGdebug while someone else is running through ECPGlog,
but I think it is less likely, since it is a debug mechanism.)

The second recursive lock I found is ECPGdisconnect calling
ECPGget_connection, both of which share a mutex. Would it be okay if we did
the following:

ECPGdisconnect() still locks connections_mutex, but calls
ECPGget_connection_nr() instead of ECPGget_connection()

ECPGget_connection() becomes a locking wrapper, which locks connections_mutex
then calls ECPGget_connection_nr()

ECPGget_connection_nr() is a non-locking function which implements what
ECPGget_connection() currently does.

I'm not sure if this sort of thing is okay (and there may be other recursive
locking scenarios that I haven't exercised yet).
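
The wrapper arrangement described above might look roughly like the following
C sketch. It is not the ECPG source: struct connection is a simplified
stand-in for the real connection list, and every demo_ function name is
invented for illustration. The point it tries to show is that the disconnect
path takes connections_mutex once and then uses the non-locking lookup, so an
ordinary non-recursive mutex is enough.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for ECPG's connection list (assumption). */
struct connection
{
    const char *name;
    struct connection *next;
};

static pthread_mutex_t connections_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct connection *all_connections = NULL;

/* Non-locking lookup: the caller must already hold connections_mutex. */
static struct connection *demo_get_connection_nr(const char *name)
{
    struct connection *con;

    for (con = all_connections; con != NULL; con = con->next)
        if (strcmp(con->name, name) == 0)
            return con;
    return NULL;
}

/* Locking wrapper for callers that do not hold the mutex. */
struct connection *demo_get_connection(const char *name)
{
    struct connection *con;

    pthread_mutex_lock(&connections_mutex);
    con = demo_get_connection_nr(name);
    pthread_mutex_unlock(&connections_mutex);
    return con;
}

/* Add a connection to the head of the list. */
void demo_add_connection(struct connection *con)
{
    assert(con != NULL && con->name != NULL);

    pthread_mutex_lock(&connections_mutex);
    con->next = all_connections;
    all_connections = con;
    pthread_mutex_unlock(&connections_mutex);
}

/*
 * Remove a connection by name. Locks once and calls the non-locking lookup,
 * so the mutex is never acquired recursively. Returns 1 if a connection was
 * removed, 0 if no connection had that name.
 */
int demo_disconnect(const char *name)
{
    struct connection *con;
    struct connection **link;

    pthread_mutex_lock(&connections_mutex);
    con = demo_get_connection_nr(name);
    if (con != NULL)
    {
        /* Find the pointer that refers to con and unlink it. */
        for (link = &all_connections; *link != con; link = &(*link)->next)
            ;
        *link = con->next;
        con->next = NULL;
    }
    pthread_mutex_unlock(&connections_mutex);
    return con != NULL;
}
```

A thread would register its connection with demo_add_connection(), look it up
with demo_get_connection(), and release it with demo_disconnect(). Note that
the pointer returned by the locking wrapper is used after the mutex has been
dropped, so this sketch only protects the list itself, not the lifetime of
the connection a caller is holding.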

What approach should I take? I'm leaning towards eradicating recursive locks,
unless someone has a good reason not to.

> All this does kinda raise the interesting question of why it worked at all
> on FreeBSD... probably different scheduling and blind luck, I suppose.

FreeBSD 4.8 must have PTHREAD_MUTEX_RECURSIVE as default mutex type. I'm a bit
concerned about FreeBSD 4.2 though - I noticed (before I blew it away in
favour of 4.8) that its pthreads implementation came from a package called
linuxthreads.tgz - it might have inherited the same problematic behaviour.
Could someone with access to or knowledge of FreeBSD 4.2 check what the
default mutex type is there?

Regards, Philip.

I can just see the ad for 7.3's pthreads implementation
"Fast mutexes: zero to deadlock in 6.9 milliseconds!"


Re: ECPG thread success (kind of) on Linux

From
Michael Meskes
Date:
On Fri, Jun 27, 2003 at 10:45:46AM +1000, Philip Yarra wrote:
> ECPGget_connection, both of which share a mutex. Would it be okay if we did 
> the following:
> ...

As you know I have never tried using threads, so feel free to go ahead
and change this. Either commit to cvs or send me a patch.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De
ICQ: 179140304, AIM: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!