Re: BUG #10888: application is getting hanged in the poll() function of libpq.so.

Поиск
Список
Период
Сортировка
От Mitu Verma
Тема Re: BUG #10888: application is getting hanged in the poll() function of libpq.so.
Дата
Msg-id 84BC7AB0D621A74893EC9C9E151293B003F904C2@ESESSMB207.ericsson.se
обсуждение исходный текст
Ответ на Re: BUG #10888: application is getting hanged in the poll() function of libpq.so.  (Mitu Verma <mitu.verma@ericsson.com>)
Список pgsql-bugs
Hi tom,

I have few more observations regarding the issue mentioned below -

Few points=20
1. Database dump has been taken from the customer site from where the issue=
 has been reported, we are using postgreSQL 9.1.3

2.Schema of the table is  -
   Column    |         Type          | Modifiers
-------------+-----------------------+-----------
 tableindex  | integer               |
 object      | character varying(80) |
 method      | character varying(80) |
 bgwuser     | character varying(80) |
 time        | character(23)         |
 realuser    | character varying(80) |
 host        | character varying(80) |
 application | character varying(80) |
Indexes:
    "ind1_eventlogentry" UNIQUE, btree (tableindex), tablespace "mmindex"
Tablespace: "mmdata"

3. On the  database which we have taken from customer site  when we were ru=
nning the following query , query is hung.
INSERT INTO eventLogEntry (object, method, bgwUser, time, realUser, host, a=
pplication, tableIndex  ) VALUES (E'Server', E'Start', E'bgw', E'2014051212=
2404', NULL, NULL, NULL, 539  );

4. But when we used index after 586 like, then the insert is happening corr=
ectly on the same database.
INSERT INTO eventLogEntry (object, method, bgwUser, time, realUser, host, a=
pplication, tableIndex  ) VALUES (E'Server', E'Start', E'bgw', E'2014051212=
2404', NULL, NULL, NULL, 587 );

5. When we did debugging , then we found that postgreSQL has stored some ot=
her transaction id and it asks for the current transaction to wait till the=
 time previous transaction
Either gets committed or rolled back.

6. It seems that somehow the indexes are corrupted and postgreSQL has store=
d some transaction id somewhere, and so it is not allowing the current tran=
saction to happen=20
For that particular index range.

7. When the index has been dropped and table has been recreated then everyt=
hing works fine.


Plz help me with this issue by providing answers to the following queries

A) why postgreSQL is not allowing the insertion to happen for a particular =
index range and allowing it after that particular range?
B) where does the post greSQL stores the transaction ids and why it is not =
getting refreshed ?


Regards
Mitu
-----Original Message-----
From: Mitu Verma=20
Sent: Wednesday, July 09, 2014 12:31 PM
To: 'Tom Lane'
Cc: pgsql-bugs@postgresql.org
Subject: RE: [BUGS] BUG #10888: application is getting hanged in the poll()=
 function of libpq.so.

Hi Tom,

I have reproduced the issue on our test bed.

When I run a single query to insert the data into table , then the query is=
 getting hanged.

 I debugged this issue by compiling the postgres code and using the databas=
e where the customer is facing problem. (see the gdb below) Then the coredu=
mp is coming with the assertion getting failed for transaction id as 0.

Following are my observations.

1. Even if I run a single query, (i.e no other transaction is in place) , t=
hen also following code is returning valid value of xwait. I checked this u=
sing gdb.(see the gdb below)

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
File nbtinsert.c=20

Bool _bt_doinsert(Relation rel, IndexTuple itup,IndexUniqueCheck checkUniqu=
e, Relation heapRel) {
-----------------------------
-----------------------------
xwait =3D _bt_check_unique(rel, itup, heapRel, buf, offset, itup_scankey ch=
eckUnique, &is_unique);

 /* As per my understanding if I am running a single transaction then tis v=
alue should retun 0, but it seems that this value is being set some where I=
n postgres code.
-----------------------------------
}
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
now because of the valid value of this xwait, it assumes that there is some=
 other transaction which is already going on and so the current transaction=
 has to wait till the old transaction is either committed or rolled backed.
So the current transaction waits in infinite loop for the completion of the=
 other transaction( which is not there actually ).
So the current transaction just keeps on waiting forever.

2. I checked inside bt_check_unique and following is the code which is hit =
in our case and which suggests that there is an already existing transactio=
n XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXiXXXXXXXXX=
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

/* If this tuple is being updated by other transaction then we have to wait=
 for its commit/abort.*/ xwait =3D (TransactionIdIsValid(SnapshotDirty.xmin=
)) ?SnapshotDirty.xmin : SnapshotDirty.xmax;

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

3. This issue is coming only with that table where table index is defined.

4. Issue is resolved if the index is dropped and we recreate the table.

It looks to me that postgres stores the transaction (SnapshotDirty ?) somew=
here and even if there is no other transaction going on this is Showing som=
e abrupt value which is giving conflicts.

If anyone can help me with this piece of postgresql code then it will be re=
ally helpful.

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
(gdb) where
#0  0x00000031c8232a45 in raise () from /lib64/libc.so.6
#1  0x00000031c8234225 in abort () from /lib64/libc.so.6
#2  0x000000000072248d in ExceptionalCondition (conditionName=3D<value opti=
mized out>, errorType=3D<value optimized out>, fileName=3D<value optimized =
out>, lineNumber=3D<value optimized out>)
    at assert.c:57
#3  0x000000000064ee48 in XactLockTableWait (xid=3D0) at lmgr.c:471
#4  0x00000000004837fb in _bt_doinsert (rel=3D0x7f616e9e3dc8, itup=3D0x221a=
b30, checkUnique=3DUNIQUE_CHECK_YES, heapRel=3D0x7f616e9df3a8) at nbtinsert=
.c:169
#5  0x0000000000487f21 in btinsert (fcinfo=3D<value optimized out>) at nbtr=
ee.c:261
#6  0x0000000000728c14 in FunctionCall6Coll (flinfo=3D<value optimized out>=
, collation=3D<value optimized out>, arg1=3D<value optimized out>, arg2=3D<=
value optimized out>, arg3=3D<value optimized out>,
    arg4=3D<value optimized out>, arg5=3D140056444400552, arg6=3D1) at fmgr=
.c:1439
#7  0x000000000047f746 in index_insert (indexRelation=3D0x7f616e9e3dc8, val=
ues=3D0x7fff36fd71f0, isnull=3D0x7fff36fd72f0 "", heap_t_ctid=3D0x221a9dc, =
heapRelation=3D0x7f616e9df3a8,
    checkUnique=3DUNIQUE_CHECK_YES) at indexam.c:215
#8  0x0000000000591c15 in ExecInsertIndexTuples (slot=3D0x2217918, tupleid=
=3D0x221a9dc, estate=3D0x2215a18) at execUtils.c:1086
#9  0x000000000059fab5 in ExecInsert (node=3D0x2219a40) at nodeModifyTable.=
c:247
#10 ExecModifyTable (node=3D0x2219a40) at nodeModifyTable.c:847
#11 0x00000000005878d8 in ExecProcNode (node=3D0x2219a40) at execProcnode.c=
:371
#12 0x0000000000586526 in ExecutePlan (queryDesc=3D0x2215608, direction=3D<=
value optimized out>, count=3D0) at execMain.c:1440
#13 standard_ExecutorRun (queryDesc=3D0x2215608, direction=3D<value optimiz=
ed out>, count=3D0) at execMain.c:314
#14 0x0000000000667abf in ProcessQuery (plan=3D0x21c5920,
    sourceText=3D0x21c3a08 "INSERT INTO eventLogEntry (object, method, bgwU=
ser, time, realUser, host, application, tableIndex ) VALUES ('Server', 'Sta=
rt', 'bgw', '20140512122404', NULL, NULL, NULL, 539);", params=3D0x0, dest=
=3D<value optimized out>, completionTag=3D0x7fff36fd76d0 "") at pquery.c:18=
7
#15 0x0000000000667d65 in PortalRunMulti (portal=3D0x21272e8, isTopLevel=3D=
1 '\001', dest=3D0x220f580, altdest=3D0x220f580, completionTag=3D0x7fff36fd=
76d0 "") at pquery.c:1276
#16 0x0000000000668592 in PortalRun (portal=3D0x21272e8, count=3D9223372036=
854775807, isTopLevel=3D1 '\001', dest=3D0x220f580, altdest=3D0x220f580, co=
mpletionTag=3D0x7fff36fd76d0 "") at pquery.c:813
#17 0x0000000000664949 in exec_simple_query (
    query_string=3D0x21c3a08 "INSERT INTO eventLogEntry (object, method, bg=
wUser, time, realUser, host, application, tableIndex ) VALUES ('Server', 'S=
tart', 'bgw', '20140512122404', NULL, NULL, NULL, 539);") at postgres.c:102=
0
#18 0x0000000000665880 in PostgresMain (argc=3D<value optimized out>, argv=
=3D<value optimized out>, username=3D<value optimized out>) at postgres.c:3=
968
#19 0x000000000061c691 in BackendRun () at postmaster.c:3617
#20 BackendStartup () at postmaster.c:3302
#21 ServerLoop () at postmaster.c:1466
#22 0x000000000061eff2 in PostmasterMain (argc=3D<value optimized out>, arg=
v=3D<value optimized out>) at postmaster.c:1127
#23 0x00000000005b8bd0 in main (argc=3D3, argv=3D0x2118e70) at main.c:199

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
XXX
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, July 08, 2014 12:05 AM
To: Mitu Verma
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #10888: application is getting hanged in the poll()=
 function of libpq.so.

Mitu Verma <mitu.verma@ericsson.com> writes:
>  I am not sure if it is a postgres issue or not, so need your help in tha=
t.=20

Well, it looks like you've got two concurrent transactions trying to insert=
 the exact same data:

>  16384 | fm_db_CollectCCN |   20714 |    16391 | mmsuper |               =
   | ::1         |                 |       19540
> | 2014-07-07 19:10:09.295565+02 | 2014-07-07 19:10:09.311701+02 | 2014-07=
-07 19:10:09.311701+02 | f       | INSERT INTO ev
> entLogEntry (object, method, bgwUser, time, realUser, host, application, =
tableIndex  ) VALUES (E'Server', E'Start', E'bgw'
> , E'20140512122404', NULL, NULL, NULL, 539  )
>  16384 | fm_db_CollectCCN |   20819 |    16391 | mmsuper |               =
   | ::1         |                 |       19602
> | 2014-07-07 19:10:45.18688+02  | 2014-07-07 19:10:45.202903+02 | 2014-07=
-07 19:10:45.202903+02 | f       | INSERT INTO ev
> entLogEntry (object, method, bgwUser, time, realUser, host, application, =
tableIndex  ) VALUES (E'Server', E'Start', E'bgw'
> , E'20140512122404', NULL, NULL, NULL, 539  )

Presumably this is falling foul of a unique index on the table.
The second one is going to block until the first one either commits or roll=
s back.

            regards, tom lane

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Compile fails on AIX 6.1
Следующее
От: karel.knezourek@gmail.com
Дата:
Сообщение: BUG #10922: row_to_json creates invalid JSON for currecy lower then 1000.