Discussion: [HACKERS] Deadlock in XLogInsert at AIX


[HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
Hi Hackers,

We are running Postgres on AIX and encountered two strange problems: active zombie processes and a deadlock in the XLOG writer.
I will explain the first problem in a separate mail; right now I am mostly concerned about the deadlock.
It reproduces irregularly with a standard pgbench run with 100 connections.

It sometimes happens with the 9.6 stable version of Postgres, but only when it is compiled with the xlc compiler.
We failed to reproduce the problem with GCC. So it looks like a bug in the compiler or in the xlc-specific atomics implementation...
But there are a few points that contradict this hypothesis:

1. The problem is reproduced with Postgres built without optimization. Usually compiler bugs affect only optimized code.
2. Disabling atomics doesn't help.
3. Without optimization and with LOCK_DEBUG defined, the time needed to reproduce the problem increases significantly. With optimized code it is almost always reproduced within a few minutes;
with a debug build it usually takes much longer.

But the most confusing thing is stack trace:

(dbx) where
semop(??, ??, ??) at 0x9000000001f5790
PGSemaphoreLock(sema = 0x0a00000044b95928), line 387 in "pg_sema.c"
unnamed block in LWLockWaitForVar(lock = 0x0a0000000000d980, valptr = 0x0a0000000000d9a8, oldval = 102067874256, newval = 0x0fffffffffff9c10), line 1666 in "lwlock.c"
LWLockWaitForVar(lock = 0x0a0000000000d980, valptr = 0x0a0000000000d9a8, oldval = 102067874256, newval = 0x0fffffffffff9c10), line 1666 in "lwlock.c"
unnamed block in WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
AdvanceXLInsertBuffer(upto = 102067874256, opportunistic = '\0'), line 1916 in "xlog.c"
unnamed block in GetXLogBuffer(ptr = 102067874256), line 1697 in "xlog.c"
GetXLogBuffer(ptr = 102067874256), line 1697 in "xlog.c"
CopyXLogRecordToWAL(write_len = 70, isLogSwitch = '\0', rdata = 0x000000011007ce10, StartPos = 102067874256, EndPos = 102067874328), line 1279 in "xlog.c"
XLogInsertRecord(rdata = 0x000000011007ce10, fpw_lsn = 102067718328), line 1011 in "xlog.c"
unnamed block in XLogInsert(rmid = '\n', info = '@'), line 453 in "xloginsert.c"
XLogInsert(rmid = '\n', info = '@'), line 453 in "xloginsert.c"
log_heap_update(reln = 0x0000000110273540, oldbuf = 40544, newbuf = 40544, oldtup = 0x0fffffffffffa2a0, newtup = 0x00000001102bb958, old_key_tuple = (nil), all_visible_cleared = '\0', new_all_visible_cleared = '\0'), line 7708 in "heapam.c"
unnamed block in heap_update(relation = 0x0000000110273540, otid = 0x0fffffffffffa6f8, newtup = 0x00000001102bb958, cid = 1, crosscheck = (nil), wait = '^A', hufd = 0x0fffffffffffa5b0, lockmode = 0x0fffffffffffa5c8), line 4212 in "heapam.c"
heap_update(relation = 0x0000000110273540, otid = 0x0fffffffffffa6f8, newtup = 0x00000001102bb958, cid = 1, crosscheck = (nil), wait = '^A', hufd = 0x0fffffffffffa5b0, lockmode = 0x0fffffffffffa5c8), line 4212 in "heapam.c"
unnamed block in ExecUpdate(tupleid = 0x0fffffffffffa6f8, oldtuple = (nil), slot = 0x00000001102bb308, planSlot = 0x00000001102b4630, epqstate = 0x00000001102b2cd8, estate = 0x00000001102b29e0, canSetTag = '^A'), line 937 in "nodeModifyTable.c"
ExecUpdate(tupleid = 0x0fffffffffffa6f8, oldtuple = (nil), slot = 0x00000001102bb308, planSlot = 0x00000001102b4630, epqstate = 0x00000001102b2cd8, estate = 0x00000001102b29e0, canSetTag = '^A'), line 937 in "nodeModifyTable.c"
ExecModifyTable(node = 0x00000001102b2c30), line 1516 in "nodeModifyTable.c"
ExecProcNode(node = 0x00000001102b2c30), line 396 in "execProcnode.c"
ExecutePlan(estate = 0x00000001102b29e0, planstate = 0x00000001102b2c30, use_parallel_mode = '\0', operation = CMD_UPDATE, sendTuples = '\0', numberTuples = 0, direction = ForwardScanDirection, dest = 0x00000001102b7520), line 1569 in "execMain.c"
standard_ExecutorRun(queryDesc = 0x00000001102b25c0, direction = ForwardScanDirection, count = 0), line 338 in "execMain.c"
ExecutorRun(queryDesc = 0x00000001102b25c0, direction = ForwardScanDirection, count = 0), line 286 in "execMain.c"
ProcessQuery(plan = 0x00000001102b1510, sourceText = "UPDATE pgbench_tellers SET tbalance = tbalance + 4019 WHERE tid = 6409;", params = (nil), dest = 0x00000001102b7520, completionTag = ""), line 187 in "pquery.c"
unnamed block in PortalRunMulti(portal = 0x0000000110115e20, isTopLevel = '^A', setHoldSnapshot = '\0', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 1303 in "pquery.c"
unnamed block in PortalRunMulti(portal = 0x0000000110115e20, isTopLevel = '^A', setHoldSnapshot = '\0', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 1303 in "pquery.c"
PortalRunMulti(portal = 0x0000000110115e20, isTopLevel = '^A', setHoldSnapshot = '\0', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 1303 in "pquery.c"
unnamed block in PortalRun(portal = 0x0000000110115e20, count = 9223372036854775807, isTopLevel = '^A', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 815 in "pquery.c"
PortalRun(portal = 0x0000000110115e20, count = 9223372036854775807, isTopLevel = '^A', dest = 0x00000001102b7520, altdest = 0x00000001102b7520, completionTag = ""), line 815 in "pquery.c"
unnamed block in exec_simple_query(query_string = "UPDATE pgbench_tellers SET tbalance = tbalance + 4019 WHERE tid = 6409;"), line 1094 in "postgres.c"
exec_simple_query(query_string = "UPDATE pgbench_tellers SET tbalance = tbalance + 4019 WHERE tid = 6409;"), line 1094 in "postgres.c"
unnamed block in PostgresMain(argc = 1, argv = 0x0000000110119f68, dbname = "postgres", username = "postgres"), line 4076 in "postgres.c"
PostgresMain(argc = 1, argv = 0x0000000110119f68, dbname = "postgres", username = "postgres"), line 4076 in "postgres.c"
BackendRun(port = 0x0000000110114290), line 4279 in "postmaster.c"
BackendStartup(port = 0x0000000110114290), line 3953 in "postmaster.c"
unnamed block in ServerLoop(), line 1701 in "postmaster.c"
unnamed block in ServerLoop(), line 1701 in "postmaster.c"
unnamed block in ServerLoop(), line 1701 in "postmaster.c"
ServerLoop(), line 1701 in "postmaster.c"
PostmasterMain(argc = 3, argv = 0x00000001100c6190), line 1309 in "postmaster.c"
main(argc = 3, argv = 0x00000001100c6190), line 228 in "main.c"


As I already mentioned, we built Postgres with LOCK_DEBUG, so we can inspect the lock owner. The backend is waiting for itself!
Now please look at two frames in this stack trace (marked in red in the original mail): WaitXLogInsertionsToFinish and XLogInsertRecord.
XLogInsertRecord takes a WAL insert lock at the beginning of the function:

    if (isLogSwitch)
        WALInsertLockAcquireExclusive();
    else
        WALInsertLockAcquire();

WALInsertLockAcquire just selects a random item from the WALInsertLocks array and locks it exclusively:

    if (lockToTry == -1)
        lockToTry = MyProc->pgprocno % NUM_XLOGINSERT_LOCKS;
    MyLockNo = lockToTry;
    immed = LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);
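(NUM_XLOGINSERT_LOCKS is 8 by default, and lockToTry is a static variable, so after the first call each backend keeps trying the same slot; "random" here just means the slot assignment is effectively arbitrary.)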

Then, following the stack trace, AdvanceXLInsertBuffer calls WaitXLogInsertionsToFinish:

            /*
             * Now that we have an up-to-date LogwrtResult value, see if we
             * still need to write it or if someone else already did.
             */
            if (LogwrtResult.Write < OldPageRqstPtr)
            {
                /*
                 * Must acquire write lock. Release WALBufMappingLock first,
                 * to make sure that all insertions that we need to wait for
                 * can finish (up to this same position). Otherwise we risk
                 * deadlock.
                 */
                LWLockRelease(WALBufMappingLock);

                WaitXLogInsertionsToFinish(OldPageRqstPtr);

                LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);


It releases WALBufMappingLock, but not the WAL insert locks!
Finally, WaitXLogInsertionsToFinish tries to wait on all of the insert locks:

    for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
    {
        XLogRecPtr    insertingat = InvalidXLogRecPtr;

        do
        {
            /*
             * See if this insertion is in progress. LWLockWait will wait for
             * the lock to be released, or for the 'value' to be set by a
             * LWLockUpdateVar call.  When a lock is initially acquired, its
             * value is 0 (InvalidXLogRecPtr), which means that we don't know
             * where it's inserting yet.  We will have to wait for it.  If
             * it's a small insertion, the record will most likely fit on the
             * same page and the inserter will release the lock without ever
             * calling LWLockUpdateVar.  But if it has to sleep, it will
             * advertise the insertion point with LWLockUpdateVar before
             * sleeping.
             */
            if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
                                 &WALInsertLocks[i].l.insertingAt,
                                 insertingat, &insertingat))

And here we are stuck!
The comment on WaitXLogInsertionsToFinish says:

 * Note: When you are about to write out WAL, you must call this function
 * *before* acquiring WALWriteLock, to avoid deadlocks. This function might
 * need to wait for an insertion to finish (or at least advance to next
 * uninitialized page), and the inserter might need to evict an old WAL buffer
 * to make room for a new one, which in turn requires WALWriteLock.

This contradicts the observed stack trace.

I wonder whether this is really a synchronization bug in xlog.c, or whether something is wrong in this stack trace and it cannot happen during normal operation?

Thanks in advance,
-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
More information about the problem: the Postgres log contains several records like

2017-01-24 19:15:20.272 MSK [19270462] LOG:  request to flush past end of generated WAL; request 6/AAEBE000, currpos 6/AAEBC2B0

and they correspond to the times when the deadlock happened.
There is the following comment in xlog.c concerning this message:

    /*
     * No-one should request to flush a piece of WAL that hasn't even been
     * reserved yet. However, it can happen if there is a block with a bogus
     * LSN on disk, for example. XLogFlush checks for that situation and
     * complains, but only after the flush. Here we just assume that to mean
     * that all WAL that has been reserved needs to be finished. In this
     * corner-case, the return value can be smaller than 'upto' argument.
     */
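
For reference, in the 9.6 sources this comment sits directly above the check that emits the message (in WaitXLogInsertionsToFinish, quoted from memory, lightly trimmed):

    if (upto > reserveUpto)
    {
        elog(LOG, "request to flush past end of generated WAL; request %X/%X, currpos %X/%X",
             (uint32) (upto >> 32), (uint32) upto,
             (uint32) (reserveUpto >> 32), (uint32) reserveUpto);
        upto = reserveUpto;
    }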

So it looks like this should not happen.
The first thing to suspect is the spinlock implementation, which is different for GCC and XLC.
But... if I rebuild Postgres without spinlocks, the problem is still reproduced.

On 24.01.2017 17:47, Konstantin Knizhnik wrote:
> [...]

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Bernd Helmle
Date:
Hi Konstantin,

We had observed exactly the same issues on a customer system with the
same environment and PostgreSQL 9.5.5. Additionally, we've tested on
Linux with XL/C 12 and 13 with exactly the same deadlock behavior. 

So we assumed that this is somehow a compiler issue.

On Tuesday, 24.01.2017 at 19:26 +0300, Konstantin Knizhnik wrote:
> More information about the problem: the Postgres log contains several
> records:
>
> 2017-01-24 19:15:20.272 MSK [19270462] LOG:  request to flush past end
> of generated WAL; request 6/AAEBE000, currpos 6/AAEBC2B0
>
> and they correspond to the times when the deadlock happened.

Yeah, the same logs here:

LOG:  request to flush past end of generated WAL; request 1/1F4C6000, currpos 1/1F4C40E0
STATEMENT:  UPDATE pgbench_accounts SET abalance = abalance + -2653 WHERE aid = 3662494;


> There is the following comment in xlog.c concerning this message:
>
>      /*
>       * No-one should request to flush a piece of WAL that hasn't even been
>       * reserved yet. However, it can happen if there is a block with a bogus
>       * LSN on disk, for example. XLogFlush checks for that situation and
>       * complains, but only after the flush. Here we just assume that to mean
>       * that all WAL that has been reserved needs to be finished. In this
>       * corner-case, the return value can be smaller than 'upto' argument.
>       */
>
> So it looks like this should not happen.
> The first thing to suspect is the spinlock implementation, which is
> different for GCC and XLC.
> But... if I rebuild Postgres without spinlocks, the problem is still
> reproduced.

Before we got the results from XLC on Linux (where Postgres shows the
same behavior), I had a look at the spinlock implementation. If I got
it right, XLC doesn't use the ppc64-specific ones, but the fallback
implementation (system monitoring on AIX has also shown massive numbers
of calls to signal(0)...). So I tried the following patch:

diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
new file mode 100644
index f901a0c..028cced
*** a/src/include/port/atomics/arch-ppc.h
--- b/src/include/port/atomics/arch-ppc.h
***************
*** 23,26 ****
--- 23,33 ----
  #define pg_memory_barrier_impl()      __asm__ __volatile__ ("sync" : : : "memory")
  #define pg_read_barrier_impl()        __asm__ __volatile__ ("lwsync" : : : "memory")
  #define pg_write_barrier_impl()       __asm__ __volatile__ ("lwsync" : : : "memory")
+
+ #elif defined(__IBMC__) || defined(__IBMCPP__)
+
+ #define pg_memory_barrier_impl()      __asm__ __volatile__ (" sync \n" ::: "memory")
+ #define pg_read_barrier_impl()        __asm__ __volatile__ (" lwsync \n" ::: "memory")
+ #define pg_write_barrier_impl()       __asm__ __volatile__ (" lwsync \n" ::: "memory")
+
  #endif

This didn't change the picture, though.




Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Heikki Linnakangas
Date:
On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote:
> As I already mentioned, we built Postgres with LOCK_DEBUG , so we can
> inspect lock owner. Backend is waiting for itself!
> Now please look at two frames in this stack trace marked with red.
> XLogInsertRecord is setting WALInsert locks at the beginning of the
> function:
>
>      if (isLogSwitch)
>          WALInsertLockAcquireExclusive();
>      else
>          WALInsertLockAcquire();
>
> WALInsertLockAcquire just selects random item from WALInsertLocks array
> and exclusively locks:
>
>      if (lockToTry == -1)
>          lockToTry = MyProc->pgprocno % NUM_XLOGINSERT_LOCKS;
>      MyLockNo = lockToTry;
>      immed = LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);
>
> Then, following the stack trace, AdvanceXLInsertBuffer calls
> WaitXLogInsertionsToFinish:
>
>              /*
>               * Now that we have an up-to-date LogwrtResult value, see if we
>               * still need to write it or if someone else already did.
>               */
>              if (LogwrtResult.Write < OldPageRqstPtr)
>              {
>                  /*
>                   * Must acquire write lock. Release WALBufMappingLock first,
>                   * to make sure that all insertions that we need to wait for
>                   * can finish (up to this same position). Otherwise we risk
>                   * deadlock.
>                   */
>                  LWLockRelease(WALBufMappingLock);
>
>                  WaitXLogInsertionsToFinish(OldPageRqstPtr);
>
>                  LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);
>
>
> It releases WALBufMappingLock but not WAL insert locks!
> Finally in WaitXLogInsertionsToFinish tries to wait for all locks:
>
>      for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
>      {
>          XLogRecPtr    insertingat = InvalidXLogRecPtr;
>
>          do
>          {
>              /*
>               * See if this insertion is in progress. LWLockWait will wait for
>               * the lock to be released, or for the 'value' to be set by a
>               * LWLockUpdateVar call.  When a lock is initially acquired, its
>               * value is 0 (InvalidXLogRecPtr), which means that we don't know
>               * where it's inserting yet.  We will have to wait for it.  If
>               * it's a small insertion, the record will most likely fit on the
>               * same page and the inserter will release the lock without ever
>               * calling LWLockUpdateVar.  But if it has to sleep, it will
>               * advertise the insertion point with LWLockUpdateVar before
>               * sleeping.
>               */
>              if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
>                                   &WALInsertLocks[i].l.insertingAt,
>                                   insertingat, &insertingat))
>
> And here we are stuck!

Interesting.. What should happen here is that for the backend's own 
insertion slot, the "insertingat" value should be greater than the 
requested flush point ('upto' variable). That's because before 
GetXLogBuffer() calls AdvanceXLInsertBuffer(), it updates the backend's 
insertingat value, to the position that it wants to insert to. And 
AdvanceXLInsertBuffer() only calls WaitXLogInsertionsToFinish() with 
value smaller than what was passed as the 'upto' argument.
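
For reference, the relevant sequence in GetXLogBuffer() (paraphrased and simplified from xlog.c in 9.6) is:

    /* Advertise how far we have finished inserting, so that
     * WaitXLogInsertionsToFinish() callers don't wait on us needlessly. */
    WALInsertLockUpdateInsertingAt(initializedUpto);

    /* This can block, and can itself call WaitXLogInsertionsToFinish(). */
    AdvanceXLInsertBuffer(ptr, false);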

> The comment to WaitXLogInsertionsToFinish says:
>
>   * Note: When you are about to write out WAL, you must call this function
>   * *before* acquiring WALWriteLock, to avoid deadlocks. This function might
>   * need to wait for an insertion to finish (or at least advance to next
>   * uninitialized page), and the inserter might need to evict an old WAL
> buffer
>   * to make room for a new one, which in turn requires WALWriteLock.
>
> Which contradicts to the observed stack trace.

Not AFAICS. In the stack trace you showed, the backend is not holding 
WALWriteLock. It would only acquire it after the 
WaitXLogInsertionsToFinish() call finished.

> I wonder if it is really synchronization bug in xlog.c or there is
> something wrong in this stack trace and it can not happen in case of
> normal work?

Yeah, hard to tell. Something is clearly wrong..

This line in the stack trace is suspicious:

> WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"

AdvanceXLInsertBuffer() should only ever call 
WaitXLogInsertionsToFinish() with an xlog position that points to a page 
boundary, but that upto value points to the middle of a page.
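
(Indeed, assuming the default XLOG_BLCKSZ of 8192: 102067874328 % 8192 = 2584, so that pointer is 2584 bytes past the last page boundary.)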

Perhaps the value stored in the stack trace is not what the caller 
passed, but it was updated because it was past the 'reserveUpto' value? 
That would explain the "request to flush past end
of generated WAL" notices you saw in the log. Now, why would that 
happen, I have no idea.

If you can and want to provide me access to the system, I could have a 
look myself. I'd also like to see if the attached additional Assertions 
will fire.

- Heikki



Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
On 30.01.2017 19:21, Heikki Linnakangas wrote:
> On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote:
> Interesting.. What should happen here is that for the backend's own 
> insertion slot, the "insertingat" value should be greater than the 
> requested flush point ('upto' variable). That's because before 
> GetXLogBuffer() calls AdvanceXLInsertBuffer(), it updates the 
> backend's insertingat value, to the position that it wants to insert 
> to. And AdvanceXLInsertBuffer() only calls 
> WaitXLogInsertionsToFinish() with value smaller than what was passed 
> as the 'upto' argument.
>
>> The comment to WaitXLogInsertionsToFinish says:
>>
>>   * Note: When you are about to write out WAL, you must call this 
>> function
>>   * *before* acquiring WALWriteLock, to avoid deadlocks. This 
>> function might
>>   * need to wait for an insertion to finish (or at least advance to next
>>   * uninitialized page), and the inserter might need to evict an old WAL
>> buffer
>>   * to make room for a new one, which in turn requires WALWriteLock.
>>
>> Which contradicts to the observed stack trace.
>
> Not AFAICS. In the stack trace you showed, the backend is not holding 
> WALWriteLock. It would only acquire it after the 
> WaitXLogInsertionsToFinish() call finished.
>
>

Hmmm, maybe I missed something.
I am not talking about WALBufMappingLock, which is re-acquired after 
returning from WaitXLogInsertionsToFinish, but about the lock obtained 
by WALInsertLockAcquire at line 946 in XLogInsertRecord.
It will be released at line 1021 by WALInsertLockRelease(), but 
CopyXLogRecordToWAL is invoked with this lock still held.
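
The overall structure (paraphrased from XLogInsertRecord() in 9.6, with the line numbers mentioned above) is:

    WALInsertLockAcquire();        /* ~line 946: take one WAL insert lock */
    ...
    CopyXLogRecordToWAL(...);      /* called with the insert lock still held */
    ...
    WALInsertLockRelease();        /* ~line 1021 */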


> This line in the stack trace is suspicious:
>
>> WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
>
> AdvanceXLInsertBuffer() should only ever call 
> WaitXLogInsertionsToFinish() with an xlog position that points to a 
> page boundary, but that upto value points to the middle of a page.
>
> Perhaps the value stored in the stack trace is not what the caller 
> passed, but it was updated because it was past the 'reserveUpto' 
> value? That would explain the "request to flush past end
> of generated WAL" notices you saw in the log. Now, why would that 
> happen, I have no idea.
>
> If you can and want to provide me access to the system, I could have a 
> look myself. I'd also like to see if the attached additional 
> Assertions will fire.

I really did get this assertion failure:

ExceptionalCondition(conditionName = "!(OldPageRqstPtr <= upto || 
opportunistic)", errorType = "FailedAssertion", fileName = "xlog.c", 
lineNumber = 1917), line 54 in "assert.c"
(dbx) up
unnamed block in AdvanceXLInsertBuffer(upto = 147439056632, opportunistic = '\0'), line 1917 in "xlog.c"
(dbx) p OldPageRqstPtr
147439058944
(dbx) p upto
147439056632
(dbx) p opportunistic
'\0'

Also, in another run, I encountered yet another assertion failure:

ExceptionalCondition(conditionName = "!((((NewPageBeginPtr) / 8192) % 
(XLogCtl->XLogCacheBlck + 1)) == nextidx)", errorType = 
"FailedAssertion", fileName = "xlog.c", lineNumber = 1950), line 54 in 
"assert.c"

nextidx equals 1456, while the expected value is 1457.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
One more assertion failure:


ExceptionalCondition(conditionName = "!(OldPageRqstPtr <= XLogCtl->InitializedUpTo)", errorType = "FailedAssertion", fileName = "xlog.c", lineNumber = 1887), line 54 in "assert.c"

(dbx) p OldPageRqstPtr
153551667200
(dbx) p XLogCtl->InitializedUpTo
153551667200
(dbx) p InitializedUpTo
153551659008

I slightly modified the xlog.c code to store the value of XLogCtl->InitializedUpTo in a local variable:


 1870         LWLockAcquire(WALBufMappingLock, LW_EXCLUSIVE);
 1871
 1872         /*
 1873          * Now that we have the lock, check if someone initialized the page
 1874          * already.
 1875          */
 1876         while (upto >= XLogCtl->InitializedUpTo || opportunistic)
 1877         {
 1878                 XLogRecPtr InitializedUpTo = XLogCtl->InitializedUpTo;
 1879                 nextidx = XLogRecPtrToBufIdx(InitializedUpTo);
 1880
 1881                 /*
 1882                  * Get ending-offset of the buffer page we need to replace (this may
 1883                  * be zero if the buffer hasn't been used yet).  Fall through if it's
 1884                  * already written out.
 1885                  */
 1886                 OldPageRqstPtr = XLogCtl->xlblocks[nextidx];
 1887                 Assert(OldPageRqstPtr <= XLogCtl->InitializedUpTo);


And, as you can see, XLogCtl->InitializedUpTo is not equal to the saved value InitializedUpTo.
But we are holding WALBufMappingLock exclusively, and InitializedUpTo is updated only under this lock.
So it means that LWLocks don't work!
I inspected the code of pg_atomic_compare_exchange_u32_impl, and there is no sync in the prologue:

(dbx) listi pg_atomic_compare_exchange_u32_impl
0x1000817bc (pg_atomic_compare_exchange_u32_impl+0x1c)  e88100b0             ld   r4,0xb0(r1)
0x1000817c0 (pg_atomic_compare_exchange_u32_impl+0x20)  e86100b8             ld   r3,0xb8(r1)
0x1000817c4 (pg_atomic_compare_exchange_u32_impl+0x24)  800100c0            lwz   r0,0xc0(r1)
0x1000817c8 (pg_atomic_compare_exchange_u32_impl+0x28)  7c0007b4          extsw   r0,r0
0x1000817cc (pg_atomic_compare_exchange_u32_impl+0x2c)  e8a30002            lwa   r5,0x0(r3)
0x1000817d0 (pg_atomic_compare_exchange_u32_impl+0x30)  7cc02028          lwarx   r6,r0,r4,0x0
0x1000817d4 (pg_atomic_compare_exchange_u32_impl+0x34)  7c053040           cmpl   cr0,0x0,r5,r6
0x1000817d8 (pg_atomic_compare_exchange_u32_impl+0x38)  4082000c            bne   0x1000817e4 (pg_atomic_compare_exchange_u32_impl+0x44)
0x1000817dc (pg_atomic_compare_exchange_u32_impl+0x3c)  7c00212d         stwcx.   r0,r0,r4
0x1000817e0 (pg_atomic_compare_exchange_u32_impl+0x40)  40e2fff0           bne+   0x1000817d0 (pg_atomic_compare_exchange_u32_impl+0x30)
0x1000817e4 (pg_atomic_compare_exchange_u32_impl+0x44)  60c00000            ori   r0,r6,0x0
0x1000817e8 (pg_atomic_compare_exchange_u32_impl+0x48)  90030000            stw   r0,0x0(r3)
0x1000817ec (pg_atomic_compare_exchange_u32_impl+0x4c)  7c000026           mfcr   r0
0x1000817f0 (pg_atomic_compare_exchange_u32_impl+0x50)  54001ffe         rlwinm   r0,r0,0x3,0x1f,0x1f
0x1000817f4 (pg_atomic_compare_exchange_u32_impl+0x54)  78000620         rldicl   r0,r0,0x0,0x19
0x1000817f8 (pg_atomic_compare_exchange_u32_impl+0x58)  98010070            stb   r0,0x70(r1)
0x1000817fc (pg_atomic_compare_exchange_u32_impl+0x5c)  4c00012c          isync
0x100081800 (pg_atomic_compare_exchange_u32_impl+0x60)  88610070            lbz   r3,0x70(r1)
0x100081804 (pg_atomic_compare_exchange_u32_impl+0x64)  48000004              b   0x100081808 (pg_atomic_compare_exchange_u32_impl+0x68)
0x100081808 (pg_atomic_compare_exchange_u32_impl+0x68)  38210080           addi   r1,0x80(r1)
0x10008180c (pg_atomic_compare_exchange_u32_impl+0x6c)  4e800020            blr
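
(Note that the listing goes straight from the argument loads into the lwarx/stwcx. loop at +0x30, with no sync instruction before it; the trailing isync at +0x5c is still present.)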


The source code of pg_atomic_compare_exchange_u32_impl is the following:

static inline bool
pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
                                    uint32 *expected, uint32 newval)
{
    bool        ret;

    /*
     * atomics.h specifies sequential consistency ("full barrier semantics")
     * for this interface.  Since "lwsync" provides acquire/release
     * consistency only, do not use it here.  GCC atomics observe the same
     * restriction; see its rs6000_pre_atomic_barrier().
     */
    __asm__ __volatile__ ("    sync \n" ::: "memory");

    /*
     * XXX: __compare_and_swap is defined to take signed parameters, but that
     * shouldn't matter since we don't perform any arithmetic operations.
     */
    ret = __compare_and_swap((volatile int*)&ptr->value,
                             (int *)expected, (int)newval);

    /*
     * xlc's documentation tells us:
     * "If __compare_and_swap is used as a locking primitive, insert a call to
     * the __isync built-in function at the start of any critical sections."
     *
     * The critical section begins immediately after __compare_and_swap().
     */
    __isync();

    return ret;
}

and if I compile this function standalone, I get the following assembler code:

.pg_atomic_compare_exchange_u32_impl:   # 0x0000000000000000 (H.4.NO_SYMBOL)
        stdu       SP,-128(SP)
        std        r3,176(SP)
        std        r4,184(SP)
        std        r5,192(SP)
        ld         r0,192(SP)
        stw        r0,192(SP)
        sync      
        ld         r4,176(SP)
        ld         r3,184(SP)
        lwz        r0,192(SP)
        extsw      r0,r0
        lwa        r5,0(r3)
__L30:                                  # 0x0000000000000030 (H.4.NO_SYMBOL+0x030)
        lwarx      r6,r0,r4
        cmpl       0,0,r5,r6
        bc         BO_IF_NOT,CR0_EQ,__L44
        stwcx.     r0,r0,r4
        .machine        "any"
        bc         BO_IF_NOT_3,CR0_EQ,__L30
__L44:                                  # 0x0000000000000044 (H.4.NO_SYMBOL+0x044)
        ori        r0,r6,0x0000
        stw        r0,0(r3)
        mfcr       r0
        rlwinm     r0,r0,3,31,31
        rldicl     r0,r0,0,56
        stb        r0,112(SP)
        isync     
        lbz        r3,112(SP)
        addi       SP,SP,128
        bclr       BO_ALWAYS,CR0_LT

The sync is here!
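
(Without that leading sync, pg_atomic_compare_exchange_u32() loses the full-barrier semantics that atomics.h promises, so the CAS in LWLockAttemptLock() can be reordered relative to surrounding loads and stores; a backend can then acquire WALBufMappingLock and still read a stale XLogCtl->InitializedUpTo, which would explain the anomaly above.)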


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Heikki Linnakangas
Date:
On 01/31/2017 05:03 PM, Konstantin Knizhnik wrote:
> One more assertion failure:
>
>
> ExceptionalCondition(conditionName = "!(OldPageRqstPtr <=
> XLogCtl->InitializedUpTo)", errorType = "FailedAssertion", fileName =
> "xlog.c", lineNumber = 1887), line 54 in "assert.c"
>
> (dbx) p OldPageRqstPtr
> 153551667200
> (dbx) p XLogCtl->InitializedUpTo
> 153551667200
> (dbx) p InitializedUpTo
> 153551659008
>
> I slightly modified the xlog.c code to store the value of
> XLogCtl->InitializedUpTo in a local variable:
>
>   1870         LWLockAcquire(WALBufMappingLock, LW_EXCLUSIVE);
>   1871
>   1872         /*
>   1873          * Now that we have the lock, check if someone initialized the page
>   1874          * already.
>   1875          */
>   1876         while (upto >= XLogCtl->InitializedUpTo || opportunistic)
>   1877         {
>   1878                 XLogRecPtr InitializedUpTo = XLogCtl->InitializedUpTo;
>   1879                 nextidx = XLogRecPtrToBufIdx(InitializedUpTo);
>   1880
>   1881                 /*
>   1882                  * Get ending-offset of the buffer page we need to replace (this may
>   1883                  * be zero if the buffer hasn't been used yet).  Fall through if it's
>   1884                  * already written out.
>   1885                  */
>   1886                 OldPageRqstPtr = XLogCtl->xlblocks[nextidx];
>   1887                 Assert(OldPageRqstPtr <= XLogCtl->InitializedUpTo);
>
> And, as you can see, XLogCtl->InitializedUpTo is not equal to the saved
> value InitializedUpTo.
> But we are holding WALBufMappingLock exclusively, and InitializedUpTo is
> updated only under this lock.
> So it means that LWLocks don't work!

Yeah, so it seems. XLogCtl->InitializedUpTo is quite clearly protected by 
the WALBufMappingLock. All references to it (after StartupXLOG) happen 
while holding the lock.

Can you get the assembly output of the AdvanceXLInsertBuffer() function? 
I wonder if the compiler is rearranging things so that 
XLogCtl->InitializedUpTo is fetched before the LWLockAcquire call. Or 
should there be a memory barrier instruction somewhere in LWLockAcquire?

- Heikki




Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Heikki Linnakangas
Date:
Oh, you were one step ahead of me, I didn't understand it on first read 
of your email. Need more coffee..

On 01/31/2017 05:03 PM, Konstantin Knizhnik wrote:
> I inspected the code of pg_atomic_compare_exchange_u32_impl, and there
> is no sync in the prologue:
>
> (dbx) listi pg_atomic_compare_exchange_u32_impl
> [no sync instruction]
>
> and if I compile this function standalone, I get the following
> assembler code:
>
> .pg_atomic_compare_exchange_u32_impl:   # 0x0000000000000000 (H.4.NO_SYMBOL)
>          stdu       SP,-128(SP)
>          std        r3,176(SP)
>          std        r4,184(SP)
>          std        r5,192(SP)
>          ld         r0,192(SP)
>          stw        r0,192(SP)
>          sync
>          ld         r4,176(SP)
>          ld         r3,184(SP)
>          lwz        r0,192(SP)
>          extsw      r0,r0
>          lwa        r5,0(r3)
> ...
>
> The sync is here!

Ok, so the 'sync' instruction gets lost somehow. That "standalone" 
assembly version looks slightly different in other ways too; you perhaps 
used different optimization levels, or it looks different when it's 
inlined into the caller. I'm not sure which version of the function the 
debugger would show, when it's a "static inline" function. It would be 
good to check the disassembly of LWLockAttemptLock(), to see if the 
'sync' is there.

Certainly seems like a compiler bug, though.

- Heikki




Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync() to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

      * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
      * providing sequential consistency.  This is undocumented.

But this is no longer true (I checked the generated assembler code in 
the debugger).
This is why I added __sync() to this function. Now pgbench works 
normally.

Also, there is a mysterious disappearance of the assembler section with 
the sync instruction from pg_atomic_compare_exchange_u32_impl.
I fixed it by using the __sync() built-in function instead.
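
The attachment itself is not reproduced in this archive; a minimal sketch of the shape of the two fixes described above (hypothetical, the actual patch may differ, e.g. in whether an __isync() is also needed after __fetch_and_add, which is discussed below) would be:

    /* generic-xlc.h, sketch of the described fixes */
    static inline uint32
    pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
    {
        /*
         * Emit the full barrier explicitly: the leading "sync" that
         * __fetch_and_add() used to emit is no longer generated.
         */
        __sync();

        return __fetch_and_add((volatile int *) &ptr->value, add_);
    }

    static inline bool
    pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
                                        uint32 *expected, uint32 newval)
    {
        bool        ret;

        /*
         * Use the __sync() intrinsic instead of the inline-asm "sync" that
         * silently disappeared from the generated code.
         */
        __sync();

        ret = __compare_and_swap((volatile int *) &ptr->value,
                                 (int *) expected, (int) newval);

        /* Start of the critical section, per xlc's documentation. */
        __isync();

        return ret;
    }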


Thanks to everybody who helped me to locate and fix this problem.

-- 

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Attachments


Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Heikki Linnakangas
Date:
On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
> Attached please find my patch for XLC/AIX.
> The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
> The comment in this file says that:
>
>        * __fetch_and_add() emits a leading "sync" and trailing "isync",
> thereby
>        * providing sequential consistency.  This is undocumented.
>
> But it is not true any more (I checked generated assembler code in
> debugger).
> This is why I have added __sync() to this function. Now pgbench working
> normally.

Seems like it was not so much undocumented as an implementation detail 
that was not guaranteed after all...

Does __fetch_and_add emit a trailing isync there either? Seems odd if 
__compare_and_swap requires it, but __fetch_and_add does not. Unless we 
can find conclusive documentation on that, I think we should assume that 
an __isync() is required, too.

There was a long thread on these things the last time this was changed: 
https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de. 
I couldn't find an explanation there of why we thought that 
fetch_and_add implicitly performs sync and isync.

> Also there is mysterious disappearance of assembler section function
> with sync instruction from pg_atomic_compare_exchange_u32_impl.
> I have fixed it by using __sync() built-in function instead.

__sync() seems more appropriate there, anyway. We're using intrinsics 
for all the other things in generic-xlc.h. But it sure is scary that the 
"asm" sections just disappeared.

In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the __sync() 
and __lwsync() intrinsics? Those are an xlc compiler-specific thing, 
right? Or if they are expected to work on any ppc compiler, then we 
should probably use them always, instead of the asm sections.
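
A sketch of the kind of guard being suggested (hypothetical, not the attached patch):

    /* src/include/port/atomics/arch-ppc.h */
    #if defined(__IBMC__) || defined(__IBMCPP__)
    /* xlc: use the compiler's barrier intrinsics */
    #define pg_memory_barrier_impl()    __sync()
    #define pg_read_barrier_impl()      __lwsync()
    #define pg_write_barrier_impl()     __lwsync()
    #else
    /* other ppc compilers: inline asm, as before */
    #define pg_memory_barrier_impl()    __asm__ __volatile__ ("sync" : : : "memory")
    #define pg_read_barrier_impl()      __asm__ __volatile__ ("lwsync" : : : "memory")
    #define pg_write_barrier_impl()     __asm__ __volatile__ ("lwsync" : : : "memory")
    #endif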

In summary, I came up with the attached. It's essentially your patch, 
with tweaks for the above-mentioned things. I don't have a powerpc 
system to test on, so there are probably some silly typos there.

- Heikki



Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
"REIX, Tony"
Date:

Hi,

I'm now working on the port of PostgreSQL to AIX.
(RPMs can be found, as free open-source work, at http://bullfreeware.com/ and
http://bullfreeware.com/search.php?package=postgresql )

I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 test failures.)

For now, with version 9.6.1, all "check-world" tests, plus the numeric_big test, pass in both 32- and 64-bit versions.

Am I missing something?

I configure the build of PostgreSQL with (in 64-bit):

 ./configure
        --prefix=/opt/freeware
        --libdir=/opt/freeware/lib64
        --mandir=/opt/freeware/man
        --with-perl
        --with-tcl
        --with-tclconfig=/opt/freeware/lib
        --with-python
        --with-ldap
        --with-openssl
        --with-libxml
        --with-libxslt
        --enable-nls
        --enable-thread-safety
        --sysconfdir=/etc/sysconfig/postgresql

Am I missing some option for more optimization on AIX?

Thanks

Regards,

Tony

On 01/02/2017 at 12:07, Konstantin Knizhnik wrote:
> [...]



Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
Hi,

We are using version 13.1.3 of XLC. All tests pass.
Please note that this is a synchronization bug which can be reproduced only under heavy load.
Our server has 64 cores, and it is necessary to run pgbench with 100 connections for several minutes to reproduce the problem.
So maybe you just didn't notice it ;)
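
(For anyone wanting to reproduce this kind of load: pgbench ships with PostgreSQL itself, under src/bin/pgbench. Something along the lines of "pgbench -i -s 100" to initialize the tables, followed by "pgbench -c 100 -j 16 -T 600", gives roughly the workload described above; the scale factor, thread count and duration here are only illustrative.)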



On 01.02.2017 16:29, REIX, Tony wrote:

> [...]




-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
On 01.02.2017 15:39, Heikki Linnakangas wrote:
> [...]
> In summary, I came up with the attached. It's essentially your patch,
> with tweaks for the above-mentioned things. I don't have a powerpc
> system to test on, so there are probably some silly typos there.

Why do you prefer to use _check_lock instead of __check_lock_mp?
The first one is not even mentioned in the XLC compiler manual:
http://www-01.ibm.com/support/docview.wss?uid=swg27046906&aid=7
or
http://scv.bu.edu/computation/bluegene/IBMdocs/compiler/xlc-8.0/html/compiler/ref/bif_sync.htm
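
(For what it's worth, _check_lock is an AIX libc subroutine, declared in sys/atomic_op.h and documented with the base operating system rather than with the compiler, which would explain its absence from the XLC manual; __check_lock_mp and __check_lock_up are the xlc built-ins for multiprocessor and uniprocessor kernels respectively. PostgreSQL's existing AIX spinlock code in s_lock.h uses _check_lock.)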






-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:

On 01.02.2017 15:39, Heikki Linnakangas wrote:
>
> In summary, I came up with the attached. It's essentially your patch, 
> with tweaks for the above-mentioned things. I don't have a powerpc 
> system to test on, so there are probably some silly typos there.
>

Attached please find a fixed version of your patch.
I verified that it applies correctly and builds, and that Postgres 
works normally with it.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
"REIX, Tony"
Date:

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate and aggregates.


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016


So maybe you are not using XLC v13.1.3.3 but another sub-version. Unless you are using more options for configure?


Configure.

What are the options that you give to configure?


Hard load & 64 cores? OK. That clearly explains why I do not see this issue.


pgbench? I wanted to run it. However, I'm still looking for where to get it, plus a guide for using it for testing. I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome!


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed? Any PostgreSQL performance benchmark that I could find and use? pgbench?

- I'm interested in any information for improving the performance & quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) has been selling IBM Power machines under the Escala brand for ages (25 years this year).)


How to help ?

How could I help improve the quality and performance of PostgreSQL on AIX?
I may have access to very big machines for even deeper testing of PostgreSQL. I just need to know how to run the tests.


Thanks!

Regards,

Tony



On 01/02/2017 at 14:48, Konstantin Knizhnik wrote:
Hi,

We are using version 13.1.3 of XLC. All tests pass.
Please notice that this is a synchronization bug which can be reproduced only under heavy load.
Our server has 64 cores and it is necessary to run pgbench with 100 connections for several minutes to reproduce the problem.
So maybe you just didn't notice it ;)



On 01.02.2017 16:29, REIX, Tony wrote:

Hi,

I'm now working on the port of PostgreSQL to AIX.
(RPMs can be found, as free OpenSource work, at http://bullfreeware.com/ or
http://bullfreeware.com/search.php?package=postgresql )

I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 test failures.)

For now, with version 9.6.1, all "check-world" tests, plus the numeric_big test, are OK, in both 32- and 64-bit versions.

Am I missing something?

I configure the build of PostgreSQL with (in 64-bit):

 ./configure
        --prefix=/opt/freeware
        --libdir=/opt/freeware/lib64
        --mandir=/opt/freeware/man
        --with-perl
        --with-tcl
        --with-tclconfig=/opt/freeware/lib
        --with-python
        --with-ldap
        --with-openssl
        --with-libxml
        --with-libxslt
        --enable-nls
        --enable-thread-safety
        --sysconfdir=/etc/sysconfig/postgresql

Am I missing some option for more optimization on AIX?

Thanks

Regards,

Tony

On 01/02/2017 at 12:07, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

      * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
      * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked the generated assembler code in
the debugger).
This is why I have added __sync() to this function. Now pgbench works
normally.

Also, the assembler section with the sync instruction mysteriously
disappeared from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.


Thanks to everybody who helped me to locate and fix this problem.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company






-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
Hi Tony,

On 01.02.2017 18:42, REIX, Tony wrote:

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate and aggregates.


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016


So maybe you are not using XLC v13.1.3.3 but another sub-version. Unless you are using more options for configure?


Configure.

What are the options that you give to configure?


export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"


Hard load & 64 cores? OK. That clearly explains why I do not see this issue.


pgbench? I wanted to run it. However, I'm still looking for where to get it, plus a guide for using it for testing.


pgbench is part of the Postgres distribution (src/bin/pgbench)


I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome!


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed? Any PostgreSQL performance benchmark that I could find and use? pgbench?

pgbench is the most widely used tool simulating an OLTP workload. Certainly it is quite primitive and its results are rather artificial; TPC-C seems to be a better choice.
But the best option is to implement your own benchmark simulating the actual workload of your real application.

- I'm interested in any information for improving the performance & quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) has been selling IBM Power machines under the Escala brand for ages (25 years this year).)


How to help ?

How could I help improve the quality and performance of PostgreSQL on AIX?


We still have one open issue on AIX: see https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It would be great if you could somehow help to fix this problem.



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Heikki Linnakangas
Date:
On 02/01/2017 04:12 PM, Konstantin Knizhnik wrote:
> On 01.02.2017 15:39, Heikki Linnakangas wrote:
>> On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
>>> Attached please find my patch for XLC/AIX.
>>> The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
>>> The comment in this file says that:
>>>
>>>        * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
>>>        * providing sequential consistency.  This is undocumented.
>>>
>>> But it is not true any more (I checked the generated assembler code in
>>> the debugger).
>>> This is why I have added __sync() to this function. Now pgbench works
>>> normally.
>>
>> Seems like it was not so much undocumented, but an implementation
>> detail that was not guaranteed after all..
>>
>> Does __fetch_and_add emit a trailing isync there either? Seems odd if
>> __compare_and_swap requires it, but __fetch_and_add does not. Unless
>> we can find conclusive documentation on that, I think we should assume
>> that an __isync() is required, too.
>>
>> There was a long thread on these things the last time this was
>> changed:
>> https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de.
>> I couldn't find an explanation there of why we thought that
>> fetch_and_add implicitly performs sync and isync.
>>
>>> Also, the assembler section with the sync instruction mysteriously
>>> disappeared from pg_atomic_compare_exchange_u32_impl.
>>> I have fixed it by using the __sync() built-in function instead.
>>
>> __sync() seems more appropriate there, anyway. We're using intrinsics
>> for all the other things in generic-xlc.h. But it sure is scary that
>> the "asm" sections just disappeared.
>>
>> In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the
>> __sync() and __lwsync() intrinsics? Those are an xlc compiler-specific
>> thing, right? Or if they are expected to work on any ppc compiler,
>> then we should probably use them always, instead of the asm sections.
>>
>> In summary, I came up with the attached. It's essentially your patch,
>> with tweaks for the above-mentioned things. I don't have a powerpc
>> system to test on, so there are probably some silly typos there.
>
> Why do you prefer to use _check_lock instead of __check_lock_mp?
> The first one is not even mentioned in the XLC compiler manual:
> http://www-01.ibm.com/support/docview.wss?uid=swg27046906&aid=7
> or
> http://scv.bu.edu/computation/bluegene/IBMdocs/compiler/xlc-8.0/html/compiler/ref/bif_sync.htm

Googling around, it seems that they do more or less the same thing. I 
would guess that they actually produce the same assembly code, but I 
have no machine to test on. If I understand correctly, the difference is 
that __check_lock_mp() is an xlc compiler intrinsic, while _check_lock() 
is a libc function. The libc function presumably does __check_lock_mp() 
or __check_lock_up() depending on whether the system is a multi- or 
uni-processor system.

So I think if we're going to change this, the use of __check_lock_mp() 
needs to be in an #ifdef block to check that you're on the XLC compiler, 
as it's a *compiler* intrinsic, while the current code that uses 
_check_lock() is in an "#ifdef _AIX" block, which is correct for 
_check_lock() because it's defined in libc, not by the compiler.
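For concreteness, a sketch of how that #ifdef arrangement could look in s_lock.h (hypothetical, not a proposed patch):

    #ifdef _AIX
    #include <sys/atomic_op.h>

    #if defined(__IBMC__) || defined(__IBMCPP__)
    /* xlc: the multiprocessor variant is a compiler intrinsic */
    #define TAS(lock)       __check_lock_mp((atomic_p) (lock), 0, 1)
    #else
    /* other compilers: the libc function, which itself dispatches to
     * the _mp or _up variant depending on the system */
    #define TAS(lock)       _check_lock((atomic_p) (lock), 0, 1)
    #endif

    #define S_UNLOCK(lock)  _clear_lock((atomic_p) (lock), 0)
    #endif   /* _AIX */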

But if there's no pressing reason to change it, let's leave it alone. 
It's not related to the problem at hand, right?

- Heikki




Re: [HACKERS] Deadlock in XLogInsert at AIX

From
"REIX, Tony"
Date:

Hi Konstantin,

Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion  so that I know your exact XLC v13 version.

I'm building on Power7 and not giving any architecture flag to XLC.

I'm not using -qalign=natural. Thus, by default, XLC uses -qalign=power, which is close to natural, as explained at:
         https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html
Why are you using this flag?

Thanks for the info about pgbench. The PostgreSQL web site contains a lot of old information...

If you could share scripts or instructions about the tests you are doing with pgbench, I would reproduce them here.
I have no "real" application. My job consists of porting OpenSource packages to AIX. Many packages; Erlang and Go, these days. I just want to make the PostgreSQL RPMs as good as possible... within the limited amount of time I can give to this package, before moving to another one.

About the zombie issue, I've discussed it with my colleagues. It looks like the process stays a zombie until the parent looks at its status. However, though I did that several times, I do not remember the details well. And that should not be specific to AIX. I'll discuss it tomorrow with another colleague, who should understand this better than me.

Patch for Large Files: When building PostgreSQL, I found it necessary to use the following patch so that PostgreSQL works with large files. I do not remember the details. Do you agree with such a patch? The first version (new-...) shows the exact places where   #define _LARGE_FILES 1   is required. The second version (new2-...) is simpler.
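The gist is the following (a sketch only; the attached patches are the authoritative version): on AIX, _LARGE_FILES must be defined before the first system header is included, so that off_t and the file I/O interfaces become 64-bit even in 32-bit builds:

    /* must precede any system header, e.g. at the top of c.h,
     * or equivalently via CPPFLAGS="-D_LARGE_FILES=1" */
    #define _LARGE_FILES 1

    #include <sys/types.h>   /* off_t is now 64-bit */
    #include <fcntl.h>       /* open() maps to the 64-bit interface */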

I'm now experimenting with your patch for the deadlock. However, that should be invisible to the "check-world" tests, I guess.

Regards,

Tony

On 01/02/2017 at 16:59, Konstantin Knizhnik wrote:
....

Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
On 02/01/2017 08:30 PM, REIX, Tony wrote:

Hi Konstantin,

Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion  so that I know your exact XLC v13 version.

IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)

I'm building on Power7 and not giving any architecture flag to XLC.

I'm not using -qalign=natural. Thus, by default, XLC uses -qalign=power, which is close to natural, as explained at:
         https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html
Why are you using this flag?


Because otherwise the double type is aligned on 4 bytes.

Thanks for the info about pgbench. The PostgreSQL web site contains a lot of old information...

If you could share scripts or instructions about the tests you are doing with pgbench, I would reproduce them here.


You do not need any script, just two simple commands.
One to initialize the database:

pgbench -i -s 1000

And another to run the benchmark itself:

pgbench -c 100 -j 20 -P 1 -T 1000000000
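(Here -i initializes the test tables and -s 1000 sets the scale factor, i.e. 100 million rows in pgbench_accounts; -c is the number of client connections, -j the number of pgbench worker threads, -P 1 prints a progress line every second, and -T is the run duration in seconds.)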


I have no "real" application. My job consists in porting OpenSource packages on AIX. Many packages. Erlang, Go, these days. I just want to make PostgreSQL RPMs as good as possible... within the limited amount of time I can give to this package, before moving to another one.

About the zombie issue, I've discussed it with my colleagues. It looks like the process stays a zombie until the parent looks at its status. However, though I did that several times, I do not remember the details well. And that should not be specific to AIX. I'll discuss it tomorrow with another colleague, who should understand this better than me.


1. The process is not in a zombie state (according to ps). It is in <exiting> state... That is something AIX-specific; I have not seen processes in this state on Linux.
2. I have implemented a simple test - a forkbomb (see the sketch below). It creates 1000 children and then waits for them. It is about ten times slower than on Intel/Linux, but still much faster than 100 seconds. So there is some difference between a postgres backend and a dummy process doing nothing - just terminating immediately after returning from fork().
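A minimal sketch of that kind of test (not the exact program I ran):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int
    main(void)
    {
        int i;

        for (i = 0; i < 1000; i++)
        {
            pid_t pid = fork();

            if (pid < 0)
            {
                perror("fork");
                exit(1);
            }
            if (pid == 0)
                _exit(0);       /* child terminates immediately */
        }

        /* parent reaps all children */
        while (wait(NULL) > 0)
            ;

        return 0;
    }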

Patch for Large Files: When building PostgreSQL, I found it necessary to use the following patch so that PostgreSQL works with large files. I do not remember the details. Do you agree with such a patch? The first version (new-...) shows the exact places where   #define _LARGE_FILES 1   is required. The second version (new2-...) is simpler.

I'm now experimenting with your patch for the deadlock. However, that should be invisible to the "check-world" tests, I guess.

Regards,

Tony

On 01/02/2017 at 16:59, Konstantin Knizhnik wrote:
....



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
On 02/01/2017 08:28 PM, Heikki Linnakangas wrote:
>
> But if there's no pressing reason to change it, let's leave it alone. It's not related to the problem at hand,
right?
>

Yes, I agree with you: we'd better leave it as it is.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
Last update on the issue with the deadlock in XLogInsert.

After almost one day of running, pgbench is once again not working 
normally :(
There is no deadlock, there are no core files and no error messages in the log.
But TPS is almost zero:

progress: 57446.0 s, 1.1 tps, lat 3840.265 ms stddev NaNQ
progress: 57447.3 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57448.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57449.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57450.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57451.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57452.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57453.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57454.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57455.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57456.5 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57457.1 s, 164.6 tps, lat 11504.085 ms stddev 5902.148
progress: 57458.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57459.0 s, 234.0 tps, lat 1597.573 ms stddev 3665.814
progress: 57460.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57461.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57462.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57463.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57464.0 s, 602.8 tps, lat 906.765 ms stddev 1940.256
progress: 57465.0 s, 7.2 tps, lat 38.052 ms stddev 12.302
progress: 57466.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57467.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57468.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57469.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57470.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57471.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57472.1 s, 147.8 tps, lat 4379.790 ms stddev 3431.477
progress: 57473.0 s, 1314.1 tps, lat 156.884 ms stddev 535.761
progress: 57474.0 s, 1272.2 tps, lat 31.548 ms stddev 59.538
progress: 57475.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57476.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57477.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57478.0 s, 1688.6 tps, lat 268.379 ms stddev 956.537
progress: 57479.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57480.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57481.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57482.1 s, 29.0 tps, lat 3500.432 ms stddev 54.177
progress: 57483.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57484.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57485.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57486.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57487.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57488.0 s, 66.0 tps, lat 9813.646 ms stddev 19.807
progress: 57489.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57490.0 s, 31.0 tps, lat 8368.125 ms stddev 933.997
progress: 57491.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57492.0 s, 1601.0 tps, lat 226.865 ms stddev 844.952
progress: 57493.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57494.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ


ps auwx shows the following picture:
[10:44:12]postgres@postgres:~/postgresql $ ps auwx | fgrep postgres
postgres 61802470  0.4  0.0 4064 4180  pts/6 A    18:54:58 976:56 
pgbench -c 100 -j 20 -P 1 -T 1000000000 -p 5436
postgres 15271518  0.0  0.0 138276 15024      - A    10:43:34  0:06 
postgres: autovacuum worker process   postgres
postgres 13305354  0.0  0.0 22944 21356      - A    20:49:04 27:51 
postgres: autovacuum worker process   postgres
postgres  5245902  0.0  0.0 14072 14020      - A    18:54:59 10:24 
postgres: postgres postgres [local] COMMIT
postgres 44303278  0.0  0.0 15176 14036      - A    18:54:59 10:18 
postgres: postgres postgres [local] COMMIT
postgres 38601340  0.0  0.0 11564 14008      - A    18:54:59 10:16 
postgres: postgres postgres [local] COMMIT
postgres 53674890  0.0  0.0 12712 14004      - A    18:54:59  8:54 
postgres: postgres postgres [local] COMMIT
postgres 27591640  0.0  0.0 15040 14028      - A    18:54:59  8:38 
postgres: postgres postgres [local] COMMIT
postgres 40960422  0.0  0.0 12128 13996      - A    18:54:59  8:36 
postgres: postgres postgres [local] COMMIT
postgres 41288514  0.0  0.0 10544 14012      - A    18:54:59  8:30 
postgres: postgres postgres [local] idle
postgres 55771564  0.0  0.0 12844 14008      - A    18:54:59  8:24 
postgres: postgres postgres [local] COMMIT
postgres 21760842  0.0  0.0 13164 14008      - A    18:54:59  8:17 
postgres: postgres postgres [local] COMMIT
postgres 18810974  0.0  0.0 10416 14012      - A    18:54:59  8:13 
postgres: postgres postgres [local] idle in transaction
postgres 17566474  0.0  0.0 10224 14012      - A    18:54:59  8:02 
postgres: postgres postgres [local] COMMIT
postgres 63963402  0.0  0.0 11300 14000      - A    18:54:59  7:48 
postgres: postgres postgres [local] COMMIT
postgres  9963962  0.0  0.0 15548 14024      - A    18:54:59  7:37 
postgres: postgres postgres [local] idle
postgres 10094942  0.0  0.0 12192 13996      - A    18:54:59  7:33 
postgres: postgres postgres [local] COMMIT
postgres 53740662  0.0  0.0 15104 14028      - A    18:54:59  7:33 
postgres: postgres postgres [local] idle
postgres 42926266  0.0  0.0 15352 14020      - A    18:54:59  7:32 
postgres: postgres postgres [local] COMMIT
postgres 29295244  0.0  0.0 10612 14016      - A    18:54:59  7:31 
postgres: postgres postgres [local] idle in transaction
postgres  4392458  0.0  0.0 11504 14012      - A    18:54:59  7:28 
postgres: postgres postgres [local] COMMIT
postgres 45482810  0.0  0.0 9896 14004      - A    18:54:59  7:27 
postgres: postgres postgres [local] COMMIT
postgres 59703706  0.0  0.0 11384 14020      - A    18:54:59  7:26 
postgres: postgres postgres [local] COMMIT
postgres 13697320  0.0  0.0 13556 14016      - A    18:54:59  7:26 
postgres: postgres postgres [local] COMMIT
postgres 65275126  0.0  0.0 13748 14016      - A    18:54:59  7:24 
postgres: postgres postgres [local] COMMIT
postgres 17435626  0.0  0.0 13492 14016      - A    18:54:59  7:23 
postgres: postgres postgres [local] COMMIT
postgres 32834044  0.0  0.0 9648 14012      - A    18:54:59  7:23 
postgres: postgres postgres [local] idle in transaction
postgres  3015796  0.0  0.0 15292 14024      - A    18:54:59  7:22 
postgres: postgres postgres [local] COMMIT
postgres 54789310  0.0  0.0 15480 14020      - A    18:54:59  7:21 
postgres: postgres postgres [local] COMMIT
postgres 13369644  0.0  0.0 13300 14016      - A    18:54:59  7:20 
postgres: postgres postgres [local] COMMIT
postgres 49415352  0.0  0.0 12392 14004      - A    18:54:59  7:19 
postgres: postgres postgres [local] COMMIT
postgres 11273960  0.0  0.0 12328 14004      - A    18:54:59  7:19 
postgres: postgres postgres [local] COMMIT
postgres 37749126  0.0  0.0 13628 14024      - A    18:54:59  7:17 
postgres: postgres postgres [local] COMMIT
postgres 42664990  0.0  0.0 14012 14024      - A    18:54:59  7:16 
postgres: postgres postgres [local] idle
postgres 48628314  0.0  0.0 12972 14008      - A    18:54:59  7:15 
postgres: postgres postgres [local] COMMIT
postgres 27526940  0.0  0.0 9832 14004      - A    18:54:59  7:15 
postgres: postgres postgres [local] COMMIT
postgres  4262142  0.0  0.0 11048 14004      - A    18:54:59  7:14 
postgres: postgres postgres [local] COMMIT
postgres 59049404  0.0  0.0 14200 14020      - A    18:54:59  7:13 
postgres: postgres postgres [local] COMMIT
postgres 25035818  0.0  0.0 9264 14012      - A    18:54:59  7:11 
postgres: postgres postgres [local] COMMIT
postgres 62587380  0.0  0.0 15420 14024      - A    18:54:59  7:10 
postgres: postgres postgres [local] COMMIT
postgres 66848122  0.0  0.0 12588 14008      - A    18:54:59  7:06 
postgres: postgres postgres [local] COMMIT
postgres 45352748  0.0  0.0 14912 14028      - A    18:54:59  7:04 
postgres: postgres postgres [local] UPDATE waiting
postgres 46990366  0.0  0.0 11680 13996      - A    18:54:59  7:03 
postgres: postgres postgres [local] idle
postgres 42271516  0.0  0.0 14776 14020      - A    18:54:59  6:59 
postgres: postgres postgres [local] COMMIT
postgres 43253972  0.0  0.0 9192 14004      - A    18:54:59  6:59 
postgres: postgres postgres [local] UPDATE waiting
postgres 37487324  0.0  0.0 11936 13996      - A    18:54:59  6:58 
postgres: postgres postgres [local] COMMIT
postgres 33096324  0.0  0.0 14396 14024      - A    18:54:59  6:58 
postgres: postgres postgres [local] COMMIT
postgres 37094658  0.0  0.0 11184 14012      - A    18:54:59  6:57 
postgres: postgres postgres [local] UPDATE waiting
postgres 41223048  0.0  0.0 11628 14008      - A    18:54:59  6:57 
postgres: postgres postgres [local] idle in transaction
postgres 13240806  0.0  0.0 10024 14004      - A    18:54:59  6:57 
postgres: postgres postgres [local] COMMIT
postgres 61276560  0.0  0.0 10728 14004      - A    18:54:59  6:56 
postgres: postgres postgres [local] COMMIT
postgres 66585476  0.0  0.0 12908 14008      - A    18:54:59  6:52 
postgres: postgres postgres [local] UPDATE waiting
postgres 15074434  0.0  0.0 9328 14012      - A    18:54:59  6:50 
postgres: postgres postgres [local] COMMIT
postgres 33751620  0.0  0.0 12456 14004      - A    18:54:59  6:47 
postgres: postgres postgres [local] COMMIT
postgres   854400  0.0  0.0 14584 14020      - A    18:54:59  6:46 
postgres: postgres postgres [local] COMMIT
postgres 36504484  0.0  0.0 14264 14020      - A    18:54:59  6:46 
postgres: postgres postgres [local] COMMIT
postgres 61408076  0.0  0.0 11112 14004      - A    18:54:59  6:46 
postgres: postgres postgres [local] COMMIT
postgres 24905384  0.0  0.0 14712 14020      - A    18:54:59  6:45 
postgres: postgres postgres [local] COMMIT
postgres 61867150  0.0  0.0 13812 14016      - A    18:54:59  6:45 
postgres: postgres postgres [local] COMMIT
postgres 38798230  0.0  0.0 13876 14016      - A    18:54:59  6:45 
postgres: postgres postgres [local] COMMIT
postgres 53217076  0.0  0.0 13228 14008      - A    18:54:59  6:43 
postgres: postgres postgres [local] COMMIT
postgres 19727378  0.0  0.0 9456 14012      - A    18:54:59  6:40 
postgres: postgres postgres [local] idle in transaction
postgres 20253128  0.0  0.0 12648 14004      - A    18:54:59  6:38 
postgres: postgres postgres [local] COMMIT
postgres 35784016  0.0  0.0 10088 14004      - A    18:54:59  6:38 
postgres: postgres postgres [local] COMMIT
postgres  9243026  0.0  0.0 13100 14008      - A    18:54:59  6:37 
postgres: postgres postgres [local] COMMIT
postgres 14027754  0.0  0.0 10792 14004      - A    18:54:59  6:35 
postgres: postgres postgres [local] COMMIT
postgres 61342300  0.0  0.0 12264 14004      - A    18:54:59  6:34 
postgres: postgres postgres [local] UPDATE waiting
postgres 21693262  0.0  0.0 14460 14024      - A    18:54:59  6:30 
postgres: postgres postgres [local] COMMIT
postgres 53938020  0.0  0.0 10856 14004      - A    18:54:59  6:30 
postgres: postgres postgres [local] COMMIT
postgres 24053688  0.0  0.0 12064 13996      - A    18:54:59  6:29 
postgres: postgres postgres [local] COMMIT
postgres 45024698  0.0  0.0 14984 14036      - A    18:54:59  6:25 
postgres: postgres postgres [local] COMMIT
postgres 20710448  0.0  0.0 10480 14012      - A    18:54:59  6:24 
postgres: postgres postgres [local] COMMIT
postgres  8718716  0.0  0.0 15232 14028      - A    18:54:59  6:23 
postgres: postgres postgres [local] COMMIT
postgres 55313538  0.0  0.0 14136 14020      - A    18:54:59  6:22 
postgres: postgres postgres [local] COMMIT
postgres 13896472  0.0  0.0 9584 14012      - A    18:54:59  6:18 
postgres: postgres postgres [local] COMMIT
postgres  8261178  0.0  0.0 9960 14004      - A    18:54:59  6:18 
postgres: postgres postgres [local] COMMIT
postgres  8980574  0.0  0.0 10540 15608      - A    18:54:54  6:17 
postgres: checkpointer process
postgres  4787946  0.0  0.0 9392 14012      - A    18:54:59  6:16 
postgres: postgres postgres [local] COMMIT
postgres 27919600  0.0  0.0 11812 14000      - A    18:54:59  6:16 
postgres: postgres postgres [local] COMMIT
postgres 47579896  0.0  0.0 10920 14004      - A    18:54:59  6:14 
postgres: postgres postgres [local] COMMIT
postgres 10290864  0.0  0.0 10160 14012      - A    18:54:59  6:13 
postgres: postgres postgres [local] COMMIT
postgres 46072384  0.0  0.0 14520 14020      - A    18:54:59  6:13 
postgres: postgres postgres [local] COMMIT
postgres 57737434  0.0  0.0 9520 14012      - A    18:54:59  6:12 
postgres: postgres postgres [local] COMMIT
postgres 65012512  0.0  0.0 9768 14004      - A    18:54:59  6:11 
postgres: postgres postgres [local] COMMIT
postgres 21495924  0.0  0.0 12000 13996      - A    18:54:59  6:11 
postgres: postgres postgres [local] COMMIT
postgres 59704720  0.0  0.0 14656 14028      - A    18:54:59  6:09 
postgres: postgres postgres [local] COMMIT
postgres 58458128  0.0  0.0 13036 14008      - A    18:54:59  6:08 
postgres: postgres postgres [local] COMMIT
postgres 53412068  0.0  0.0 13684 14016      - A    18:54:59  6:02 
postgres: postgres postgres [local] COMMIT
postgres  8652638  0.0  0.0 11244 14008      - A    18:54:59  6:00 
postgres: postgres postgres [local] COMMIT
postgres 14289464  0.0  0.0 10672 14012      - A    18:54:59  6:00 
postgres: postgres postgres [local] COMMIT
postgres 16582572  0.0  0.0 14328 14020      - A    18:54:59  5:56 
postgres: postgres postgres [local] COMMIT
postgres  9308408  0.0  0.0 9704 14004      - A    18:54:59  5:56 
postgres: postgres postgres [local] COMMIT
postgres 51970736  0.0  0.0 11440 14012      - A    18:54:59  5:51 
postgres: postgres postgres [local] COMMIT
postgres 19792490  0.0  0.0 13364 14016      - A    18:54:59  5:48 
postgres: postgres postgres [local] COMMIT
postgres 58065970  0.0  0.0 11744 13996      - A    18:54:59  5:38 
postgres: postgres postgres [local] COMMIT
postgres 17891344  0.0  0.0 10984 14004      - A    18:54:59  5:33 
postgres: postgres postgres [local] COMMIT
postgres 20121588  0.0  0.0 12520 14004      - A    18:54:59  5:30 
postgres: postgres postgres [local] COMMIT
postgres 39977868  0.0  0.0 13944 14020      - A    18:54:59  5:26 
postgres: postgres postgres [local] COMMIT
postgres 25167604  0.0  0.0 11872 13996      - A    18:54:59  5:25 
postgres: postgres postgres [local] COMMIT
postgres 22348378  0.0  0.0 10352 14012      - A    18:54:59  5:25 
postgres: postgres postgres [local] COMMIT
postgres  8587156  0.0  0.0 12780 14008      - A    18:54:59  5:21 
postgres: postgres postgres [local] COMMIT
postgres 12453402  0.0  0.0 10288 14012      - A    18:54:59  5:20 
postgres: postgres postgres [local] COMMIT
postgres 25822956  0.0  0.0 13428 14016      - A    18:54:59  5:04 
postgres: postgres postgres [local] COMMIT
postgres  7145012  0.0  0.0 14844 14024      - A    18:54:59  5:04 
postgres: postgres postgres [local] COMMIT
postgres 10224292  0.0  0.0 8100 13040      - A    18:54:54  2:55 
postgres: wal writer process
postgres 47711172  0.0  0.0 8080 13084      - A    18:54:54  0:57 
postgres: writer process
postgres 22743474  0.0  0.0 8328 13204      - A    18:54:54  0:29 
postgres: stats collector process
postgres  1706138  0.0  0.0 137280 12940  pts/6 A    18:54:54  0:17 
/opt/postgresql/xlc-debug/9.6/bin/postgres -D xlc-debug
postgres  9898070  0.0  0.0 8528 13404      - A    18:54:54  0:02 
postgres: autovacuum launcher process
postgres 55444330  0.0  0.0 7884 13016      - A    18:54:54  0:01 
postgres: logger process
postgres 42927172  0.0  0.0 138232 13892      - A    10:44:18  0:00 
postgres: autovacuum worker process   postgres
postgres  3934886  0.0  0.0  244  256  pts/3 A    10:44:21  0:00 fgrep 
postgres



which is periodically (but slowly) updated. So there seem to be no 
hung backends (it is hard to inspect the state of all 100 backends).
CPU activity is almost zero, as is disk activity; memory is mostly 
free, ...

Attaching to one of the backends, I got the following stack traces in 
the debugger:


[10:44:21]postgres@postgres:~/postgresql $ dbx -a 25822956 
/opt/postgresql/xlc-debug/9.6/bin/postgres
Waiting to attach to process 25822956 ...
Successfully attached to postgres.
warning: Directory containing postgres could not be determined.
Apply 'use' command to initialize source path.

Type 'help' for help.
reading symbolic information ...warning: no source compiled with -g

stopped in __fd_poll at 0x9000000001545d4
0x9000000001545d4 (__fd_poll+0xb4) e8410028             ld r2,0x28(r1)
(dbx) where
__fd_poll(??, ??, ??) at 0x9000000001545d4
WaitEventSetWait() at 0x10012b730
secure_read() at 0x100141ca4
IPRA.$pq_recvbuf() at 0x10013e370
pq_getbyte() at 0x10013f294
SocketBackend() at 0x10006cf98
postgres.IPRA.$exec_simple_query.ReadCommand() at 0x10006cf1c
PostgresMain() at 0x10006e590
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont
^C
Interrupt in semop at 0x9000000001f5790
0x9000000001f5790 (semop+0xb0) e8410028             ld   r2,0x28(r1)
(dbx) where
semop(??, ??, ??) at 0x9000000001f5790
PGSemaphoreLock() at 0x100049040
LWLockAcquireOrWait() at 0x100047140
XLogFlush() at 0x1000e89ec
RecordTransactionCommit() at 0x10005856c
xact.RecordTransactionAbort.IPRA.$CommitTransaction() at 0x10005a598
CommitTransactionCommand() at 0x10005ea30
postgres.IPRA.$exec_describe_portal_message.IPRA.$finish_xact_command() 
at 0x10006c91c
IPRA.$exec_simple_query() at 0x10006c298
PostgresMain() at 0x10006eac0
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont
^C
Interrupt in __fd_poll at 0x9000000001545d4
0x9000000001545d4 (__fd_poll+0xb4) e8410028             ld r2,0x28(r1)
(dbx) where
__fd_poll(??, ??, ??) at 0x9000000001545d4
WaitEventSetWait() at 0x10012b730
WaitLatchOrSocket() at 0x10012b48c
ProcSleep() at 0x100144fe4
IPRA.$WaitOnLock() at 0x1001492fc
LockAcquireExtended() at 0x1001519ec
XactLockTableWait() at 0x10016d7e8
heap_update() at 0x1000cc7e4
IPRA.$ExecUpdate() at 0x1004f041c
ExecModifyTable() at 0x1004f2220
ExecProcNode() at 0x1001faf08
IPRA.$ExecutePlan() at 0x1001f64b0
standard_ExecutorRun() at 0x1001f9c34
ExecutorRun() at 0x1001f9d6c
pquery.IPRA.$FillPortalStore.IPRA.$ProcessQuery() at 0x1003309e4
IPRA.$PortalRunMulti() at 0x10032fcf8
PortalRun() at 0x100331030
IPRA.$exec_simple_query() at 0x10006c014
PostgresMain() at 0x10006eac0
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont

User defined signal 1 in __fd_poll at 0x9000000001545d4
0x9000000001545d4 (__fd_poll+0xb4) e8410028             ld r2,0x28(r1)
(dbx) where
__fd_poll(??, ??, ??) at 0x9000000001545d4
WaitEventSetWait() at 0x10012b730
WaitLatchOrSocket() at 0x10012b48c
ProcSleep() at 0x100144fe4
IPRA.$WaitOnLock() at 0x1001492fc
LockAcquireExtended() at 0x1001519ec
XactLockTableWait() at 0x10016d7e8
heap_update() at 0x1000cc7e4
IPRA.$ExecUpdate() at 0x1004f041c
ExecModifyTable() at 0x1004f2220
ExecProcNode() at 0x1001faf08
IPRA.$ExecutePlan() at 0x1001f64b0
standard_ExecutorRun() at 0x1001f9c34
ExecutorRun() at 0x1001f9d6c
pquery.IPRA.$FillPortalStore.IPRA.$ProcessQuery() at 0x1003309e4
IPRA.$PortalRunMulti() at 0x10032fcf8
PortalRun() at 0x100331030
IPRA.$exec_simple_query() at 0x10006c014
PostgresMain() at 0x10006eac0
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont

Broken pipe in send at 0x90000000010cfb4
0x90000000010cfb4 (send+0x2b4) e8410028             ld   r2,0x28(r1)
(dbx) where
send(??, ??, ??, ??) at 0x90000000010cfb4
secure_write() at 0x100141a94
IPRA.$internal_flush() at 0x10013e800
socket_flush() at 0x10013ee7c
ReadyForQuery@AF106_4() at 0x10033241c
PostgresMain() at 0x10006eb7c
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) where
send(??, ??, ??, ??) at 0x90000000010cfb4
secure_write() at 0x100141a94
IPRA.$internal_flush() at 0x10013e800
socket_flush() at 0x10013ee7c
ReadyForQuery@AF106_4() at 0x10033241c
PostgresMain() at 0x10006eb7c
IPRA.$BackendRun() at 0x1001301ac
BackendStartup() at 0x10012f784
postmaster.IPRA.$do_start_bgworker.IPRA.$ServerLoop() at 0x10012fec4
PostmasterMain() at 0x100134760
main() at 0x1000006ac
(dbx) cont

execution completed (exit code 1)
(dbx) quit
^C
(dbx) quit
libdebug assertion "(rc == DB_SUCCESS)" failed at line 162 in file 
../../../../../../../../../../../src/bos/usr/ccs/lib/libdbx/libdebug/modules/procdebug/ptrace/procdb_PtraceSession.C

I have no idea what's going on. It is a release version without debug 
information, assert checks or lwlock info, so it is hard to debug.
Heikki, I would be pleased if you have a chance to log in to the system 
and look at it yourself.
Maybe you will have some idea of what's happening...

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] Deadlock in XLogInsert at AIX

From
"REIX, Tony"
Date:

Hi Konstantin

I've discussed the "zombie/exit" issue with our expert here.

- He does not think that AIX has anything special here

- If the process is marked <exiting> in ps, this is because the flag SEXIT is set, thus the process is blocked somewhere in the kexitx() syscall, waiting for something.

- In order to know what it is waiting for, the best would be to have a look with kdb.

- either it is waiting for an asynchronous I/O to end, or for a thread to end if the process is multi-threaded

- Using the proctree command for analyzing the issue is not a good idea, since the process will block in kexitx() if there is an operation on /proc being done

- If the process is marked <defunct>, that means that the process has not called waitpid() yet for getting the son's status. Maybe the parent is blocked in non-interruptible code where the signal handler cannot be called.

- In short, that may be due to many causes... Using kdb is the best way.

- Instead of proctree (which makes use of /proc), use: "ps -faT <pid>".


I'll try to reproduce here.

Regards

Tony

On 01/02/2017 at 21:26, Konstantin Knizhnik wrote:
On 02/01/2017 08:30 PM, REIX, Tony wrote:

....

About the zombie issue, I've discussed with my colleagues. Looks like the process keeps zombie till the father looks at its status. However, though I did that several times, I  do not remember well the details. And that should be not specific to AIX. I'll discuss with another colleague, tomorrow, who should understand this better than me.


1. The process is not in a zombie state (according to ps). It is in <exiting> state... That is something AIX-specific; I have not seen processes in this state on Linux.
2. I have implemented a simple test - a forkbomb. It creates 1000 children and then waits for them. It is about ten times slower than on Intel/Linux, but still much faster than 100 seconds. So there is some difference between a postgres backend and a dummy process doing nothing - just terminating immediately after returning from fork().
....

Regards,

Tony

On 01/02/2017 at 16:59, Konstantin Knizhnik wrote:
....



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
"REIX, Tony"
Date:

Hi Konstantin

I have an issue with pgbench. Any idea?


  # mkdir /tmp/PGS
 # chown pgstbf.staff /tmp/PGS
 
 # su pgstbf

 $ /opt/freeware/bin/initdb -D /tmp/PGS
 The files belonging to this database system will be owned by user "pgstbf".
 This user must also own the server process.
 
 The database cluster will be initialized with locale "C".
 The default database encoding has accordingly been set to "SQL_ASCII".
 The default text search configuration will be set to "english".
 
 Data page checksums are disabled.
 
 fixing permissions on existing directory /tmp/PGS ... ok
 creating subdirectories ... ok
 selecting default max_connections ... 100
 selecting default shared_buffers ... 128MB
 selecting dynamic shared memory implementation ... posix
 creating configuration files ... ok
 running bootstrap script ... ok
 performing post-bootstrap initialization ... ok
 syncing data to disk ... ok
 
 WARNING: enabling "trust" authentication for local connections
 You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
 
 Success. You can now start the database server using:
 


 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile start
 server starting
 

 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile status
  pg_ctl: server is running (PID: 11599920)
 /opt/freeware/bin/postgres_64 "-D" "/tmp/PGS"


 $ /usr/bin/createdb pgstbf
 $
 


 $ pgbench -i -s 1000
 creating tables...
 100000 of 100000000 tuples (0%) done (elapsed 0.29 s, remaining 288.09 s)
 ...
 100000000 of 100000000 tuples (100%) done (elapsed 42.60 s, remaining 0.00 s)
 ERROR:  could not extend file "base/16384/24614": wrote only 7680 of 8192 bytes at block 131071
 HINT:  Check free disk space.

 CONTEXT:  COPY pgbench_accounts, line 7995584
 PQendcopy failed


After cleaning out /tmp/PGS and symlinking it to /home, where I have 6 GB free, I retried and got nearly the same:


 100000000 of 100000000 tuples (100%) done (elapsed 204.65 s, remaining 0.00 s)
 ERROR:  could not extend file "base/16384/16397.6": No space left on device
 HINT:  Check free disk space.
 CONTEXT:  COPY pgbench_accounts, line 51235802
PQendcopy failed


Do I need more than 6GB ???
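(Doing the math: scale 1000 means 100 million rows in pgbench_accounts; at roughly 100 bytes per row that is already on the order of 10 GB of table data before indexes and WAL, so presumably yes.)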


Thanks

Tony


$ df -k .
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd1         45088768   6719484   86%   946016    39% /home

bash-4.3$ pwd
/tmp/PGS

bash-4.3$ ll /tmp/PGS
lrwxrwxrwx    1 root     system           10 Feb  2 08:43 /tmp/PGS -> /home/PGS/


$ df -k
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4           524288    277284   48%    10733    14% /
/dev/hd2          6684672    148896   98%    49303    48% /usr
/dev/hd9var       2097152    314696   85%    24934    18% /var
/dev/hd3          3145728   2527532   20%      418     1% /tmp
/dev/hd1         45088768   6719484   86%   946016    39% /home
/dev/hd11admin      131072    130692    1%        7     1% /admin
/proc                   -         -    -         -     -  /proc
/dev/hd10opt     65273856    829500   99%   938339    41% /opt
/dev/livedump      262144    261776    1%        4     1% /var/adm/ras/livedump
/aha                    -         -    -        18     1% /aha

$ cat logfile
LOG:  database system was shut down at 2017-02-02 09:08:31 CST
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
ERROR:  could not extend file "base/16384/16397.6": No space left on device
HINT:  Check free disk space.
CONTEXT:  COPY pgbench_accounts, line 51235802
STATEMENT:  copy pgbench_accounts from stdin


$ ulimit -a
core file size          (blocks, -c) 1048575
data seg size           (kbytes, -d) 131072
file size               (blocks, -f) unlimited
max memory size         (kbytes, -m) 32768
open files                      (-n) 2000
pipe size            (512 bytes, -p) 64
stack size              (kbytes, -s) 32768
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited


bash-4.3$ ll /tmp/PGS
lrwxrwxrwx    1 root     system           10 Feb  2 08:43 /tmp/PGS -> /home/PGS/
bash-4.3$ ls -l
total 120
-rw-------    1 pgstbf   staff             4 Feb  2 09:08 PG_VERSION
drwx------    6 pgstbf   staff           256 Feb  2 09:09 base
drwx------    2 pgstbf   staff          4096 Feb  2 09:09 global
-rw-------    1 pgstbf   staff           410 Feb  2 09:13 logfile
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_clog
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_commit_ts
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_dynshmem
-rw-------    1 pgstbf   staff          4462 Feb  2 09:08 pg_hba.conf
-rw-------    1 pgstbf   staff          1636 Feb  2 09:08 pg_ident.conf
drwx------    4 pgstbf   staff           256 Feb  2 09:08 pg_logical
drwx------    4 pgstbf   staff           256 Feb  2 09:08 pg_multixact
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_notify
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_replslot
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_serial
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_snapshots
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_stat
drwx------    2 pgstbf   staff           256 Feb  2 09:17 pg_stat_tmp
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_subtrans
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_tblspc
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_twophase
drwx------    3 pgstbf   staff           256 Feb  2 09:08 pg_xlog
-rw-------    1 pgstbf   staff            88 Feb  2 09:08 postgresql.auto.conf
-rw-------    1 pgstbf   staff         22236 Feb  2 09:08 postgresql.conf
-rw-------    1 pgstbf   staff            46 Feb  2 09:08 postmaster.opts
-rw-------    1 pgstbf   staff            69 Feb  2 09:08 postmaster.pid
bash-4.3$ ls -l base
total 112
drwx------    2 pgstbf   staff         16384 Feb  2 09:08 1
drwx------    2 pgstbf   staff         12288 Feb  2 09:08 12407
drwx------    2 pgstbf   staff         12288 Feb  2 09:09 12408
drwx------    2 pgstbf   staff         16384 Feb  2 09:14 16384
bash-4.3$ ls -l base/16384/
total 15200
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 112
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 113
-rw-------    1 pgstbf   staff         57344 Feb  2 09:09 12243
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 12243_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12243_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 12245
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12247
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12248
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 12248_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12248_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 12250
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12252
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12253
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 12253_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12253_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 12255
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12257
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12258
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 12258_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12258_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 12260
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12262
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12263
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 12263_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12263_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 12265
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12267
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12268
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 12268_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12268_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 12270
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12272
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 12273
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 12275
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 12277
-rw-------    1 pgstbf   staff         73728 Feb  2 09:14 1247
-rw-------    1 pgstbf   staff         24576 Feb  2 09:14 1247_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:14 1247_vm
-rw-------    1 pgstbf   staff        368640 Feb  2 09:14 1249
-rw-------    1 pgstbf   staff         24576 Feb  2 09:14 1249_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:14 1249_vm
-rw-------    1 pgstbf   staff        589824 Feb  2 09:09 1255
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 1255_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 1255_vm
-rw-------    1 pgstbf   staff         90112 Feb  2 09:14 1259
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 1259_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:14 1259_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 1417
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 1417_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 1418
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 1418_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 16385
-rw-------    1 pgstbf   staff        450560 Feb  2 09:14 16388
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 16388_fsm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 16391
-rw-------    1 pgstbf   staff         40960 Feb  2 09:14 16394
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 16394_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 174
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 175
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2187
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2328
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2328_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2336
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2336_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2337
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2600
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2600_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2600_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2601
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2601_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2601_vm
-rw-------    1 pgstbf   staff         49152 Feb  2 09:09 2602
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2602_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2602_vm
-rw-------    1 pgstbf   staff         40960 Feb  2 09:09 2603
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2603_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2603_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2604
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2604_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2605
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2605_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2605_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2606
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2606_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2606_vm
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2607
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2607_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2607_vm
-rw-------    1 pgstbf   staff        450560 Feb  2 09:14 2608
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2608_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:14 2608_vm
-rw-------    1 pgstbf   staff        278528 Feb  2 09:09 2609
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2609_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2609_vm
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2610
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2610_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2610_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2611
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2611_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2612
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2612_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2612_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2613
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2613_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2615
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2615_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2615_vm
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2616
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2616_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2616_vm
-rw-------    1 pgstbf   staff        122880 Feb  2 09:09 2617
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2617_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2617_vm
-rw-------    1 pgstbf   staff         98304 Feb  2 09:09 2618
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2618_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2618_vm
-rw-------    1 pgstbf   staff        122880 Feb  2 09:14 2619
-rw-------    1 pgstbf   staff         24576 Feb  2 09:14 2619_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:14 2619_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2620
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2620_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2650
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2651
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2652
-rw-------    1 pgstbf   staff         40960 Feb  2 09:09 2653
-rw-------    1 pgstbf   staff         40960 Feb  2 09:09 2654
-rw-------    1 pgstbf   staff         40960 Feb  2 09:09 2655
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2656
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2657
-rw-------    1 pgstbf   staff        106496 Feb  2 09:14 2658
-rw-------    1 pgstbf   staff         73728 Feb  2 09:14 2659
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2660
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2661
-rw-------    1 pgstbf   staff         32768 Feb  2 09:14 2662
-rw-------    1 pgstbf   staff         40960 Feb  2 09:14 2663
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2664
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2665
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2666
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2667
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2668
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2669
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2670
-rw-------    1 pgstbf   staff        319488 Feb  2 09:14 2673
-rw-------    1 pgstbf   staff        352256 Feb  2 09:14 2674
-rw-------    1 pgstbf   staff        172032 Feb  2 09:09 2675
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2678
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2679
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2680
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2681
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2682
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2683
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2684
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2685
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2686
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2687
-rw-------    1 pgstbf   staff         40960 Feb  2 09:09 2688
-rw-------    1 pgstbf   staff         40960 Feb  2 09:09 2689
-rw-------    1 pgstbf   staff         81920 Feb  2 09:09 2690
-rw-------    1 pgstbf   staff        253952 Feb  2 09:09 2691
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2692
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2693
-rw-------    1 pgstbf   staff         16384 Feb  2 09:14 2696
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2699
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2701
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2702
-rw-------    1 pgstbf   staff         16384 Feb  2 09:14 2703
-rw-------    1 pgstbf   staff         40960 Feb  2 09:14 2704
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2753
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2753_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2753_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2754
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2755
-rw-------    1 pgstbf   staff         32768 Feb  2 09:09 2756
-rw-------    1 pgstbf   staff         32768 Feb  2 09:09 2757
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2830
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2830_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2831
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2832
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2832_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2833
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2834
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2834_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2835
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2836
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2836_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2837
-rw-------    1 pgstbf   staff        385024 Feb  2 09:09 2838
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2838_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2838_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2839
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2840
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 2840_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2840_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 2841
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2995
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 2995_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 2996
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3079
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 3079_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3079_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3080
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3081
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3085
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3118
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3118_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3119
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3164
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3256
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3256_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3257
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3258
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3394
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 3394_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3394_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3395
-rw-------    1 pgstbf   staff         32768 Feb  2 09:14 3455
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3456
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 3456_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3456_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3466
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3466_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3467
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3468
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3501
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3501_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3502
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3503
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3534
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3541
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 3541_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3541_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3542
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3574
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3575
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3576
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3576_vm
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3596
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3596_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3597
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3598
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 3598_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3599
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3600
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 3600_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3600_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3601
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 3601_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3601_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3602
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 3602_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3602_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3603
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 3603_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3603_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3604
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3605
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3606
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3607
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3608
-rw-------    1 pgstbf   staff         32768 Feb  2 09:09 3609
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3712
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3764
-rw-------    1 pgstbf   staff         24576 Feb  2 09:09 3764_fsm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 3764_vm
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3766
-rw-------    1 pgstbf   staff         16384 Feb  2 09:09 3767
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 548
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 549
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 826
-rw-------    1 pgstbf   staff             0 Feb  2 09:09 826_vm
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 827
-rw-------    1 pgstbf   staff          8192 Feb  2 09:09 828
-rw-------    1 pgstbf   staff             4 Feb  2 09:09 PG_VERSION
-rw-------    1 pgstbf   staff           512 Feb  2 09:09 pg_filenode.map
-rw-------    1 pgstbf   staff        112660 Feb  2 09:09 pg_internal.init



On 01/02/2017 at 21:26, Konstantin Knizhnik wrote:
On 02/01/2017 08:30 PM, REIX, Tony wrote:

Hi Konstantin, ....

If you could share the scripts or instructions for the tests you are doing with pgbench, I would reproduce them here.


You do not need any script.
Just two simple commands.
One to initialize the database:

pgbench -i -s 1000

And another to run the benchmark itself:

pgbench -c 100 -j 20 -P 1 -T 1000000000

(That is 100 client connections, 20 worker threads, progress reported every second, and a practically unlimited run time.)
...

Regards,

Tony

On 01/02/2017 at 16:59, Konstantin Knizhnik wrote:
Hi Tony,

On 01.02.2017 18:42, REIX, Tony wrote:

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least, since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate, aggregates.


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016


So maybe you are not using XLC v13.1.3.3 but another sub-version. Or perhaps you are passing more options to configure?


Configure.

What options do you give to configure?


export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"


Hard load & 64 cores? OK. That clearly explains why I do not see this issue.


pgbench? I wanted to run it. However, I'm still looking for where to get it, plus a guide for using it for testing.


pgbench is part of the Postgres distribution (src/bin/pgbench)


I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome!


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed? Any PostgreSQL performance benchmark that I could find and use? pgbench?

pgbench is the most widely used tool for simulating an OLTP workload. Certainly it is quite primitive and its results are rather artificial; TPC-C seems to be a better choice.
But the best option is to implement your own benchmark simulating the actual workload of your real application.

- I'm interested in any information for improving the performance & quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like the Perzl RPMs. My company (ATOS/Bull) has been selling IBM Power machines under the Escala brand for ages: 25 years this year.)


How to help ?

How could I help improve the quality and performance of PostgreSQL on AIX?


We still have one open issue on AIX: see https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It would be great if you could somehow help to fix this problem.



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
On 02.02.2017 18:20, REIX, Tony wrote:

Hi Konstantin

I have an issue with pgbench. Any idea?



The pgbench -s option specifies the scale factor.
Scale 1000 corresponds to 100 million rows (pgbench_accounts holds 100,000 rows per scale unit) and requires about 16GB of disk space, so 6GB is not enough.

  # mkdir /tmp/PGS
 # chown pgstbf.staff /tmp/PGS
 
 # su pgstbf

 $ /opt/freeware/bin/initdb -D /tmp/PGS
 The files belonging to this database system will be owned by user "pgstbf".
 This user must also own the server process.
 
 The database cluster will be initialized with locale "C".
 The default database encoding has accordingly been set to "SQL_ASCII".
 The default text search configuration will be set to "english".
 
 Data page checksums are disabled.
 
 fixing permissions on existing directory /tmp/PGS ... ok
 creating subdirectories ... ok
 selecting default max_connections ... 100
 selecting default shared_buffers ... 128MB
 selecting dynamic shared memory implementation ... posix
 creating configuration files ... ok
 running bootstrap script ... ok
 performing post-bootstrap initialization ... ok
 syncing data to disk ... ok
 
 WARNING: enabling "trust" authentication for local connections
 You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
 
 Success. You can now start the database server using:
 


 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile start
 server starting
 

 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile status
  pg_ctl: server is running (PID: 11599920)
 /opt/freeware/bin/postgres_64 "-D" "/tmp/PGS"


 $ /usr/bin/createdb pgstbf
 $
 


 $ pgbench -i -s 1000
 creating tables...
 100000 of 100000000 tuples (0%) done (elapsed 0.29 s, remaining 288.09 s)
 ...
 100000000 of 100000000 tuples (100%) done (elapsed 42.60 s, remaining 0.00 s)
 ERROR:  could not extend file "base/16384/24614": wrote only 7680 of 8192 bytes at block 131071
 HINT:  Check free disk space.

 CONTEXT:  COPY pgbench_accounts, line 7995584
 PQendcopy failed


After cleaning out /tmp/PGS and symlinking it to /home, where I have 6GB free, I retried and got nearly the same:


 100000000 of 100000000 tuples (100%) done (elapsed 204.65 s, remaining 0.00 s)
 ERROR:  could not extend file "base/16384/16397.6": No space left on device
 HINT:  Check free disk space.
 CONTEXT:  COPY pgbench_accounts, line 51235802
PQendcopy failed


Do I need more than 6GB???


Thanks

Tony


$ df -k .
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd1         45088768   6719484   86%   946016    39% /home

bash-4.3$ pwd
/tmp/PGS

bash-4.3$ ll /tmp/PGS
lrwxrwxrwx    1 root     system           10 Feb  2 08:43 /tmp/PGS -> /home/PGS/


$ df -k
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4           524288    277284   48%    10733    14% /
/dev/hd2          6684672    148896   98%    49303    48% /usr
/dev/hd9var       2097152    314696   85%    24934    18% /var
/dev/hd3          3145728   2527532   20%      418     1% /tmp
/dev/hd1         45088768   6719484   86%   946016    39% /home
/dev/hd11admin      131072    130692    1%        7     1% /admin
/proc                   -         -    -         -     -  /proc
/dev/hd10opt     65273856    829500   99%   938339    41% /opt
/dev/livedump      262144    261776    1%        4     1% /var/adm/ras/livedump
/aha                    -         -    -        18     1% /aha

$ cat logfile
LOG:  database system was shut down at 2017-02-02 09:08:31 CST
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
ERROR:  could not extend file "base/16384/16397.6": No space left on device
HINT:  Check free disk space.
CONTEXT:  COPY pgbench_accounts, line 51235802
STATEMENT:  copy pgbench_accounts from stdin


$ ulimit -a
core file size          (blocks, -c) 1048575
data seg size           (kbytes, -d) 131072
file size               (blocks, -f) unlimited
max memory size         (kbytes, -m) 32768
open files                      (-n) 2000
pipe size            (512 bytes, -p) 64
stack size              (kbytes, -s) 32768
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited


...


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Konstantin Knizhnik
Date:
Hi Tony,

On 02.02.2017 17:10, REIX, Tony wrote:

Hi Konstantin

I've discussed the "zombie/exit" issue with our expert here.

- He does not think that AIX has anything special here

- If the process is marked <exiting> in ps, this is because the flag SEXIT is set, thus the process is blocked somewhere in the kexitx() syscall, waiting for something.

- In order to know what it is waiting for, the best would be to have a look with kdb.


kdb shows the following stack:

pvthread+073000 STACK:
[005E1958]slock+000578 (00000000005E1958, 8000000000001032 [??])
[00009558].simple_lock+000058 ()
[00651DBC]vm_relalias+00019C (??, ??, ??, ??, ??)
[006544AC]vm_map_entry_delete+00074C (??, ??, ??)
[00659C30]vm_map_delete+000150 (??, ??, ??, ??)
[00659D88]vm_map_deallocate+000048 (??, ??)
[0011C588]kexitx+001408 (??)
[000BB08C]kexit+00008C ()
___ Recovery (FFFFFFFFFFF9290) ___
WARNING: Eyecatcher/version mismatch in RWA


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Noah Misch
Date:
On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
> On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
> >Attached please find my patch for XLC/AIX.
> >The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
> >The comment in this file says that:
> >
> >       * __fetch_and_add() emits a leading "sync" and trailing "isync",
> >thereby
> >       * providing sequential consistency.  This is undocumented.
> >
> >But it is not true any more (I checked generated assembler code in
> >debugger).
> >This is why I have added __sync() to this function. Now pgbench working
> >normally.

Konstantin, does "make -C src/bin/pgbench check" fail >10% of the time in the
bad build?

> Seems like it was not so much undocumented, but an implementation detail
> that was not guaranteed after all..

Seems so.

> There was a long thread on these things the last time this was changed:
> https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de.
> I couldn't find an explanation there of why we thought that fetch_and_add
> implicitly performs sync and isync.

It was in the generated code, for AIX xlc 12.1.0.0.

> >Also there is mysterious disappearance of assembler section function
> >with sync instruction from pg_atomic_compare_exchange_u32_impl.
> >I have fixed it by using __sync() built-in function instead.
> 
> __sync() seems more appropriate there, anyway. We're using intrinsics for
> all the other things in generic-xlc.h. But it sure is scary that the "asm"
> sections just disappeared.

That is a problem, but it's a stretch to conclude that asm sections are
generally prone to removal, while intrinsics are generally durable.

> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
>  static inline uint32
>  pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
>  {
> +    uint32        ret;
> +
>      /*
> -     * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
> -     * providing sequential consistency.  This is undocumented.
> +     * Use __sync() before and __isync() after, like in compare-exchange
> +     * above.
>       */
> -    return __fetch_and_add((volatile int *)&ptr->value, add_);
> +    __sync();
> +
> +    ret = __fetch_and_add((volatile int *)&ptr->value, add_);
> +
> +    __isync();
> +
> +    return ret;
>  }

Since this emits double syncs with older xlc, I recommend instead replacing
the whole thing with inline asm.  As I opined in the last message of the
thread you linked above, the intrinsics provide little value as abstractions
if one checks the generated code to deduce how to use them.  Now that the
generated code is xlc-version-dependent, the port is better off with
compiler-independent asm like we have for ppc in s_lock.h.
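
For illustration, here is a minimal sketch of the kind of compiler-independent
inline asm suggested above, modeled on the existing ppc code in s_lock.h. It
assumes PostgreSQL's uint32/int32 and pg_atomic_uint32 types from c.h and
src/include/port/atomics.h; this is only a sketch of the approach, not the
patch that was eventually committed.

static inline uint32
pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
{
	uint32		old;
	uint32		tmp;

	__asm__ __volatile__(
		"	sync			\n"	/* full barrier before the atomic op */
		"1:	lwarx	%0,0,%4		\n"	/* load old value, take reservation */
		"	add	%1,%0,%3	\n"	/* tmp = old + add_ */
		"	stwcx.	%1,0,%4		\n"	/* store tmp if reservation still held */
		"	bne	1b		\n"	/* reservation lost, retry */
		"	isync			\n"	/* keep later loads after the update */
		: "=&r"(old), "=&r"(tmp), "+m"(ptr->value)
		: "r"(add_), "r"(&ptr->value)
		: "memory", "cc");

	return old;
}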



Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Michael Paquier
Date:
On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
> On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
>> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
>>  static inline uint32
>>  pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
>>  {
>> +    uint32        ret;
>> +
>>      /*
>> -     * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
>> -     * providing sequential consistency.  This is undocumented.
>> +     * Use __sync() before and __isync() after, like in compare-exchange
>> +     * above.
>>       */
>> -    return __fetch_and_add((volatile int *)&ptr->value, add_);
>> +    __sync();
>> +
>> +    ret = __fetch_and_add((volatile int *)&ptr->value, add_);
>> +
>> +    __isync();
>> +
>> +    return ret;
>>  }
>
> Since this emits double syncs with older xlc, I recommend instead replacing
> the whole thing with inline asm.  As I opined in the last message of the
> thread you linked above, the intrinsics provide little value as abstractions
> if one checks the generated code to deduce how to use them.  Now that the
> generated code is xlc-version-dependent, the port is better off with
> compiler-independent asm like we have for ppc in s_lock.h.

Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
past versions? I think that it would make the code more understandable
than directly listing the instructions. As there have been other
bug reports from Tony Reix, who has been working on AIX with XLC 13.1,
and as this thread got lost in the wild, I have added an entry in the next
CF:
https://commitfest.postgresql.org/17/1484/
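
A minimal sketch of what such a version guard could look like, assuming xlc's
standard __xlC__ version macro (0xVVRR encoding, e.g. 0x0d01 for V13.1); the
V13.1 cutoff below is an illustrative assumption, not a tested value.

static inline uint32
pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
{
#if defined(__xlC__) && __xlC__ >= 0x0d01
	/* newer xlc: assume __fetch_and_add() no longer emits barriers itself */
	uint32		ret;

	__sync();				/* full barrier before the atomic op */
	ret = __fetch_and_add((volatile int *) &ptr->value, add_);
	__isync();				/* keep later loads after the update */

	return ret;
#else
	/* older xlc emitted a leading "sync" and trailing "isync" itself */
	return __fetch_and_add((volatile int *) &ptr->value, add_);
#endif
}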

As Heikki is not around these days, Noah, could you provide a new
version of the patch? This bug has been around for some time now; it
would be nice to move on. I think I could have written patches myself,
but I don't have an AIX machine at hand. Of course not with XLC 13.1.
--
Michael

Attachments

RE: [HACKERS] Deadlock in XLogInsert at AIX

From
"REIX, Tony"
Date:
Hi Michael,

My team and my company (ATOS/Bull) are involved in improving the quality of PostgreSQL on AIX.

We have AIX 6.1, 7.1, and 7.2 Power8 systems, with several logical/physical processors.
And I plan to have a more powerful (more processors) machine for running PostgreSQL stress tests.
A DB-expert colleague has started to write some new not-too-complex stress tests that we'd like to submit to the PostgreSQL project later.
For now, using the latest versions of XLC 12 (12.1.0.19) and 13 (13.1.3.4 with a patch), we have only one remaining random failure (on AIX 6.1 and 7.2, dealing with the src/bin/pgbench/t/001_pgbench.pl test), for PostgreSQL 9.6.6 and 10.1. And, on AIX 7.1, we have one more remaining failure that may be due to some other dependent software. Investigating.
XLC 13.1.3.4 shows an issue with -O2 and I have a work-around that fixes it in ./src/backend/parser/gram.c. We have opened a PMR (defect) against XLC.
Note that our tests are now executed without the PG_FORCE_DISABLE_INLINE "inline" trick in src/include/port/aix.h that suppresses the inlining of routines on AIX. I think that older versions of XLC have shown issues that have now disappeared (or, at least, many of them).
I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1 and, using the times outputs provided by the PostgreSQL tests, XLC seems to provide at least 8% more speed. We also plan to run professional performance tests in order to compare PostgreSQL 10.1 on AIX vs Linux/Power. I saw some 2017 performance slides, made with older versions of PostgreSQL and XLC, that show bad PostgreSQL performance on AIX vs Linux/Power, and I cannot believe it. We plan to investigate this.

Though I have very, very little skills about PostgreSQL (I'm now porting GCC Go on AIX, too), we can help, at least by compiling/testing/investigating/stressing in a different AIX environment than the ones (32/64-bit, XLC/GCC) you have in your BuildFarm.
Let me know how we can help.

Regards,

Tony Reix

ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net

________________________________________
From: Michael Paquier [michael.paquier@gmail.com]
Sent: Tuesday, January 16, 2018, 08:12
To: Noah Misch
Cc: Heikki Linnakangas; Konstantin Knizhnik; PostgreSQL Hackers; Bernd Helmle
Subject: Re: [HACKERS] Deadlock in XLogInsert at AIX

...


Re: [HACKERS] Deadlock in XLogInsert at AIX

From
Michael Paquier
Date:
On Tue, Jan 16, 2018 at 08:25:51AM +0000, REIX, Tony wrote:
> My team and my company (ATOS/Bull) are involved in improving the
> quality of PostgreSQL on AIX.

Cool to hear that!

> We have AIX 6.1, 7.1, and 7.2 Power8 systems, with several
> logical/physical processors. And I plan to have a more powerful (more
> processors) machine for running PostgreSQL stress tests.
> A DB-expert colleague has started to write some new not-too-complex
> stress tests that we'd like to submit to PostgreSQL project later.
> For now, using latest versions of XLC 12 (12.1.0.19) and 13 (13.1.3.4
> with a patch), we have only (on AIX 6.1 and 7.2) one remaining random
> failure (dealing with src/bin/pgbench/t/001_pgbench.pl test), for
> PostgreSQL 9.6.6 and 10.1 . And, on AIX 7.1, we have one more
> remaining failure that may be due to some other dependent
> software. Investigating.
> XLC 13.1.3.4 shows an issue with -O2 and I have a work-around that
> fixes it in ./src/backend/parser/gram.c . We have opened a PMR
> (defect) against XLC. Note that our tests are now executed without the
> PG_FORCE_DISABLE_INLINE "inline" trick in src/include/port/aix.h that
> suppresses the inlining of routines on AIX. I think that older
> versions of XLC have shown issues that have now disappeared (or, at
> least, many of them).
> I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1 and,
> using times outputs provided by PostgreSQL tests, XLC seems to provide
> at least 8% more speed. We also plan to run professional performance
> tests in order to compare PostgreSQL 10.1 on AIX vs Linux/Power. I saw
> some 2017 performance slides, made with older versions of PostgreSQL
> and XLC, that show bad PostgreSQL performance on AIX vs Linux/Power,
> and I cannot believe it. We plan to investigate this.

That's an interesting investigation. The community is always interested in
such stories. You could have material for a conference talk.

> Though I have very very little skills about PostgreSQL (I'm porting
> too now GCC Go on AIX), we can help, at least by
> compiling/testing/investigating/stressing in a different AIX
> environment than the AIX ones (32/64bit, XLC/GCC) you have in your
> BuildFarm.

Setting up a buildfarm member with the combination of compiler and
environment where you are seeing the failures would be the best answer
in my opinion:
https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto

This does not require special knowledge of PostgreSQL internals, and the
in-core testing framework has improved over the last couple of years to
allow for more advanced tests. I do use it as well for some tests on my
own modules (company stuff). The buildfarm code has also kept pace, which
really helps a lot, thanks to Andrew Dunstan.

Developers and committers are more pro-active if they can see automated
tests failing in the central community place. And buildfarm animals
usually don't stay red for more than a couple of days.
--
Michael

Attachments

RE: [HACKERS] Deadlock in XLogInsert at AIX

From:
"REIX, Tony"
Date:
Hi Michael

You said:

> Setting up a buildfarm member with the combination of compiler and
> environment where you are seeing the failures would be the best answer
> in my opinion:
> https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto
>
> This does not require special knowledge of PostgreSQL internals, and the
> in-core testing framework has improved over the last couple of years to
> allow for more advanced tests. I do use it as well for some tests on my
> own modules (company stuff). The buildfarm code has also kept pace, which
> really helps a lot, thanks to Andrew Dunstan.
>
> Developers and committers are more pro-active if they can see automated
> tests failing in the central community place. And buildfarm animals
> usually don't stay red for more than a couple of days.

Hummmm I quickly read this HowTo and I did not find any explanation about the "protocol"
used for exchanging data between my VM and the PostgreSQL BuildFarm.
My machine is behind firewalls and has restricted access to the outside.
Either I'll find out when something does not work... or I can get some information about
which port (or anything else) I have to ask to be opened, if needed.
Anyway, I'll read it in depth now and try to implement it.


About the random error: I guess that I may see it, though the PostgreSQL BuildFarm AIX VMs do not,
because I'm using a not-too-small VM with variable Physical Processing units (CPUs): my VM is uncapped
(it may use all physical CPUs if available), with up to 4 physical processors and up to 8 virtual processors.
And, on the BuildFarm, I do not see any details about the logical/physical configuration of the AIX VMs, like hornet.
Being able to run truly concurrent parallel stress programs, which requires a multi-physical-CPU VM, would help.

Regards,

Tony


Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Andrew Dunstan
Date:

On 01/16/2018 08:50 AM, REIX, Tony wrote:
> Hi Michael
>
> You said:
>
>> Setting up a buildfarm member with the combination of compiler and
>> environment where you are seeing the failures would be the best answer
>> in my opinion:
>> https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto
>>
>> This does not require special knowledge of PostgreSQL internals, and the
>> in-core testing framework has improved over the last couple of years to
>> allow for more advanced tests. I do use it as well for some tests on my
>> own modules (company stuff). The buildfarm code has also kept pace, which
>> really helps a lot, thanks to Andrew Dunstan.
>>
>> Developers and committers are more pro-active if they can see automated
>> tests failing in the central community place. And buildfarm animals
>> usually don't stay red for more than a couple of days.
> Hummmm I quickly read this HowTo and I did not find any explanation about the "protocol"
> used for exchanging data between my VM and the PostgreSQL BuildFarm.
> My machine is behind firewalls and has restricted access to the outside.
> Either I'll find out when something does not work... or I can get some information about
> which port (or anything else) I have to ask to be opened, if needed.
> Anyway, I'll read it in depth now and try to implement it.
>
>
>


Communication is only done via outbound port 443 (https). There are no
passwords required and no inbound connections, ever. Uploads are signed
using a shared secret. Communication can be via a proxy. If you need
the client to use a proxy with git that's a bit more complex, but possible.

Ping me if you need help setting this up.

cheers

andrew

-- 

Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Noah Misch
Date:
On Tue, Jan 16, 2018 at 01:50:29PM +0000, REIX, Tony wrote:
> And, on the BuildFarm, I do not see any details about the logical/physical configuration of the AIX VMs, like hornet.
> Being able to run truly concurrent parallel stress programs, which requires a multi-physical-CPU VM, would help.

It has 48 virtual CPUs.  Here's the prtconf output:

System Model: IBM,8231-E2C
Machine Serial Number: 104C0CT
Processor Type: PowerPC_POWER7
Processor Implementation Mode: POWER 7
Processor Version: PV_7_Compat
Number Of Processors: 12
Processor Clock Speed: 3720 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 1 10-4C0CT
Memory Size: 127488 MB
Good Memory Size: 127488 MB
Platform Firmware level: AL740_100
Firmware Version: IBM,AL740_100
Console Login: enable
Auto Restart: true
Full Core: false
NX Crypto Acceleration: Not Capable
 
Network Information
    Host Name: power-aix
    IP Address: 140.211.15.154
    Sub Netmask: 255.255.255.0
    Gateway: 140.211.15.1
    Name Server: 140.211.166.130
    Domain Name: osuosl.org
 
Paging Space Information
    Total Paging Space: 12288MB
    Percent Used: 1%
 
Volume Groups Information
============================================================================== 
Active VGs
============================================================================== 
homevg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk1            active            558         183         00..00..00..71..112
hdisk2            active            558         0           00..00..00..00..00
hdisk3            active            558         0           00..00..00..00..00
hdisk4            active            558         0           00..00..00..00..00
============================================================================== 
 
rootvg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk0            active            558         510         111..93..83..111..112
hdisk5            active            558         514         111..97..83..111..112
============================================================================== 
 
INSTALLED RESOURCE LIST

The following resources are installed on the machine.
+/- = Added or deleted from Resource List.
*   = Diagnostic support not available.
    
  Model Architecture: chrp
  Model Implementation: Multiple Processor, PCI bus
    
+ sys0                                         System Object
+ sysplanar0                                   System Planar
* vio0                                         Virtual I/O Bus
* vsa2             U78AB.001.WZSHZY0-P1-T2     LPAR Virtual Serial Adapter
* vty2             U78AB.001.WZSHZY0-P1-T2-L0  Asynchronous Terminal
* vsa1             U78AB.001.WZSHZY0-P1-T1     LPAR Virtual Serial Adapter
* vty1             U78AB.001.WZSHZY0-P1-T1-L0  Asynchronous Terminal
* vsa0             U8231.E2C.104C0CT-V1-C0     LPAR Virtual Serial Adapter
* vty0             U8231.E2C.104C0CT-V1-C0-L0  Asynchronous Terminal
* pci8             U78AB.001.WZSHZY0-P1        PCI Express Bus
* pci7             U78AB.001.WZSHZY0-P1        PCI Express Bus
* pci6             U78AB.001.WZSHZY0-P1        PCI Express Bus
* pci5             U78AB.001.WZSHZY0-P1        PCI Express Bus
* pci4             U78AB.001.WZSHZY0-P1        PCI Express Bus
* pci10            U78AB.001.WZSHZY0-P1-C2     PCI Bus
+ cor0             U78AB.001.WZSHZY0-P1-C2-T1  GXT145 Graphics Adapter
* pci3             U78AB.001.WZSHZY0-P1        PCI Express Bus
+ ent0             U78AB.001.WZSHZY0-P1-C7-T1  4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004)
+ ent1             U78AB.001.WZSHZY0-P1-C7-T2  4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004)
+ ent2             U78AB.001.WZSHZY0-P1-C7-T3  4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004)
+ ent3             U78AB.001.WZSHZY0-P1-C7-T4  4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004)
* pci2             U78AB.001.WZSHZY0-P1        PCI Express Bus
* pci1             U78AB.001.WZSHZY0-P1        PCI Express Bus
* pci9             U78AB.001.WZSHZY0-P1        PCI Bus
+ usbhc0           U78AB.001.WZSHZY0-P1        USB Host Controller (33103500)
+ usbhc1           U78AB.001.WZSHZY0-P1        USB Host Controller (33103500)
+ usbhc2           U78AB.001.WZSHZY0-P1        USB Enhanced Host Controller (3310e000)
* pci0             U78AB.001.WZSHZY0-P1        PCI Express Bus
+ sissas0          U78AB.001.WZSHZY0-P1-T9     PCIe x4 Planar 3Gb SAS Adapter
* sas0             U78AB.001.WZSHZY0-P1-T9     Controller SAS Protocol
* sfwcomm0                                     SAS Storage Framework Comm
+ hdisk0           U78AB.001.WZSHZY0-P3-D1     SAS Disk Drive (600000 MB)
+ hdisk1           U78AB.001.WZSHZY0-P3-D2     SAS Disk Drive (600000 MB)
+ hdisk2           U78AB.001.WZSHZY0-P3-D3     SAS Disk Drive (600000 MB)
+ hdisk3           U78AB.001.WZSHZY0-P3-D4     SAS Disk Drive (600000 MB)
+ hdisk4           U78AB.001.WZSHZY0-P3-D5     SAS Disk Drive (600000 MB)
+ hdisk5           U78AB.001.WZSHZY0-P3-D6     SAS Disk Drive (600000 MB)
+ ses0             U78AB.001.WZSHZY0-P2-Y1     SAS Enclosure Services Device
+ ses1             U78AB.001.WZSHZY0-P2-Y1     SAS Enclosure Services Device
* sata0            U78AB.001.WZSHZY0-P1-T9     Controller SATA Protocol
+ cd0              U78AB.001.WZSHZY0-P3-D7     SATA DVD-RAM Drive
+ L2cache0                                     L2 Cache
+ mem0                                         Memory
+ proc0                                        Processor
+ proc4                                        Processor
+ proc8                                        Processor
+ proc12                                       Processor
+ proc16                                       Processor
+ proc20                                       Processor
+ proc24                                       Processor
+ proc28                                       Processor
+ proc32                                       Processor
+ proc36                                       Processor
+ proc40                                       Processor
+ proc44                                       Processor


RE: [HACKERS] Deadlock in XLogInsert at AIX

From:
"REIX, Tony"
Date:
Thanks Noah !

Hummm You have a big machine, more powerful than mine. However, it seems that you do not see the random failure I see.


Regards,

Tony Reix

ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net



Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Andres Freund
Date:
On 2018-01-16 16:12:11 +0900, Michael Paquier wrote:
> On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
> > On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
> >> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
> >>  static inline uint32
> >>  pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
> >>  {
> >> +    uint32        ret;
> >> +
> >>      /*
> >> -     * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
> >> -     * providing sequential consistency.  This is undocumented.
> >> +     * Use __sync() before and __isync() after, like in compare-exchange
> >> +     * above.
> >>       */
> >> -    return __fetch_and_add((volatile int *)&ptr->value, add_);
> >> +    __sync();
> >> +
> >> +    ret = __fetch_and_add((volatile int *)&ptr->value, add_);
> >> +
> >> +    __isync();
> >> +
> >> +    return ret;
> >>  }
> > 
> > Since this emits double syncs with older xlc, I recommend instead replacing
> > the whole thing with inline asm.  As I opined in the last message of the
> > thread you linked above, the intrinsics provide little value as abstractions
> > if one checks the generated code to deduce how to use them.  Now that the
> > generated code is xlc-version-dependent, the port is better off with
> > compiler-independent asm like we have for ppc in s_lock.h.
> 
> Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
> past versions? I think that it would make the code more understandable
> than just listing directly the instructions.

Given the quality of the intrinsics on AIX, see past commits and the
comment in the code quoted above, I think we're much better off doing
this via inline asm.

Greetings,

Andres Freund


Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Noah Misch
Date:
On Tue, Jan 16, 2018 at 08:50:24AM -0800, Andres Freund wrote:
> On 2018-01-16 16:12:11 +0900, Michael Paquier wrote:
> > On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
> > > Since this emits double syncs with older xlc, I recommend instead replacing
> > > the whole thing with inline asm.  As I opined in the last message of the
> > > thread you linked above, the intrinsics provide little value as abstractions
> > > if one checks the generated code to deduce how to use them.  Now that the
> > > generated code is xlc-version-dependent, the port is better off with
> > > compiler-independent asm like we have for ppc in s_lock.h.
> > 
> > Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
> > past versions? I think that it would make the code more understandable
> > than just listing directly the instructions.
> 
> Given the quality of the intrinsics on AIX, see past commits and the
> > comment in the code quoted above, I think we're much better off doing
> this via inline asm.

For me, verifiability is the crucial benefit of inline asm.  Anyone with an
architecture manual can thoroughly review an inline asm implementation.  Given
intrinsics and __xlc_ver__ conditionals, the same level of review requires
access to every xlc version.

> > As there have been other
> > bug reports from Tony Reix, who has been working on AIX with XLC 13.1,
> > and since this thread got lost in the wild, I have added an entry in the
> > next CF:
> > https://commitfest.postgresql.org/17/1484/

The most recent patch version is Returned with Feedback.  As a matter of
procedure, I discourage creating commitfest entries as a tool to solicit new
patch versions.  If I were the author of a RwF patch, I would dislike finding
a commitfest entry that I did not create with myself listed as author.

If you do choose to proceed, the entry should be Waiting on Author.

> > As Heikki is not around these days, Noah, could you provide a new
> > version of the patch? This bug has been around for some time now; it
> > would be nice to move on.

Not soon.

Note that fixing this bug is just the start of accepting XLC 13.1 as a
compiler of PostgreSQL.  If we get a buildfarm member with a few dozen clean
runs (blocked by, at a minimum, fixing this and the inlining bug), we'll have
something.  Until then, support for XLC 13.1 is an anti-feature.

nm


Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Bernd Helmle
Date:
On Tuesday, 2018-01-16 at 08:25 +0000, REIX, Tony wrote:
> I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1
> and, using times outputs provided by PostgreSQL tests, XLC seems to
> provide at least 8% more speed. We also plan to run professional
> performance tests in order to compare PostgreSQL 10.1 on AIX vs
> Linux/Power. I saw some 2017 performance slides, made with older
> versions of PostgreSQL and XLC, that show bad PostgreSQL performance
> on AIX vs Linux/Power, and I cannot believe it. We plan to
> investigate this.

I assume you are referring to the attached graph I showed at
PGConf.US 2017?

The numbers we've got on that E850 machine (pgbench SELECT-only, scale
1000) weren't really good in comparison to Linux on the same machine.

We tried many options to make the performance better; overall the graph
shows the best performance from Linux *and* AIX with gcc, not XL C. We
used some knobs to get the best out of AIX:

export OBJECT_MODE=64; gcc -m64      # 64-bit object mode and compiler target
ldedit -b forkpolicy:cor -b textpsize:64K -b datapsize:64K -b stackpsize:64K postgres
                                     # copy-on-reference fork; 64K text/data/stack pages
export MALLOCOPTIONS=multiheap:16,considersize,pool,no_mallinfo
                                     # parallel malloc heaps to reduce allocator contention
schedo -p -o vpm_fold_policy=4       # virtual processor folding policy

There are many other things you can tune on AIX, but they didn't seem
to give the improvement we'd like to see.

Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Michael Paquier
Date:
On Wed, Jan 17, 2018 at 12:36:31AM -0800, Noah Misch wrote:
> For me, verifiability is the crucial benefit of inline asm.  Anyone with an
> architecture manual can thoroughly review an inline asm implementation.  Given
> intrinsics and __xlc_ver__ conditionals, the same level of review requires
> access to every xlc version.

Okay.

> The most recent patch version is Returned with Feedback.  As a matter of
> procedure, I discourage creating commitfest entries as a tool to solicit new
> patch versions.  If I were the author of a RwF patch, I would dislike finding
> a commitfest entry that I did not create with myself listed as author.

Per my understanding, this is a bug, and we don't want to lose track of
bugs.

> If you do choose to proceed, the entry should be Waiting on Author.

Right.

> Note that fixing this bug is just the start of accepting XLC 13.1 as a
> compiler of PostgreSQL.  If we get a buildfarm member with a few dozen clean
> runs (blocked by, at a minimum, fixing this and the inlining bug), we'll have
> something.  Until then, support for XLC 13.1 is an anti-feature.

Per my understanding of this thread, this is a bug. My point is that the
documentation states that AIX is supported from 4.3.3 to 6.1; however,
there are no restrictions related to the compiler, hence I would have
thought that the docs imply XLC 13.1 is a supported compiler. And IBM
states that XLC 13.1 is supported from AIX 6.1:
https://www-01.ibm.com/support/docview.wss?uid=swg21326972

True, the docs also tell readers to look at the buildfarm animals, but it
is not obvious that none of them use XLC 13.1 unless you look closely.
Perhaps an explicit mention of compiler compatibility in the docs would
help make the support range clear to anybody. I would expect more people
to look at the docs than at the buildfarm's internal contents.
--
Michael

Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Robert Haas
Date:
On Sat, Jan 20, 2018 at 6:16 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> The most recent patch version is Returned with Feedback.  As a matter of
>> procedure, I discourage creating commitfest entries as a tool to solicit new
>> patch versions.  If I were the author of a RwF patch, I would dislike finding
>> a commitfest entry that I did not create with myself listed as author.
>
> Per my understanding, this is a bug, and we don't want to lose track of
> bugs.

I agree with Noah.  It's true that having unfixed bugs isn't
particularly good, but it doesn't justify activating a CommitFest
entry under someone else's name.  If they don't decide to work further
on the problem, what are we going to do?  Keep the entry open forever,
nagging them continually until the end of time?  That won't net us
many new contributors.

If nobody is willing to put in the effort to keep AIX supported under
XLC, then we should just update the documentation to say that it isn't
supported.  Our support for that platform is pretty marginal anyway if
we're only supporting it up through 6.1; that was released in 2007 and
went out of support in Q2 of last year.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I agree with Noah.  It's true that having unfixed bugs isn't
> particularly good, but it doesn't justify activating a CommitFest
> entry under someone else's name.

Agreed.  The CF app is not a bug tracker.

> If nobody is willing to put in the effort to keep AIX supported under
> XLC, then we should just update the documentation to say that it isn't
> supported.  Our support for that platform is pretty marginal anyway if
> we're only supporting it up through 6.1; that was released in 2007 and
> went out of support in Q2 of last year.

Huh?  We have somebody just upthread saying that they're working out
issues on newer AIX.  We should at least give them time to finish
that research before making any decisions.

            regards, tom lane


Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Michael Paquier
Date:
On Mon, Jan 22, 2018 at 04:23:55PM -0500, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I agree with Noah.  It's true that having unfixed bugs isn't
>> particularly good, but it doesn't justify activating a CommitFest
>> entry under someone else's name.
>
> Agreed.  The CF app is not a bug tracker.

OK, thanks for the feedback. Based on that I am deleting this entry from
the CF. And my apologies for the noise.

>> If nobody is willing to put in the effort to keep AIX supported under
>> XLC, then we should just update the documentation to say that it isn't
>> supported.  Our support for that platform is pretty marginal anyway if
>> we're only supporting it up through 6.1; that was released in 2007 and
>> went out of support in Q2 of last year.
>
> Huh?  We have somebody just upthread saying that they're working out
> issues on newer AIX.  We should at least give them time to finish
> that research before making any decisions.

Yes, Tony stated so, still he has not published any patches
yet. Regarding the potential documentation impact, let me suggest simply
mentioning that XLC is supported up to 12.1; per the buildfarm this is
rather stable and does not face the same issues as what we are seeing
with XLC 13.1. Is there any reason for the docs not to say that now? As
far as I know, if support for XLC 13.1 moves on in the future, we can
update the docs so that the supported range is extended later on.
--
Michael

Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Robert Haas
Date:
On Mon, Jan 22, 2018 at 4:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If nobody is willing to put in the effort to keep AIX supported under
>> XLC, then we should just update the documentation to say that it isn't
>> supported.  Our support for that platform is pretty marginal anyway if
>> we're only supporting it up through 6.1; that was released in 2007 and
>> went out of support in Q2 of last year.
>
> Huh?  We have somebody just upthread saying that they're working out
> issues on newer AIX.  We should at least give them time to finish
> that research before making any decisions.

Well, I certainly have no problem with that in theory.  That's why my
sentence started with the word "if".

On a practical level, this thread has been dead for almost a year, and
has revived slightly because Michael bumped it, but nobody has really
made a firm commitment to do any specific thing to move this issue
forward.  I'm happy if they do, but it might not happen.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Noah Misch
Date:
On Wed, Jan 17, 2018 at 12:36:31AM -0800, Noah Misch wrote:
> On Tue, Jan 16, 2018 at 08:50:24AM -0800, Andres Freund wrote:
> > On 2018-01-16 16:12:11 +0900, Michael Paquier wrote:
> > > On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
> > > > Since this emits double syncs with older xlc, I recommend instead replacing
> > > > the whole thing with inline asm.  As I opined in the last message of the
> > > > thread you linked above, the intrinsics provide little value as abstractions
> > > > if one checks the generated code to deduce how to use them.  Now that the
> > > > generated code is xlc-version-dependent, the port is better off with
> > > > compiler-independent asm like we have for ppc in s_lock.h.
> > > 
> > > Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
> > > past versions? I think that it would make the code more understandable
> > > than just listing directly the instructions.
> > 
> > Given the quality of the intrinsics on AIX, see past commits and the
> > comment in the code quoted above, I think we're much better off doing
> > this via inline asm.
> 
> For me, verifiability is the crucial benefit of inline asm.  Anyone with an
> architecture manual can thoroughly review an inline asm implementation.  Given
> intrinsics and __xlc_ver__ conditionals, the same level of review requires
> access to every xlc version.

> > > As Heikki is not around these days, Noah, could you provide a new
> > > version of the patch? This bug has been around for some time now, it
> > > would be nice to move on.. 
> 
> Not soon.

Done.  fetch-add-variable-test-v1.patch just adds tests for non-constant
addends and 16-bit edge cases.  Today's implementation handles those,
PostgreSQL doesn't use them, and I might easily have broken them.
fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add()
intrinsic to inline asm.  fetch-add-gcc-xlc-unify-v1.patch moves fetch_add to
inline asm for all other ppc compilers.  gcc-7.2.0 generates equivalent code
before and after.  I plan to keep the third patch HEAD-only, back-patching the
other two.  I tested with xlc v12 and v13.
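
For readers who do not want to open the patches: all variants boil down to a
lwarx/stwcx. reservation loop bracketed by barriers, in the spirit of the
existing ppc TAS code in s_lock.h. A minimal sketch of the variable-addend
case (illustrative only; the committed code differs in detail, for example it
also carries an addi fast path for small constant addends):

static inline uint32
pg_atomic_fetch_add_u32_sketch(volatile pg_atomic_uint32 *ptr, int32 add_)
{
    uint32      _t;                 /* scratch: holds the new value */
    uint32      res;                /* the old value, returned to the caller */

    __asm__ __volatile__(
        "   sync                \n" /* full barrier before the update */
        "   lwarx   %1,0,%4     \n" /* load-reserve the old value */
        "   add     %0,%1,%3    \n" /* new = old + add_ */
        "   stwcx.  %0,0,%4     \n" /* store-conditional the new value */
        "   bne     $-12        \n" /* reservation lost: back to lwarx */
        "   isync               \n" /* acquire fence once the store sticks */
        : "=&r"(_t), "=&r"(res), "+m"(ptr->value)
        : "r"(add_), "r"(&ptr->value)
        : "memory", "cc");

    return res;                     /* fetch_add returns the pre-add value */
}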

Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes:
> Done.  fetch-add-variable-test-v1.patch just adds tests for non-constant
> addends and 16-bit edge cases.  Today's implementation handles those,
> PostgreSQL doesn't use them, and I might easily have broken them.
> fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add()
> intrinsic to inline asm.  fetch-add-gcc-xlc-unify-v1.patch moves fetch_add to
> inline asm for all other ppc compilers.  gcc-7.2.0 generates equivalent code
> before and after.  I plan to keep the third patch HEAD-only, back-patching the
> other two.  I tested with xlc v12 and v13.

Hm, no objection to the first two patches, but I don't understand
why the third patch goes to so much effort just to use "addi" rather
than (one assumes) "li" then "add"?  It doesn't seem likely that
that's buying much.

            regards, tom lane



Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Noah Misch
Date:
On Sat, Aug 31, 2019 at 02:27:55PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > Done.  fetch-add-variable-test-v1.patch just adds tests for non-constant
> > addends and 16-bit edge cases.  Today's implementation handles those,
> > PostgreSQL doesn't use them, and I might easily have broken them.
> > fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add()
> > intrinsic to inline asm.  fetch-add-gcc-xlc-unify-v1.patch moves fetch_add to
> > inline asm for all other ppc compilers.  gcc-7.2.0 generates equivalent code
> > before and after.  I plan to keep the third patch HEAD-only, back-patching the
> > other two.  I tested with xlc v12 and v13.
> 
> Hm, no objection to the first two patches, but I don't understand
> why the third patch goes to so much effort just to use "addi" rather
> than (one assumes) "li" then "add"?  It doesn't seem likely that
> that's buying much.

Changing an addi to li+add may not show up on benchmarks, but I can't claim
it's immaterial.  I shouldn't unify the code if that makes the compiled code
materially worse than what the gcc intrinsics produce today, hence the
nontrivial (~50 line) bits to match the intrinsics' capabilities.
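
For concreteness, the two instruction shapes being compared inside the
reservation loop look roughly like this (register numbers arbitrary):

    addi    r11,r9,16        <- one instruction: r11 = r9 + 16,
                                constant folded into the immediate field

    li      r0,16            <- versus two: materialize the constant,
    add     r11,r9,r0        <- then r11 = r9 + r0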



Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Noah Misch
Date:
On Sat, Aug 31, 2019 at 03:30:26PM -0700, Noah Misch wrote:
> On Sat, Aug 31, 2019 at 02:27:55PM -0400, Tom Lane wrote:
> > Noah Misch <noah@leadboat.com> writes:
> > > Done.  fetch-add-variable-test-v1.patch just adds tests for non-constant
> > > addends and 16-bit edge cases.  Today's implementation handles those,
> > > PostgreSQL doesn't use them, and I might easily have broken them.
> > > fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add()
> > > intrinsic to inline asm.  fetch-add-gcc-xlc-unify-v1.patch moves fetch_add to
> > > inline asm for all other ppc compilers.  gcc-7.2.0 generates equivalent code
> > > before and after.  I plan to keep the third patch HEAD-only, back-patching the
> > > other two.  I tested with xlc v12 and v13.
> > 
> > Hm, no objection to the first two patches, but I don't understand
> > why the third patch goes to so much effort just to use "addi" rather
> > than (one assumes) "li" then "add"?  It doesn't seem likely that
> > that's buying much.
> 
> Changing an addi to li+add may not show up on benchmarks, but I can't claim
> it's immaterial.  I shouldn't unify the code if that makes the compiled code
> materially worse than what the gcc intrinsics produce today, hence the
> nontrivial (~50 line) bits to match the intrinsics' capabilities.

The first two patches have worked so far, but fetch-add-gcc-xlc-unify-v1.patch
broke older gcc: https://postgr.es/m/flat/7517.1568470247@sss.pgh.pa.us.  That
side thread settled on putting pg_atomic_compare_exchange_u{32,64}_impl
implementations in arch-ppc.h.  Attached fetch-add-gcc-xlc-unify-v2.patch does
so.  For compare_exchange, the generated code doesn't match gcc intrinsics
exactly; code comments discuss this.  Compared to v1, its other change is to
extract common __asm__ templates into macros instead of repeating them four
times (32/64bit, variable/const).
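
A rough sketch of the shape such a compare-exchange loop takes (GNU-as local
labels; the committed code differs, notably in how it reports success and in
its handling of constant comparands):

static inline bool
pg_atomic_cas_u32_sketch(volatile uint32 *ptr, uint32 *expected, uint32 newval)
{
    uint32      found;

    __asm__ __volatile__(
        "   sync                \n"
        "1: lwarx   %0,0,%3     \n" /* load-reserve the current value */
        "   cmpw    %0,%1       \n" /* does it match *expected? */
        "   bne     2f          \n" /* no: give up and report it */
        "   stwcx.  %2,0,%3     \n" /* yes: try to install newval */
        "   bne     1b          \n" /* reservation lost: retry */
        "2: isync               \n"
        : "=&r"(found)
        : "r"(*expected), "r"(newval), "r"(ptr)
        : "memory", "cc");

    if (found == *expected)
        return true;                /* newval was installed */
    *expected = found;              /* report the value actually seen */
    return false;
}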

Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes:
> [ fetch-add-gcc-xlc-unify-v2.patch ]

This still fails on Apple's compilers.  The first failure I get is

ccache gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -O2 -I../../../src/include
-I/usr/local/include -isysroot /Developer/SDKs/MacOSX10.5.sdk    -c -o nodeHashjoin.o nodeHashjoin.c 
/var/tmp//ccXUM8ep.s:449:Parameter error: r0 not allowed for parameter 2 (code as 0 not r0)

Line 449 of the assembly file is the addi in

LM87:
                sync
        lwarx   r0,0,r2
        addi    r11,r0,1
        stwcx.  r11,0,r2
        bne             $-12
        isync

which I suppose comes out of PG_PPC_FETCH_ADD.  I find this idea of
constructing assembly code by string-pasting pretty unreadable and am not
tempted to try to debug it, but I don't immediately see why this doesn't
work when the existing s_lock.h code does.  I think that the assembler
error message is probably misleading: while it seems to be saying to
s/r0/0/ in the addi, gcc consistently uses "rN" syntax for the second
parameter elsewhere.  I do note that gcc never generates r0 as addi's
second parameter in several files I checked through, so maybe what it
means is "you need to use some other register"?  (Which would imply that
the constraint for this asm argument is too loose.)

I'm also wondering why this isn't following s_lock.h's lead as to
USE_PPC_LWARX_MUTEX_HINT and USE_PPC_LWSYNC.

            regards, tom lane



Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Noah Misch
Date:
On Mon, Oct 07, 2019 at 03:06:35PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > [ fetch-add-gcc-xlc-unify-v2.patch ]
> 
> This still fails on Apple's compilers.  The first failure I get is
> 
> ccache gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -O2 -I../../../src/include
> -I/usr/local/include -isysroot /Developer/SDKs/MacOSX10.5.sdk    -c -o nodeHashjoin.o nodeHashjoin.c
>
> /var/tmp//ccXUM8ep.s:449:Parameter error: r0 not allowed for parameter 2 (code as 0 not r0)
> 
> Line 449 of the assembly file is the addi in
> 
> LM87:
>                 sync
>         lwarx   r0,0,r2
>         addi    r11,r0,1
>         stwcx.  r11,0,r2
>         bne             $-12
>         isync
> 
> which I suppose comes out of PG_PPC_FETCH_ADD.  I find this idea of
> constructing assembly code by string-pasting pretty unreadable and am not
> tempted to try to debug it, but I don't immediately see why this doesn't
> work when the existing s_lock.h code does.  I think that the assembler
> error message is probably misleading: while it seems to be saying to
> s/r0/0/ in the addi, gcc consistently uses "rN" syntax for the second
> parameter elsewhere.  I do note that gcc never generates r0 as addi's
> second parameter in several files I checked through, so maybe what it
> means is "you need to use some other register"?  (Which would imply that
> the constraint for this asm argument is too loose.)

Thanks for testing.  That error boils down to "need to use some other
register".  The second operand of addi is one of the ppc instruction operands
that can hold a constant zero or a register number[1], so the proper
constraint is "b".  I've made it so and added a comment.  I should probably
update s_lock.h, too, in a later patch.  I don't know how it has
mostly-avoided this failure mode, but its choice of constraint could explain
https://postgr.es/m/flat/36E70B06-2C5C-11D8-A096-0005024EF27F%40ifrance.com
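
In other words, with the declarations as in the fetch_add sketch upthread, the
constant-addend arm now looks roughly like this (an illustrative fragment, not
the committed hunk):

    __asm__ __volatile__(
        "   sync                \n"
        "   lwarx   %1,0,%4     \n"
        "   addi    %0,%1,%3    \n" /* RA is %1; r0 there would mean literal 0 */
        "   stwcx.  %0,0,%4     \n"
        "   bne     $-12        \n"
        "   isync               \n"
        : "=&r"(_t), "=&b"(res), "+m"(ptr->value)  /* "b" = any GPR except r0 */
        : "i"(add_), "r"(&ptr->value)
        : "memory", "cc");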

> I'm also wondering why this isn't following s_lock.h's lead as to
> USE_PPC_LWARX_MUTEX_HINT and USE_PPC_LWSYNC.

Most or all of today's pg_atomic_compare_exchange_u32() usage does not have
the property that the mutex hint would signal.

pg_atomic_compare_exchange_u32() specifies "Full barrier semantics", which
lwsync does not provide.  We might want to relax the specification to make
lwsync acceptable, but that would be a separate, architecture-independent
project.  (generic-gcc.h:pg_atomic_compare_exchange_u32_impl() speculates
along those lines, writing "FIXME: we can probably use a lower consistency
model".)


[1] "Notice that addi and addis use the value 0, not the contents of GPR 0, if
RA=0." -- https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0

Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes:
> On Mon, Oct 07, 2019 at 03:06:35PM -0400, Tom Lane wrote:
>> This still fails on Apple's compilers. ...

> Thanks for testing.  That error boils down to "need to use some other
> register".  The second operand of addi is one of the ppc instruction operands
> that can hold a constant zero or a register number[1], so the proper
> constraint is "b".  I've made it so and added a comment.

Ah-hah.  This version does compile and pass check-world for me.

> I should probably
> update s_lock.h, too, in a later patch.  I don't know how it has
> mostly-avoided this failure mode, but its choice of constraint could explain
> https://postgr.es/m/flat/36E70B06-2C5C-11D8-A096-0005024EF27F%40ifrance.com

Indeed.  It's a bit astonishing that more people haven't hit that.
This should be back-patched.


Now that the patch passes mechanical checks, I took a closer look,
and there are some things I don't like:

* I still think that the added configure test is a waste of build cycles.
It'd be sufficient to test "#ifdef HAVE__BUILTIN_CONSTANT_P" where you
are testing HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P, because our previous
buildfarm go-round with this showed that all supported compilers
interpret "i" this way.

* I really dislike building the asm calls with macros as you've done
here.  The macros violate project style, and are not remotely general-
purpose, because they have hard-wired references to variables that are
not in their argument lists.  While that could be fixed with more
arguments, I don't think that the approach is readable or maintainable
--- it's impossible for example to understand the register constraints
without looking simultaneously at the calls and the macro definition.
And, as we've seen in this "b" issue, the interactions between the chosen
instruction types and the constraints are subtle enough to make me wonder
whether you won't need even more arguments to allow some of the other
constraints to be variable.  I think it'd be far better just to write out
the asm in-line and accept the not-very-large amount of code duplication
you'd get.

* src/tools/pginclude/headerscheck needs the same adjustment as you
made in cpluspluscheck.

            regards, tom lane



Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Noah Misch
Date:
On Wed, Oct 09, 2019 at 01:15:29PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Mon, Oct 07, 2019 at 03:06:35PM -0400, Tom Lane wrote:
> >> This still fails on Apple's compilers. ...
> 
> > Thanks for testing.  That error boils down to "need to use some other
> > register".  The second operand of addi is one of the ppc instruction operands
> > that can hold a constant zero or a register number[1], so the proper
> > constraint is "b".  I've made it so and added a comment.
> 
> Ah-hah.  This version does compile and pass check-world for me.
> 
> > I should probably
> > update s_lock.h, too, in a later patch.  I don't know how it has
> > mostly-avoided this failure mode, but its choice of constraint could explain
> > https://postgr.es/m/flat/36E70B06-2C5C-11D8-A096-0005024EF27F%40ifrance.com
> 
> Indeed.  It's a bit astonishing that more people haven't hit that.
> This should be back-patched.

I may as well do that first, so there's no time when s_lock.h disagrees with
arch-ppc.h about the constraint to use.  I'm attaching that patch, too.

> * I still think that the added configure test is a waste of build cycles.
> It'd be sufficient to test "#ifdef HAVE__BUILTIN_CONSTANT_P" where you
> are testing HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P, because our previous
> buildfarm go-round with this showed that all supported compilers
> interpret "i" this way.

xlc does not interpret "i" that way:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hornet&dt=2019-09-14%2016%3A42%3A32&stg=config

> * I really dislike building the asm calls with macros as you've done
> here.  The macros violate project style, and are not remotely general-
> purpose, because they have hard-wired references to variables that are
> not in their argument lists.  While that could be fixed with more
> arguments, I don't think that the approach is readable or maintainable
> --- it's impossible for example to understand the register constraints
> without looking simultaneously at the calls and the macro definition.
> And, as we've seen in this "b" issue, the interactions between the chosen
> instruction types and the constraints are subtle enough to make me wonder
> whether you won't need even more arguments to allow some of the other
> constraints to be variable.  I think it'd be far better just to write out
> the asm in-line and accept the not-very-large amount of code duplication
> you'd get.

For a macro local to one C file, I think readability is the relevant metric.
In particular, it would be wrong to add arguments to make these more like
header file macros.  I think the macros make the code somewhat more readable,
and you think they make the code less readable.  I have removed the macros.

> * src/tools/pginclude/headerscheck needs the same adjustment as you
> made in cpluspluscheck.

Done.

Attachments

Re: [HACKERS] Deadlock in XLogInsert at AIX

From:
Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes:
> On Wed, Oct 09, 2019 at 01:15:29PM -0400, Tom Lane wrote:
>> * I still think that the added configure test is a waste of build cycles.
>> It'd be sufficient to test "#ifdef HAVE__BUILTIN_CONSTANT_P" where you
>> are testing HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P, because our previous
>> buildfarm go-round with this showed that all supported compilers
>> interpret "i" this way.

> xlc does not interpret "i" that way:
> https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hornet&dt=2019-09-14%2016%3A42%3A32&stg=config

Hm, how'd I miss that while looking at the buildfarm results?  Anyway,
you're clearly right on this one --- objection withdrawn.

>> * I really dislike building the asm calls with macros as you've done
>> here.

> For a macro local to one C file, I think readability is the relevant metric.
> In particular, it would be wrong to add arguments to make these more like
> header file macros.  I think the macros make the code somewhat more readable,
> and you think they make the code less readable.  I have removed the macros.

Personally I definitely find this way more readable, though I agree
beauty is in the eye of the beholder.

I've checked that this patch set compiles and passes regression tests
on my old Apple machines, so it's good to go as far as I can see.

            regards, tom lane