RE:[HACKERS] Deadlock in XLogInsert at AIX

From REIX, Tony
Subject RE:[HACKERS] Deadlock in XLogInsert at AIX
Date
Msg-id B37989F2852398498001550C29155BE5184AC1F0@FRCRPVV9EX3MSX.ww931.my-it-solutions.net
In reply to Re: [HACKERS] Deadlock in XLogInsert at AIX  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [HACKERS] Deadlock in XLogInsert at AIX
Re: [HACKERS] Deadlock in XLogInsert at AIX
List pgsql-hackers
Hi Michael,

My team and my company (ATOS/Bull) are involved in improving the quality of PostgreSQL on AIX.

We have AIX 6.1, 7.1, and 7.2 Power8 systems, with several logical/physical processors.
And I plan to have a more powerful (more processors) machine for running PostgreSQL stress tests.
A DB-expert colleague has started to write some new, not-too-complex stress tests that we'd like to submit to the
PostgreSQL project later.
For now, using the latest versions of XLC 12 (12.1.0.19) and 13 (13.1.3.4 with a patch), we have only one remaining
random failure on AIX 6.1 and 7.2 (in the src/bin/pgbench/t/001_pgbench.pl test), for PostgreSQL 9.6.6 and 10.1. And,
on AIX 7.1, we have one more remaining failure that may be due to some other dependent software. Investigating.
XLC 13.1.3.4 shows an issue with -O2 and I have a work-around that fixes it in ./src/backend/parser/gram.c. We have
opened a PMR (defect) against XLC.
Note that our tests are now executed without the PG_FORCE_DISABLE_INLINE "inline" trick in src/include/port/aix.h that
suppresses the inlining of routines on AIX. I think that older versions of XLC have shown issues that have now
disappeared (or, at least, many of them).
I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1 and, using the timings reported by the PostgreSQL
tests, XLC seems to provide at least 8% more speed. We also plan to run professional performance tests in order to
compare PostgreSQL 10.1 on AIX vs Linux/Power. I saw some 2017 performance slides, made with older versions of
PostgreSQL and XLC, that show bad PostgreSQL performance on AIX vs Linux/Power, and I cannot believe it. We plan to
investigate this.

Though I have very little PostgreSQL expertise (I'm also currently porting GCC Go to AIX), we can help, at least by
compiling/testing/investigating/stressing in a different AIX environment than the AIX ones (32/64bit, XLC/GCC) you have
in your BuildFarm.
Let me know how we can help.

Regards,

Cordialement,

Tony Reix

ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net

________________________________________
From: Michael Paquier [michael.paquier@gmail.com]
Sent: Tuesday, January 16, 2018 08:12
To: Noah Misch
Cc: Heikki Linnakangas; Konstantin Knizhnik; PostgreSQL Hackers; Bernd Helmle
Subject: Re: [HACKERS] Deadlock in XLogInsert at AIX

On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
> On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
>> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
>>  static inline uint32
>>  pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
>>  {
>> +    uint32          ret;
>> +
>>      /*
>> -     * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
>> -     * providing sequential consistency.  This is undocumented.
>> +     * Use __sync() before and __isync() after, like in compare-exchange
>> +     * above.
>>       */
>> -    return __fetch_and_add((volatile int *)&ptr->value, add_);
>> +    __sync();
>> +
>> +    ret = __fetch_and_add((volatile int *)&ptr->value, add_);
>> +
>> +    __isync();
>> +
>> +    return ret;
>>  }
>
> Since this emits double syncs with older xlc, I recommend instead replacing
> the whole thing with inline asm.  As I opined in the last message of the
> thread you linked above, the intrinsics provide little value as abstractions
> if one checks the generated code to deduce how to use them.  Now that the
> generated code is xlc-version-dependent, the port is better off with
> compiler-independent asm like we have for ppc in s_lock.h.

Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
past versions? I think that it would make the code more understandable
than just listing directly the instructions. As there have been other
bug reports from Tony Reix who has been working on AIX with XLC 13.1 and
that this thread got lost in the wild, I have added an entry in the next
CF:
https://commitfest.postgresql.org/17/1484/
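
For illustration only, here is a minimal sketch of what a version-conditional fetch-add along those lines might look like. It is not the patch under discussion. HYPOTHETICAL_FETCH_ADD_NEEDS_FENCES is an invented macro standing in for whatever compiler-version test would be chosen (the thread does not say which XLC versions need the explicit barriers), sketch_fetch_add_u32 is an invented name, and the use of __IBMC__ to detect XLC is an assumption. On other compilers the sketch falls back to the GCC/Clang __atomic_fetch_add builtin so that it can be compiled and tried anywhere.

```c
#include <stdint.h>

/*
 * Sketch only.  HYPOTHETICAL_FETCH_ADD_NEEDS_FENCES is an invented macro:
 * it stands in for a compiler-version check whose threshold is not given
 * in this thread.  sketch_fetch_add_u32 is likewise an invented name.
 */
static inline uint32_t
sketch_fetch_add_u32(volatile uint32_t *ptr, int32_t add_)
{
#if defined(__IBMC__) && defined(HYPOTHETICAL_FETCH_ADD_NEEDS_FENCES)
    /* XLC whose __fetch_and_add() emits no barriers: add them by hand. */
    uint32_t    ret;

    __sync();
    ret = (uint32_t) __fetch_and_add((volatile int *) ptr, add_);
    __isync();
    return ret;
#elif defined(__IBMC__)
    /* XLC whose __fetch_and_add() already emits sync/isync itself. */
    return (uint32_t) __fetch_and_add((volatile int *) ptr, add_);
#else
    /* Portable stand-in (GCC/Clang builtin), sequentially consistent. */
    return __atomic_fetch_add(ptr, (uint32_t) add_, __ATOMIC_SEQ_CST);
#endif
}
```

In every branch the function returns the value held before the addition. The trade-off Noah raises still applies: each XLC branch depends on knowing what a given compiler version generates, which inline asm would avoid.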

As Heikki is not around these days, Noah, could you provide a new
version of the patch? This bug has been around for some time now, it
would be nice to move on.. I think I could have written patches myself,
but I don't have an AIX machine at hand. Of course not with XLC 13.1.
--
Michael

