Обсуждение: Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

Поиск
Список
Период
Сортировка

Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> Allow Pin/UnpinBuffer to operate in a lockfree manner.

This commit has broken buildfarm member gaur, and no doubt pademelon
will be equally unhappy once it catches up to HEAD.  The reason is that
you've caused localbuf.c to perform a whole bunch of atomic operations
on its buffer headers; and on machines that don't have native atomic
ops, there's a spinlock underlying those; and you did not bother to
ensure that appropriate SpinLockInit operations happen for local-buffer
headers.  (HPPA, possibly alone among supported platforms, does not
think that SpinLockInit is equivalent to memset-to-zeroes.)

While we could fix this by performing suitable SpinLockInit's on
local-buffer headers, I have to think that that's fundamentally the
wrong direction.  The *entire* *point* of having local buffers is
that they are not subject to concurrency overhead.  So IMO, sticking
atomic-ops calls into localbuf.c is broken on its face.
        regards, tom lane



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-11 23:24:36 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Allow Pin/UnpinBuffer to operate in a lockfree manner.
> 
> This commit has broken buildfarm member gaur, and no doubt pademelon
> will be equally unhappy once it catches up to HEAD.  The reason is that
> you've caused localbuf.c to perform a whole bunch of atomic operations
> on its buffer headers; and on machines that don't have native atomic
> ops, there's a spinlock underlying those; and you did not bother to
> ensure that appropriate SpinLockInit operations happen for local-buffer
> headers.  (HPPA, possibly alone among supported platforms, does not
> think that SpinLockInit is equivalent to memset-to-zeroes.)

That's obviously borked, and need to be fixed.

> While we could fix this by performing suitable SpinLockInit's on
> local-buffer headers, I have to think that that's fundamentally the
> wrong direction.  The *entire* *point* of having local buffers is
> that they are not subject to concurrency overhead.  So IMO, sticking
> atomic-ops calls into localbuf.c is broken on its face.

Note that localbuf.c tries to be careful to only use
pg_atomic_read_u32/pg_atomic_write_u32 - which don't have a concurrency
overhead as they don't utilize atomic ops.

The issue is likely that either Alexander or I somehow made
MarkLocalBufferDirty() use pg_atomic_fetch_or_u32(), instead of the
proper pg_atomic_read_u32()/pg_atomic_write_u32().

Greetings,

Andres Freund



Re: Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> The issue is likely that either Alexander or I somehow made
> MarkLocalBufferDirty() use pg_atomic_fetch_or_u32(), instead of the
> proper pg_atomic_read_u32()/pg_atomic_write_u32().

The stack trace I'm seeing is

#5  0x7279cc in elog_finish (elevel=6,    fmt=0x40057cf8 '\177' <repeats 200 times>...) at elog.c:1378
#6  0x5cecd8 in s_lock_stuck (p=0x402995b8, file=0x21bae0 "s_lock.c", line=92)   at s_lock.c:81
#7  0x5cedd4 in perform_spin_delay (status=0x7b03b8c8) at s_lock.c:130
#8  0x5ced40 in s_lock (lock=0x6, file=0x20 <Address 0x20 out of bounds>,    line=6) at s_lock.c:96
#9  0x53a4b0 in pg_atomic_compare_exchange_u32_impl (ptr=0x402995b8,    expected=0x5c, newval=58982400) at
atomics.c:122
#10 0x5a280c in MarkLocalBufferDirty (buffer=6)   at ../../../../src/include/port/atomics/generic.h:224
#11 0x59bba0 in MarkBufferDirty (buffer=6) at bufmgr.c:1489
#12 0x2c9cd0 in heap_multi_insert (relation=0x401c41d0, tuples=0x40201888,    ntuples=561, cid=0, options=0,
bistate=0x40153128)at heapam.c:2760
 

        regards, tom lane



Re: Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-11 23:41:10 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > The issue is likely that either Alexander or I somehow made
> > MarkLocalBufferDirty() use pg_atomic_fetch_or_u32(), instead of the
> > proper pg_atomic_read_u32()/pg_atomic_write_u32().
>
> The stack trace I'm seeing is
>
> #5  0x7279cc in elog_finish (elevel=6,
>     fmt=0x40057cf8 '\177' <repeats 200 times>...) at elog.c:1378
> #6  0x5cecd8 in s_lock_stuck (p=0x402995b8, file=0x21bae0 "s_lock.c", line=92)
>     at s_lock.c:81
> #7  0x5cedd4 in perform_spin_delay (status=0x7b03b8c8) at s_lock.c:130
> #8  0x5ced40 in s_lock (lock=0x6, file=0x20 <Address 0x20 out of bounds>,
>     line=6) at s_lock.c:96
> #9  0x53a4b0 in pg_atomic_compare_exchange_u32_impl (ptr=0x402995b8,
>     expected=0x5c, newval=58982400) at atomics.c:122
> #10 0x5a280c in MarkLocalBufferDirty (buffer=6)
>     at ../../../../src/include/port/atomics/generic.h:224

Ok, so the theory above fits.

Will fix (both initialization and use of pg_atomic_fetch_or_u32), and
expand the documentation on why only atomic read/write are supposed to
be used.

Thanks,

Andres



Re: Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
>>> The issue is likely that either Alexander or I somehow made
>>> MarkLocalBufferDirty() use pg_atomic_fetch_or_u32(), instead of the
>>> proper pg_atomic_read_u32()/pg_atomic_write_u32().

> Ok, so the theory above fits.

Yah, especially in view of localbuf.c:297 ;-)

> Will fix (both initialization and use of pg_atomic_fetch_or_u32), and
> expand the documentation on why only atomic read/write are supposed to
> be used.

FWIW, I'd vote against adding a SpinLockInit there.  What it would mostly
do is prevent noticing future mistakes of the same ilk.  It would be
better no doubt if we didn't have to rely on a nearly-dead platform
to detect this; but having such detection of a performance bug is better
than having no detection.
        regards, tom lane



Re: Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-11 23:59:21 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Will fix (both initialization and use of pg_atomic_fetch_or_u32), and
> > expand the documentation on why only atomic read/write are supposed to
> > be used.
> 
> FWIW, I'd vote against adding a SpinLockInit there.

Well, it'd not be a SpinLockInit, but a pg_atomic_init_u32(), but ...

> What it would mostly
> do is prevent noticing future mistakes of the same ilk.  It would be
> better no doubt if we didn't have to rely on a nearly-dead platform
> to detect this; but having such detection of a performance bug is better
> than having no detection.

Ok, works for me as well. I guess it'd be useful to add a "modern"
animal that disables spinlocks & atomics...

- Andres



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Tom Lane
Дата:
I wrote:
> This commit has broken buildfarm member gaur, and no doubt pademelon
> will be equally unhappy once it catches up to HEAD.

Spoke too soon, actually: pademelon did not get as far as initdb.

cc: "bufmgr.c", line 4032: error 1521: Incorrect initialization.
cc: "bufmgr.c", line 4058: error 1521: Incorrect initialization.

Apparently, assigning the result of init_spin_delay() is not as portable
as you thought.  Maybe convert that into
init_spin_delay(SpinDelayStatus *target, whatever-the-other-arg-is) ?
        regards, tom lane



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-12 00:32:13 -0400, Tom Lane wrote:
> I wrote:
> > This commit has broken buildfarm member gaur, and no doubt pademelon
> > will be equally unhappy once it catches up to HEAD.
> 
> Spoke too soon, actually: pademelon did not get as far as initdb.
> 
> cc: "bufmgr.c", line 4032: error 1521: Incorrect initialization.
> cc: "bufmgr.c", line 4058: error 1521: Incorrect initialization.
> 
> Apparently, assigning the result of init_spin_delay() is not as portable
> as you thought.

Hm. I'm not sure why not though? Is it because delayStatus isn't static
or global scope?


> Maybe convert that into init_spin_delay(SpinDelayStatus *target,
> whatever-the-other-arg-is) ?

Too bad, my compiler generates a bit nicer code for the current version.


Greetings,

Andres Freund



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2016-04-12 00:32:13 -0400, Tom Lane wrote:
>> Apparently, assigning the result of init_spin_delay() is not as portable
>> as you thought.

> Hm. I'm not sure why not though? Is it because delayStatus isn't static
> or global scope?

It looks like that compiler adheres to the C89 restriction that an
initializer for an array or struct must contain only link-time-constant
expressions, even if the target object is of dynamic scope.
The macro works with a link-time-constant pointer argument, but not
with one that must be evaluated at runtime.
        regards, tom lane



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Tom Lane
Дата:
I wrote:
> It looks like that compiler adheres to the C89 restriction that an
> initializer for an array or struct must contain only link-time-constant
> expressions, even if the target object is of dynamic scope.
> The macro works with a link-time-constant pointer argument, but not
> with one that must be evaluated at runtime.

It strikes me that that means you could stick with this initialization
method if you made the macro argument be a literal constant string name,
like "buffer spinlock", and printed that rather than the address in
s_lock_stuck.  This would be different from what we do now, but not
necessarily any less useful.  I'm pretty dubious of the usefulness of
the addresses you're printing right now, because most of them are indeed
not the spinlock itself.
        regards, tom lane



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Piotr Stefaniak
Дата:
On 2016-04-12 07:00, Andres Freund wrote:
> On 2016-04-12 00:32:13 -0400, Tom Lane wrote:
>> I wrote:
>>> This commit has broken buildfarm member gaur, and no doubt pademelon
>>> will be equally unhappy once it catches up to HEAD.
>>
>> Spoke too soon, actually: pademelon did not get as far as initdb.
>>
>> cc: "bufmgr.c", line 4032: error 1521: Incorrect initialization.
>> cc: "bufmgr.c", line 4058: error 1521: Incorrect initialization.
>>
>> Apparently, assigning the result of init_spin_delay() is not as portable
>> as you thought.
>
> Hm. I'm not sure why not though? Is it because delayStatus isn't static
> or global scope?

It's because C89 requires initializers for aggregate and union types to 
be constant expressions (3.5.7p3):
> All the expressions in an initializer for an object that has static
> storage duration or in an initializer list for an object that has
> aggregate or union type shall be constant expressions.

C99 removes this constraint (6.7.8p4):
> All the expressions in an initializer for an object that has static storage duration shall be
> constant expressions or string literals.





Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-12 11:52:01 -0400, Tom Lane wrote:
> I wrote:
> > It looks like that compiler adheres to the C89 restriction that an
> > initializer for an array or struct must contain only link-time-constant
> > expressions, even if the target object is of dynamic scope.
> > The macro works with a link-time-constant pointer argument, but not
> > with one that must be evaluated at runtime.
> 
> It strikes me that that means you could stick with this initialization
> method if you made the macro argument be a literal constant string name,
> like "buffer spinlock", and printed that rather than the address in
> s_lock_stuck.  This would be different from what we do now, but not
> necessarily any less useful.

I'm not sure anybody really benefits from those addresses; I guess the
idea was that they'd allow you to figure out which exact spinlock got
stuck; file + line doesn't necessarily help there. But it doesn't seem
super useful, ASLR makes the addesses unpredictable, so you need a core
file anyway - in which case you can just walk the stack.

So I think I'm on board with replacing the argument; although I'm
wondering if we shouldn't just remove it entirely, rather than replacing
it with a string that's likely just going to duplicate the file/line
information.

Greetings,

Andres Freund



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2016-04-12 11:52:01 -0400, Tom Lane wrote:
>> It strikes me that that means you could stick with this initialization
>> method if you made the macro argument be a literal constant string name,
>> like "buffer spinlock", and printed that rather than the address in
>> s_lock_stuck.  This would be different from what we do now, but not
>> necessarily any less useful.

> I'm not sure anybody really benefits from those addresses; I guess the
> idea was that they'd allow you to figure out which exact spinlock got
> stuck; file + line doesn't necessarily help there. But it doesn't seem
> super useful, ASLR makes the addesses unpredictable, so you need a core
> file anyway - in which case you can just walk the stack.

True.

> So I think I'm on board with replacing the argument; although I'm
> wondering if we shouldn't just remove it entirely, rather than replacing
> it with a string that's likely just going to duplicate the file/line
> information.

Good point, it would be absolutely duplicative.  What I'd suggest,
actually, is that we convert this to the same info as what elog
provides (file+line+function name).  The line number is often hard
to decipher if you're not sure exactly what PG version you're dealing
with, so I think having the function name would be a sufficient advance
to rebut any claim that we'd gone backwards.
        regards, tom lane



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-13 14:16:15 -0400, Tom Lane wrote:
> Good point, it would be absolutely duplicative.  What I'd suggest,
> actually, is that we convert this to the same info as what elog
> provides (file+line+function name).

Heh, I was wondering the same aftering sending the last email.  Will do
that then.

Greetings,

Andres Freund



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-13 11:20:19 -0700, Andres Freund wrote:
> On 2016-04-13 14:16:15 -0400, Tom Lane wrote:
> > Good point, it would be absolutely duplicative.  What I'd suggest,
> > actually, is that we convert this to the same info as what elog
> > provides (file+line+function name).
> 
> Heh, I was wondering the same aftering sending the last email.  Will do
> that then.

Pushed. Let's see what pademelon says...



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2016-04-13 11:20:19 -0700, Andres Freund wrote:
>> Heh, I was wondering the same aftering sending the last email.  Will do
>> that then.

> Pushed. Let's see what pademelon says...

I lit off a run, but it'll be a few hours till we have results...
        regards, tom lane



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-13 20:09:46 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2016-04-13 11:20:19 -0700, Andres Freund wrote:
> >> Heh, I was wondering the same aftering sending the last email.  Will do
> >> that then.
> 
> > Pushed. Let's see what pademelon says...
> 
> I lit off a run, but it'll be a few hours till we have results...

In hindsight obviously, this isn't sufficient because s_lock reports the
caller's file/line/function, not __FILE__ etc. So I guess we're left
moving initialization to an inline function - will do so tomorrow.

I'm also putting up an animal with clang that uses
CFLAGS='-std=c89 -Wc99-extensions -Werror=c99-extensions'
which actually catches this.

I'd previously tested with gcc and "-Wc90-c99-compat -Werror=c90-c99-compat
-Wno-long-long -Wno-error=long-long -std=c90", but that doesn't catch
that error.

(both find a couple trailing commas in enums, will fix)


- Andres



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-13 23:31:21 -0700, Andres Freund wrote:
> I'm also putting up an animal with clang that uses
> CFLAGS='-std=c89 -Wc99-extensions -Werror=c99-extensions'
> which actually catches this.

Hm. Doing so I found the following in 9.3:

/home/andres/src/postgresql-9.3/src/bin/pg_dump/parallel.c:561:23: error: initializer for aggregate is not a
compile-timeconstant [-Werror,-Wc99-extensions]                       int                     pipefd[2] =
{pipeMW[PIPE_READ],pipeWM[PIPE_WRITE]};                                                            ^~~~~~~~~~~~~~~~~
 

which is, afaics, the same class of issue we're hitting on master right
now. I apparently fixed that, via Robert, back in 59202fae0. I'd planned
to put up that animal (mylodon) for all branches, but given that that's
been unfixed for years I'm not sure. What do you think?

- Andres



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2016-04-13 23:31:21 -0700, Andres Freund wrote:
>> I'm also putting up an animal with clang that uses
>> CFLAGS='-std=c89 -Wc99-extensions -Werror=c99-extensions'
>> which actually catches this.

> Hm. Doing so I found the following in 9.3:

> /home/andres/src/postgresql-9.3/src/bin/pg_dump/parallel.c:561:23: error: initializer for aggregate is not a
compile-timeconstant [-Werror,-Wc99-extensions]
 
>                         int                     pipefd[2] = {pipeMW[PIPE_READ], pipeWM[PIPE_WRITE]};
>                                                              ^~~~~~~~~~~~~~~~~

> which is, afaics, the same class of issue we're hitting on master right
> now. I apparently fixed that, via Robert, back in 59202fae0. I'd planned
> to put up that animal (mylodon) for all branches, but given that that's
> been unfixed for years I'm not sure. What do you think?

Huh.  I just tried pademelon's compiler on the 9.3 branch, and sure
enough it spits up on that:

cc: "parallel.c", line 561: error 1521: Incorrect initialization.
cc: "parallel.c", line 561: error 1521: Incorrect initialization.
make: *** [parallel.o] Error 1

I am not sure how come I missed seeing that at the time, but I certainly
would have complained about it if I had chanced to try that compiler
later in 9.3 development.  I stopped using that machine actively in
mid-2013, though, so maybe I just failed to try that compiler for a
period of a few months.  IIRC, when I got around to setting up pademelon
as a buildfarm animal, I didn't have it building anything older than 9.4,
so I missed seeing it on that end too.

I'd vote for back-patching 59202fae0 and enforcing c89 compatibility
all along the line.
        regards, tom lane



Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

От
Andres Freund
Дата:
On 2016-04-14 22:46:01 -0400, Tom Lane wrote:
> > Hm. Doing so I found the following in 9.3:
>
> > /home/andres/src/postgresql-9.3/src/bin/pg_dump/parallel.c:561:23: error: initializer for aggregate is not a
compile-timeconstant [-Werror,-Wc99-extensions]
 
> >                         int                     pipefd[2] = {pipeMW[PIPE_READ], pipeWM[PIPE_WRITE]};
> >                                                              ^~~~~~~~~~~~~~~~~
>
> > which is, afaics, the same class of issue we're hitting on master right
> > now. I apparently fixed that, via Robert, back in 59202fae0. I'd planned
> > to put up that animal (mylodon) for all branches, but given that that's
> > been unfixed for years I'm not sure. What do you think?
>
> Huh.  I just tried pademelon's compiler on the 9.3 branch, and sure
> enough it spits up on that:
>
> cc: "parallel.c", line 561: error 1521: Incorrect initialization.
> cc: "parallel.c", line 561: error 1521: Incorrect initialization.
> make: *** [parallel.o] Error 1
>
> I am not sure how come I missed seeing that at the time, but I certainly
> would have complained about it if I had chanced to try that compiler
> later in 9.3 development.  I stopped using that machine actively in
> mid-2013, though, so maybe I just failed to try that compiler for a
> period of a few months.  IIRC, when I got around to setting up pademelon
> as a buildfarm animal, I didn't have it building anything older than 9.4,
> so I missed seeing it on that end too.
>
> I'd vote for back-patching 59202fae0 and enforcing c89 compatibility
> all along the line.

Done.  Will be a bit till all branches have reported (that VM's getting
busy, it now runs four animals, one of them skink which is kinda
expensive CPU wise). But I checked before that I can do a basic compile
with the respective flags all the way to 9.1 (after pushing the enum
fixes).

Andres