Обсуждение: valgrind versus pg_atomic_init()

Поиск
Список
Период
Сортировка

valgrind versus pg_atomic_init()

От
Tom Lane
Дата:
I experimented with running "make check" on ARM64 under a reasonably
bleeding-edge valgrind (3.16.0).  One thing I ran into is that
regress.c's test_atomic_ops fails; valgrind shows the stack trace

   fun:__aarch64_cas8_acq_rel
   fun:pg_atomic_compare_exchange_u64_impl
   fun:pg_atomic_exchange_u64_impl
   fun:pg_atomic_write_u64_impl
   fun:pg_atomic_init_u64_impl
   fun:pg_atomic_init_u64
   fun:test_atomic_uint64
   fun:test_atomic_ops
   fun:ExecInterpExpr

Now, this is basically the same thing as is already memorialized in
src/tools/valgrind.supp:

# Atomic writes to 64bit atomic vars uses compare/exchange to
# guarantee atomic writes of 64bit variables. pg_atomic_write is used
# during initialization of the atomic variable; that leads to an
# initial read of the old, undefined, memory value. But that's just to
# make sure the swap works correctly.
{
    uninitialized_atomic_init_u64
    Memcheck:Cond
    fun:pg_atomic_exchange_u64_impl
    fun:pg_atomic_write_u64_impl
    fun:pg_atomic_init_u64_impl
}

so my first thought was that we just needed an architecture-specific
variant of that.  But on thinking more about this, it seems like
generic.h's version of pg_atomic_init_u64_impl is just fundamentally
misguided.  Why isn't it simply assigning the value with an ordinary
unlocked write?  By definition, we had better not be using this function
in any circumstance where there might be conflicting accesses, so I don't
see why we should need to tolerate a valgrind exception here.  Moreover,
if a simple assignment *isn't* good enough, then surely the spinlock
version in atomics.c is 100% broken.

            regards, tom lane



Re: valgrind versus pg_atomic_init()

От
Andres Freund
Дата:
Hi,

On 2020-06-07 00:23:35 -0400, Tom Lane wrote:
> I experimented with running "make check" on ARM64 under a reasonably
> bleeding-edge valgrind (3.16.0).  One thing I ran into is that
> regress.c's test_atomic_ops fails; valgrind shows the stack trace
> 
>    fun:__aarch64_cas8_acq_rel
>    fun:pg_atomic_compare_exchange_u64_impl
>    fun:pg_atomic_exchange_u64_impl
>    fun:pg_atomic_write_u64_impl
>    fun:pg_atomic_init_u64_impl
>    fun:pg_atomic_init_u64
>    fun:test_atomic_uint64
>    fun:test_atomic_ops
>    fun:ExecInterpExpr
> 
> Now, this is basically the same thing as is already memorialized in
> src/tools/valgrind.supp:
> 
> # Atomic writes to 64bit atomic vars uses compare/exchange to
> # guarantee atomic writes of 64bit variables. pg_atomic_write is used
> # during initialization of the atomic variable; that leads to an
> # initial read of the old, undefined, memory value. But that's just to
> # make sure the swap works correctly.
> {
>     uninitialized_atomic_init_u64
>     Memcheck:Cond
>     fun:pg_atomic_exchange_u64_impl
>     fun:pg_atomic_write_u64_impl
>     fun:pg_atomic_init_u64_impl
> }
> 
> so my first thought was that we just needed an architecture-specific
> variant of that.  But on thinking more about this, it seems like
> generic.h's version of pg_atomic_init_u64_impl is just fundamentally
> misguided.  Why isn't it simply assigning the value with an ordinary
> unlocked write?  By definition, we had better not be using this function
> in any circumstance where there might be conflicting accesses, so I don't
> see why we should need to tolerate a valgrind exception here.  Moreover,
> if a simple assignment *isn't* good enough, then surely the spinlock
> version in atomics.c is 100% broken.

Yea, it could just do that. It seemed slightly easier/clearer, back when
I wrote it, to just use pg_atomic_write* for the initialization, but
this seems enough of a reason to stop doing so. Will change it in all
branches, unless somebody sees a reason to not do so?

Greetings,

Andres Freund



Re: valgrind versus pg_atomic_init()

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2020-06-07 00:23:35 -0400, Tom Lane wrote:
>> so my first thought was that we just needed an architecture-specific
>> variant of that.  But on thinking more about this, it seems like
>> generic.h's version of pg_atomic_init_u64_impl is just fundamentally
>> misguided.  Why isn't it simply assigning the value with an ordinary
>> unlocked write?  By definition, we had better not be using this function
>> in any circumstance where there might be conflicting accesses, so I don't
>> see why we should need to tolerate a valgrind exception here.  Moreover,
>> if a simple assignment *isn't* good enough, then surely the spinlock
>> version in atomics.c is 100% broken.

> Yea, it could just do that. It seemed slightly easier/clearer, back when
> I wrote it, to just use pg_atomic_write* for the initialization, but
> this seems enough of a reason to stop doing so. Will change it in all
> branches, unless somebody sees a reason to not do so?

Works for me.

            regards, tom lane



Re: valgrind versus pg_atomic_init()

От
Andres Freund
Дата:
Hi,

On 2020-06-08 18:21:06 -0400, Tom Lane wrote:
> > Yea, it could just do that. It seemed slightly easier/clearer, back when
> > I wrote it, to just use pg_atomic_write* for the initialization, but
> > this seems enough of a reason to stop doing so. Will change it in all
> > branches, unless somebody sees a reason to not do so?
>
> Works for me.

And done.

- Andres



Re: valgrind versus pg_atomic_init()

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> And done.

LGTM, thanks!

            regards, tom lane



Re: valgrind versus pg_atomic_init()

От
Noah Misch
Дата:
On Sun, Jun 07, 2020 at 12:23:35AM -0400, Tom Lane wrote:
> I experimented with running "make check" on ARM64 under a reasonably
> bleeding-edge valgrind (3.16.0).  One thing I ran into is that
> regress.c's test_atomic_ops fails; valgrind shows the stack trace
> 
>    fun:__aarch64_cas8_acq_rel
>    fun:pg_atomic_compare_exchange_u64_impl
>    fun:pg_atomic_exchange_u64_impl
>    fun:pg_atomic_write_u64_impl
>    fun:pg_atomic_init_u64_impl
>    fun:pg_atomic_init_u64
>    fun:test_atomic_uint64
>    fun:test_atomic_ops
>    fun:ExecInterpExpr
> 
> Now, this is basically the same thing as is already memorialized in
> src/tools/valgrind.supp:
> 
> # Atomic writes to 64bit atomic vars uses compare/exchange to
> # guarantee atomic writes of 64bit variables. pg_atomic_write is used
> # during initialization of the atomic variable; that leads to an
> # initial read of the old, undefined, memory value. But that's just to
> # make sure the swap works correctly.
> {
>     uninitialized_atomic_init_u64
>     Memcheck:Cond
>     fun:pg_atomic_exchange_u64_impl
>     fun:pg_atomic_write_u64_impl
>     fun:pg_atomic_init_u64_impl
> }
> 
> so my first thought was that we just needed an architecture-specific
> variant of that.  But on thinking more about this, it seems like
> generic.h's version of pg_atomic_init_u64_impl is just fundamentally
> misguided.  Why isn't it simply assigning the value with an ordinary
> unlocked write?  By definition, we had better not be using this function
> in any circumstance where there might be conflicting accesses

Does something guarantee the write will be globally-visible by the time the
first concurrent accessor shows up?  (If not, one could (a) do an unlocked
ptr->value=0, then the atomic write, or (b) revert and improve the
suppression.)  I don't doubt it's fine for the ways PostgreSQL uses atomics
today, which generally initialize an atomic before the concurrent-accessor
processes even exist.

> , so I don't
> see why we should need to tolerate a valgrind exception here.  Moreover,
> if a simple assignment *isn't* good enough, then surely the spinlock
> version in atomics.c is 100% broken.

Are you saying it would imply a bug in atomics.c pg_atomic_init_u64_impl(),
pg_atomic_compare_exchange_u64_impl(), or pg_atomic_fetch_add_u64_impl()?  Can
you explain that more?  If you were referring to unlocked "*(lock) = 0", that
is different since it's safe to have a delay in propagation of the change from
locked state to unlocked state.



Re: valgrind versus pg_atomic_init()

От
Tom Lane
Дата:
Noah Misch <noah@leadboat.com> writes:
> On Sun, Jun 07, 2020 at 12:23:35AM -0400, Tom Lane wrote:
>> ...  But on thinking more about this, it seems like
>> generic.h's version of pg_atomic_init_u64_impl is just fundamentally
>> misguided.  Why isn't it simply assigning the value with an ordinary
>> unlocked write?  By definition, we had better not be using this function
>> in any circumstance where there might be conflicting accesses

> Does something guarantee the write will be globally-visible by the time the
> first concurrent accessor shows up?  (If not, one could (a) do an unlocked
> ptr->value=0, then the atomic write, or (b) revert and improve the
> suppression.)  I don't doubt it's fine for the ways PostgreSQL uses atomics
> today, which generally initialize an atomic before the concurrent-accessor
> processes even exist.

Perhaps it'd be worth putting a memory barrier at the end of the _init
function(s)?  As you say, this is hypothetical right now, but that'd be
a cheap improvement.

>> if a simple assignment *isn't* good enough, then surely the spinlock
>> version in atomics.c is 100% broken.

> Are you saying it would imply a bug in atomics.c pg_atomic_init_u64_impl(),
> pg_atomic_compare_exchange_u64_impl(), or pg_atomic_fetch_add_u64_impl()?  Can
> you explain that more?

My point was that doing SpinLockInit while somebody else is already trying
to acquire or release the spinlock is not going to work out well.  So
there has to be a really clear boundary between "initialization" mode
and "use" mode; which is more or less the same point you make above.

In practice, if that line is so fine that we need a memory sync operation
to enforce it, things are probably broken anyhow.

            regards, tom lane



Re: valgrind versus pg_atomic_init()

От
Andres Freund
Дата:
Hi,

On 2020-06-14 18:55:27 -0700, Noah Misch wrote:
> Does something guarantee the write will be globally-visible by the time the
> first concurrent accessor shows up?

The function comments say:

 *
 * Has to be done before any concurrent usage..
 *
 * No barrier semantics.


> (If not, one could (a) do an unlocked ptr->value=0, then the atomic
> write, or (b) revert and improve the suppression.)  I don't doubt it's
> fine for the ways PostgreSQL uses atomics today, which generally
> initialize an atomic before the concurrent-accessor processes even
> exist.

I think it's unlikely that there are cases where you could safely
initialize the atomic without needing some form of synchronization
before it can be used. If a barrier were needed, what'd guarantee the
concurrent access happened after the initialization in the first place?

Greetings,

Andres Freund



Re: valgrind versus pg_atomic_init()

От
Andres Freund
Дата:
Hi,

On 2020-06-14 22:30:25 -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Sun, Jun 07, 2020 at 12:23:35AM -0400, Tom Lane wrote:
> >> ...  But on thinking more about this, it seems like
> >> generic.h's version of pg_atomic_init_u64_impl is just fundamentally
> >> misguided.  Why isn't it simply assigning the value with an ordinary
> >> unlocked write?  By definition, we had better not be using this function
> >> in any circumstance where there might be conflicting accesses
> 
> > Does something guarantee the write will be globally-visible by the time the
> > first concurrent accessor shows up?  (If not, one could (a) do an unlocked
> > ptr->value=0, then the atomic write, or (b) revert and improve the
> > suppression.)  I don't doubt it's fine for the ways PostgreSQL uses atomics
> > today, which generally initialize an atomic before the concurrent-accessor
> > processes even exist.
> 
> Perhaps it'd be worth putting a memory barrier at the end of the _init
> function(s)?  As you say, this is hypothetical right now, but that'd be
> a cheap improvement.

I don't think it'd be that cheap for some cases. There's an atomic for
every shared buffer, making their initialization full memory barriers
would likely be noticable for larger shared_buffers values.

As you say:

> In practice, if that line is so fine that we need a memory sync operation
> to enforce it, things are probably broken anyhow.

Greetings,

Andres Freund



Re: valgrind versus pg_atomic_init()

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2020-06-14 22:30:25 -0400, Tom Lane wrote:
>> Perhaps it'd be worth putting a memory barrier at the end of the _init
>> function(s)?  As you say, this is hypothetical right now, but that'd be
>> a cheap improvement.

> I don't think it'd be that cheap for some cases. There's an atomic for
> every shared buffer, making their initialization full memory barriers
> would likely be noticable for larger shared_buffers values.

Fair point --- if we did need to do something to make this safer, doing it
at the level of individual atomic values would be the wrong thing anyway.

            regards, tom lane



Re: valgrind versus pg_atomic_init()

От
Noah Misch
Дата:
On Sun, Jun 14, 2020 at 09:16:20PM -0700, Andres Freund wrote:
> On 2020-06-14 18:55:27 -0700, Noah Misch wrote:
> > Does something guarantee the write will be globally-visible by the time the
> > first concurrent accessor shows up?
> 
> The function comments say:
> 
>  *
>  * Has to be done before any concurrent usage..
>  *
>  * No barrier semantics.
> 
> 
> > (If not, one could (a) do an unlocked ptr->value=0, then the atomic
> > write, or (b) revert and improve the suppression.)  I don't doubt it's
> > fine for the ways PostgreSQL uses atomics today, which generally
> > initialize an atomic before the concurrent-accessor processes even
> > exist.
> 
> I think it's unlikely that there are cases where you could safely
> initialize the atomic without needing some form of synchronization
> before it can be used. If a barrier were needed, what'd guarantee the
> concurrent access happened after the initialization in the first place?

Suppose the initializing process does:

  pg_atomic_init_u64(&somestruct->atomic, 123);
  somestruct->atomic_ready = true;

In released versions, any process observing atomic_ready==true will observe
the results of the pg_atomic_init_u64().  After the commit from this thread,
that's no longer assured.  Having said that, I agree with a special case of
Tom's assertion about spinlocks, namely that this has same problem:

  /* somestruct->lock already happens to contain value 0 */
  SpinLockInit(&somestruct->lock);
  somestruct->lock_ready = true;



Re: valgrind versus pg_atomic_init()

От
Andres Freund
Дата:
Hi,

On June 16, 2020 8:24:29 PM PDT, Noah Misch <noah@leadboat.com> wrote:
>On Sun, Jun 14, 2020 at 09:16:20PM -0700, Andres Freund wrote:
>> On 2020-06-14 18:55:27 -0700, Noah Misch wrote:
>> > Does something guarantee the write will be globally-visible by the
>time the
>> > first concurrent accessor shows up?
>>
>> The function comments say:
>>
>>  *
>>  * Has to be done before any concurrent usage..
>>  *
>>  * No barrier semantics.
>>
>>
>> > (If not, one could (a) do an unlocked ptr->value=0, then the atomic
>> > write, or (b) revert and improve the suppression.)  I don't doubt
>it's
>> > fine for the ways PostgreSQL uses atomics today, which generally
>> > initialize an atomic before the concurrent-accessor processes even
>> > exist.
>>
>> I think it's unlikely that there are cases where you could safely
>> initialize the atomic without needing some form of synchronization
>> before it can be used. If a barrier were needed, what'd guarantee the
>> concurrent access happened after the initialization in the first
>place?
>
>Suppose the initializing process does:
>
>  pg_atomic_init_u64(&somestruct->atomic, 123);
>  somestruct->atomic_ready = true;
>
>In released versions, any process observing atomic_ready==true will
>observe
>the results of the pg_atomic_init_u64().  After the commit from this
>thread,
>that's no longer assured.

Why did that hold true before? There wasn't a barrier in platforms already (wherever we know what 64 bit reads/writes
havesingle copy atomicity). 

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: valgrind versus pg_atomic_init()

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On June 16, 2020 8:24:29 PM PDT, Noah Misch <noah@leadboat.com> wrote:
>> Suppose the initializing process does:
>>
>> pg_atomic_init_u64(&somestruct->atomic, 123);
>> somestruct->atomic_ready = true;
>>
>> In released versions, any process observing atomic_ready==true will
>> observe
>> the results of the pg_atomic_init_u64().  After the commit from this
>> thread,
>> that's no longer assured.

> Why did that hold true before? There wasn't a barrier in platforms already (wherever we know what 64 bit reads/writes
havesingle copy atomicity). 

I'm confused as to why this is even an interesting discussion.  If the
timing is so tight that another process could possibly observe partially-
initialized state in shared memory, how could we have confidence that the
other process doesn't look before we've initialized the atomic variable or
spinlock at all?

I think in practice all we need depend on in this area is that fork()
provides a full memory barrier.

            regards, tom lane



Re: valgrind versus pg_atomic_init()

От
Noah Misch
Дата:
On Tue, Jun 16, 2020 at 08:35:58PM -0700, Andres Freund wrote:
> On June 16, 2020 8:24:29 PM PDT, Noah Misch <noah@leadboat.com> wrote:
> >Suppose the initializing process does:
> >
> >  pg_atomic_init_u64(&somestruct->atomic, 123);
> >  somestruct->atomic_ready = true;
> >
> >In released versions, any process observing atomic_ready==true will
> >observe
> >the results of the pg_atomic_init_u64().  After the commit from this
> >thread,
> >that's no longer assured.
> 
> Why did that hold true before? There wasn't a barrier in platforms already (wherever we know what 64 bit reads/writes
havesingle copy atomicity).
 

You are right.  It didn't hold before.