Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity
Дата
Msg-id CAEepm=1vSunMb-sZ1wVYUuUYEj=Uu9qiUBUcnp=cdo7D6cZLbg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity  (Andreas Seltenreich <seltenreich@gmx.de>)
Ответы Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity  (Andreas Seltenreich <seltenreich@gmx.de>)
Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity  (Thomas Munro <thomas.munro@enterprisedb.com>)
Список pgsql-hackers
On Wed, Dec 28, 2016 at 11:38 AM, Andreas Seltenreich
<seltenreich@gmx.de> wrote:
> Thomas Munro writes:
>
>> On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>> On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote:
>>>> testing master as of fe591f8bf6 produced a crash reading
>>>> pg_stat_activity (backtrace below).  Digging around with with gdb
>>>> revealed that pgstat_get_wait_event() returned an invalid pointer for a
>>>> classId PG_WAIT_LWLOCK.
>>>>
>>>> I think the culprit is dsa.c passing a pointer to memory that goes away
>>>> on dsa_free() as a name to LWLockRegisterTranche.
> [..]
>>> Maybe we should replace it with another value when the DSA area is
>>> detached, using a constant string.  Something like
>
> I'm wondering: Is it safe to pass a pointer into a DSA at all?  If I
> understand the comments correctly, they are not necessarily mapped (at
> the same address) in an unrelated backend looking into pg_stat_activity,
> and in this case a dsa_free() is not actually needed to trigger a crash.

It is safe, as long as the segment remains mapped.  Each backend that
attaches calls LWLockRegisterTranche giving it the address of the name
in its virtual address space.  The problem is that after the segment
is unmapped, that address is now garbage, which is clearly a bug.
That's why I started talking about how to 'unregister' it (or register
a replacement constant name).  We have some code that always runs
before the control segment containing the name is detached, so that'd
be the perfect place to do that.

>> That line of thinking suggests another potential solution: go and
>> register the name in RegisterLWLockTranches, and stop registering it
>> in dsa.c.  For other potential uses of DSA including extensions that
>> call LWLockNewTrancheId() we'd have to decide whether to make the name
>> an optional argument, or require those users to register it
>> themselves.
>
> Maybe instead of copying the name, just put the passed pointer itself
> into the area?  Extensions using LWLockNewTrancheId need to use
> shared_preload_libraries anyway, so static strings would be mapped in
> all backends.

Yeah that would be another way.  I had this idea that only the process
that creates a DSA area should name it, and then processes attaching
would see the existing tranche ID and name, so could use a narrower
interface.  We could instead do as you say and make processes that
attach provide a pointer to the name too, and make it the caller's
problem to ensure that the pointers remain valid long enough; or go
one step further and make them register/unregister it themselves.

But I'm starting to think that the best way might be to do BOTH of the
things I said in my previous message: make dsa.c register on
create/attach and also unregister before detaching iff the name was
supplied at creation time for the benefit of extension writers, but
make it not do anything at all about tranche name
registration/unregistration if NULL was passed in at create time.
Then register this particular tranche (LWTRANCHE_PARALLEL_QUERY_DSA)
in every process in RegisterLWLockTranches.  That way, you'd get a
useful name in pg_stat_activity for other backends that are running
parallel queries if they are ever waiting for these locks (unlikely
but interesting to know abotu if it happens).

-- 
Thomas Munro
http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Merlin Moncure
Дата:
Сообщение: Re: [HACKERS] merging some features from plpgsql2 project
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers