Thread: [HACKERS] Error: dsa_area could not attach to a segment that has been freed


[HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Gaddam Sai Ram
Date:
Hello everyone, 

Based on the discussion in the thread below, I built an extension using DSA (PostgreSQL 10 beta 3, Linux machine).

> Use _PG_init and the shmem hook to reserve a little bit of
> traditional shared memory and initialise it to zero.  This will be
> used just to share the DSA handle, but you can't actually create the
> DSA area in postmaster.  In other words, this little bit of shared
> memory is for "discovery", since it can be looked up by name from any
> backend.
Yes, I have reserved shared memory for the DSA handle, but not for the actual DSA area.

> In each backend that wants to use your new in-memory index system,
> you need to be able to attach or create the DSA area on-demand.
> Perhaps you could have a get_my_shared_state() function (insert better
> name) that uses a static local variable to hold a pointer to some
> state.  If it's NULL, you know you need to create the state.  That
> should happen only once in each backend, the first time through the
> function.  In that case you need to create or attach to the DSA area
> as appropriate, which you should wrap in
> LWLockAcquire(AddinShmemInitLock,
> LW_EXCLUSIVE)/LWLockRelease(AddinShmemInitLock) to serialise the code
> block.  First, look up the bit of traditional shared memory to see if
> there is a DSA handle published in it already.  If there is you can
> attach.  If there isn't, you are the first so you need to create, and
> publish the handle for others to attach to.  Remember whatever state
> you need to remember, such as the dsa_area, in static local variables
> so that all future calls to get_my_shared_state() in that backend will
> be fast.
Yes, this code is present in gstore_shmem.c (please find attached). The first process to use DSA creates the area; every later process either attaches to it or, if it has already attached, reuses the DSA area that it has already pinned.


==> I create a bgworker in _PG_init. When it starts, it is the first process to access DSA, so it creates the DSA area.
==> I have a small UDF (simple_udf_func) which I call in a new backend (process), so it attaches to the DSA area that was already created.
==> When I call the same UDF again in the same process, the area is already attached and pinned, so I reuse the area pointer that I stored in a global variable while attaching/creating. This is where I get the problem...
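
The create-or-attach flow described above can be modelled in a few lines of standalone C. This is only a toy sketch under stated assumptions, not the code from gstore_shmem.c and not the PostgreSQL API: ToyDiscovery stands in for the small named shared-memory slot, ToyArea for the backend-local area object, and ToyBackend for one process's static variables. All of these names are hypothetical. A real extension would wrap the slow path in LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE)/LWLockRelease(AddinShmemInitLock), as the quoted advice says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only: every name here is hypothetical, not PostgreSQL's API. */
typedef unsigned int toy_handle;        /* 0 means "nothing published yet" */

typedef struct ToyDiscovery             /* models the small fixed shmem slot */
{
    toy_handle published;
} ToyDiscovery;

typedef struct ToyArea                  /* models the backend-local area object */
{
    toy_handle handle;
    bool       created_here;
} ToyArea;

typedef struct ToyBackend               /* models one process's static state */
{
    ToyArea  area;
    ToyArea *cached;                    /* NULL until the first call */
    int      slow_path_calls;
} ToyBackend;

static toy_handle toy_next_handle = 1;

ToyArea *
toy_get_shared_state(ToyBackend *me, ToyDiscovery *slot)
{
    assert(me != NULL && slot != NULL);

    if (me->cached != NULL)
        return me->cached;              /* fast path: already set up */

    /* A real extension would acquire the lock here. */
    me->slow_path_calls++;
    if (slot->published == 0)
    {
        /* First process: create the area and publish its handle. */
        me->area.handle = toy_next_handle++;
        me->area.created_here = true;
        slot->published = me->area.handle;
    }
    else
    {
        /* Later process: attach using the published handle. */
        me->area.handle = slot->published;
        me->area.created_here = false;
    }
    /* ... and release the lock here. */

    me->cached = &me->area;
    return me->cached;
}
```

In the toy, the cached pointer stays valid because it points into ToyBackend itself. The rest of this thread is about what happens when the cached object is instead allocated somewhere with a shorter lifetime.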

Error details: dsa_area could not attach to a segment that has been freed

While examining this in detail, I used dsa_dump() for debugging and found that in the error case I get this log:

dsa_area handle 1:
  max_total_segment_size: 0
  total_segment_size: 0
  refcnt: 0
  pinned: f
  segment bins:
  segment bin 0 (at least -2147483648 contiguous pages free):


Clearly, the data in my DSA area has been corrupted in the latter case, but my bgworker continues to work properly with the same dsa_area handle.

At this stage, the dsa_dump() output in my bgworker is as below:

dsa_area handle 1814e630:
  max_total_segment_size: 18446744073709551615
  total_segment_size: 1048576
  refcnt: 3
  pinned: t
  segment bins:
    segment bin 8 (at least 128 contiguous pages free):
      segment index 0, usable_pages = 253, contiguous_pages = 220, mapped at 0x7f0abbd58000

As I'm pinning the DSA mapping after attaching, it has to stay throughout the backend session, but I'm not sure why it is freed/corrupted.

Kindly help me fix this issue. I have attached a copy of my extension, which reproduces the issue.


Regards
G. Sai Ram





Attachments

[HACKERS] Re: Error: dsa_area could not attach to a segment that has been freed

From
Gaddam Sai Ram
Date:
Kindly help me with the above thread.

Thanks
G. Sai Ram


---- On Fri, 15 Sep 2017 13:21:33 +0530 Gaddam Sai Ram <gaddamsairam.n@zohocorp.com> wrote ----


Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Thomas Munro
Date:
On Fri, Sep 15, 2017 at 7:51 PM, Gaddam Sai Ram
<gaddamsairam.n@zohocorp.com> wrote:
> As i'm pinning the dsa mapping after attach, it has to stay through out the
> backend session. But not sure why its freed/corrupted.
>
> Kindly help me in fixing this issue. Attached the copy of my extension,
> which will reproduce the same issue.

Your DSA area is pinned and the mapping is pinned, but there is one
more thing that goes away automatically unless you nail it to the
table: the backend-local dsa_area object which dsa_create() and
dsa_attach() return.  That's allocated in the "current memory
context", so if you do it from your procedure simple_udf_func without
making special arrangements it gets automatically freed at end of
transaction.  If you're going to cache it for the whole life of the
backend, you'd better make sure it's allocated in memory context that
lives long enough.  Where you have dsa_create() and dsa_attach()
calls, try coding like this:

    MemoryContext old_context;

    old_context = MemoryContextSwitchTo(TopMemoryContext);
    area = dsa_create(...);
    MemoryContextSwitchTo(old_context);

    old_context = MemoryContextSwitchTo(TopMemoryContext);
    area = dsa_attach(...);
    MemoryContextSwitchTo(old_context);

You'll need to #include "utils/memutils.h".
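
To make the lifetime issue concrete, here is a minimal toy model in standalone C. It is a sketch under stated assumptions rather than PostgreSQL code: ToyContext, toy_switch_to, toy_alloc, toy_reset and toy_attach are hypothetical stand-ins for a memory context, MemoryContextSwitchTo, palloc, the end-of-transaction cleanup and dsa_attach respectively. The point it illustrates is only that an object allocated while a short-lived context is current disappears when that context is reset, while one allocated under the long-lived context survives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: these names are hypothetical, not PostgreSQL's API. */
typedef union ToyChunk
{
    union ToyChunk *next;       /* links all chunks owned by one context */
    max_align_t     align;      /* keeps the payload suitably aligned */
} ToyChunk;

typedef struct ToyContext
{
    ToyChunk *chunks;
    int       live;             /* number of chunks not yet freed */
} ToyContext;

static ToyContext *toy_current_context = NULL;

/* Make ctx current and return the previous one. */
ToyContext *
toy_switch_to(ToyContext *ctx)
{
    ToyContext *old = toy_current_context;

    toy_current_context = ctx;
    return old;
}

/* Allocate from whichever context is current. */
void *
toy_alloc(size_t size)
{
    ToyChunk *chunk;

    assert(toy_current_context != NULL);
    chunk = malloc(sizeof(ToyChunk) + size);
    if (chunk == NULL)
        return NULL;
    chunk->next = toy_current_context->chunks;
    toy_current_context->chunks = chunk;
    toy_current_context->live++;
    return chunk + 1;           /* payload starts right after the header */
}

/* Free everything the context owns, as happens at end of transaction. */
void
toy_reset(ToyContext *ctx)
{
    while (ctx->chunks != NULL)
    {
        ToyChunk *next = ctx->chunks->next;

        free(ctx->chunks);
        ctx->chunks = next;
    }
    ctx->live = 0;
}

/* Stand-in for an attach call: the result lives in the current context. */
int *
toy_attach(void)
{
    int *area = toy_alloc(sizeof(int));

    if (area != NULL)
        *area = 42;
    return area;
}

/* The pattern suggested above: attach while the long-lived context is current. */
int *
toy_attach_long_lived(ToyContext *top)
{
    ToyContext *old = toy_switch_to(top);
    int        *area = toy_attach();

    toy_switch_to(old);
    return area;
}
```

Calling toy_attach() directly while the per-transaction context is current and then caching the result in a global would leave a dangling pointer after toy_reset(), which is the toy analogue of the corrupted dsa_dump() output reported earlier in the thread.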

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Gaddam Sai Ram
Date:
Thank you very much! That fixed my issue! :)
I had assumed that pinning the area would extend its lifetime, but after taking the memory context into account it's working fine!

Regards
G. Sai Ram


---- On Wed, 20 Sep 2017 11:16:19 +0530 Thomas Munro <thomas.munro@enterprisedb.com> wrote ----


Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Thomas Munro
Date:
On Wed, Sep 20, 2017 at 6:14 PM, Gaddam Sai Ram
<gaddamsairam.n@zohocorp.com> wrote:
> Thank you very much! That fixed my issue! :)
> I was in an assumption that pinning the area will increase its lifetime but
> yeah after taking memory context into consideration its working fine!

So far the success rate in confusing people who first try to make
long-lived DSA areas and DSM segments is 100%.  Basically, this is all
designed to ensure automatic cleanup of resources in short-lived
scopes.

Good luck for your graph project.  I think you're going to have to
expend a lot of energy trying to avoid memory leaks if your DSA lives
as long as the database cluster, since error paths won't automatically
free any memory you allocated in it.  Right now I don't have any
particularly good ideas for mechanisms to deal with that.  PostgreSQL
C has exception-like error handling, but doesn't (and probably can't)
have a language feature like scoped destructors from C++.  IMHO
exceptions need either destructors or garbage collection to keep you
sane.  There is a kind of garbage collection for palloc'd memory and
also for other resources like file handles, but if you're using a big
long lived DSA area you have nothing like that.  You can use
PG_TRY/PG_CATCH very carefully to clean up, or (probably better) you
can try to make sure that all your interaction with shared memory is
no-throw (note that that means using dsa_allocate_extended(x,
DSA_ALLOC_NO_OOM), because dsa_allocate itself can raise errors). The
first thing I'd try would probably be to keep all shmem-allocating
code in as few routines as possible, and use only no-throw operations
in the 'critical' regions of them, and maybe look into some kind of
undo log of things to free or undo in case of error to manage
multi-allocation operations if that turned out to be necessary.
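
The "no-throw allocation plus undo log" idea can be sketched in standalone C as follows. This is a toy illustration under stated assumptions, not PostgreSQL code: toy_alloc_no_throw is a hypothetical stand-in for an allocator that reports failure by returning NULL instead of raising an error (the role dsa_allocate_extended with DSA_ALLOC_NO_OOM plays in the advice above), and ToyUndoLog, ToyNode and toy_fail_after are invented for the example. toy_fail_after is a test hook that forces a failure after a given number of successful allocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_UNDO_MAX 8

/* Toy model only: every name here is hypothetical. */
typedef struct ToyUndoLog
{
    void *ptrs[TOY_UNDO_MAX];   /* blocks to free if the operation fails */
    int   count;
} ToyUndoLog;

typedef struct ToyNode
{
    int *key;
    int *value;
} ToyNode;

static int toy_fail_after = -1; /* test hook: negative means never fail */
static int toy_live_blocks = 0; /* blocks currently allocated */

/* Allocator that never "throws": failure is reported as NULL. */
void *
toy_alloc_no_throw(size_t size)
{
    void *p;

    if (toy_fail_after == 0)
        return NULL;
    if (toy_fail_after > 0)
        toy_fail_after--;
    p = malloc(size);
    if (p != NULL)
        toy_live_blocks++;
    return p;
}

void
toy_free(void *p)
{
    assert(p != NULL);
    free(p);
    toy_live_blocks--;
}

/* Allocate and record the block so that it can be undone later. */
static void *
toy_alloc_logged(ToyUndoLog *log, size_t size)
{
    void *p;

    if (log->count == TOY_UNDO_MAX)
        return NULL;
    p = toy_alloc_no_throw(size);
    if (p != NULL)
        log->ptrs[log->count++] = p;
    return p;
}

/* Free everything recorded so far, newest first. */
static void
toy_undo(ToyUndoLog *log)
{
    while (log->count > 0)
        toy_free(log->ptrs[--log->count]);
}

/* A multi-allocation operation that either fully succeeds or leaks nothing. */
ToyNode *
toy_make_node(int key, int value)
{
    ToyUndoLog log = {{NULL}, 0};
    ToyNode   *node = toy_alloc_logged(&log, sizeof(ToyNode));
    int       *k = toy_alloc_logged(&log, sizeof(int));
    int       *v = toy_alloc_logged(&log, sizeof(int));

    if (node == NULL || k == NULL || v == NULL)
    {
        toy_undo(&log);
        return NULL;
    }
    *k = key;
    *v = value;
    node->key = k;
    node->value = v;
    return node;                /* success: the log is simply discarded */
}
```

Keeping all allocation inside one routine like toy_make_node means there is a single place where partial work has to be undone, which is the "as few routines as possible" suggestion in miniature.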

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Craig Ringer
Date:
On 20 September 2017 at 16:55, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Wed, Sep 20, 2017 at 6:14 PM, Gaddam Sai Ram
> <gaddamsairam.n@zohocorp.com> wrote:
>> Thank you very much! That fixed my issue! :)
>> I was in an assumption that pinning the area will increase its lifetime but
>> yeah after taking memory context into consideration its working fine!
>
> So far the success rate in confusing people who first try to make
> long-lived DSA areas and DSM segments is 100%.  Basically, this is all
> designed to ensure automatic cleanup of resources in short-lived
> scopes.

90% ;)

I got it working with no significant issues for a long lived segment used to store a pool of shm_mq pairs used for a sort of "connection listener" bgworker. Though I only used DSM+ToC, not DSA. But TBH that may well be luck, as I tend to routinely use memory contexts scoped to the operational lifetime of a subsystem, making most problems like this just vanish without my realising they were there in the first place. Usually.

I pretty much shamelessly cribbed from test_shm_mq for the ToC stuff though. It's simple enough when you read it in use, but I'd be lucky to do it without an example.

I had lots more problems with shm_mq than DSM. shm_mq is very obviously designed for short-lived scopes, and falls down badly if you have a pool of queues you want to re-use after the peer detaches. You have to track "in use" flags separately to the shm_mq's own, because it doesn't clear its stored PGPROC entries for receiver/sender on detach. Once you know neither sender nor receiver is still attached, you can memset() the area and create a new queue in it.

You can't just reset the queue for a new peer; you have to do quite a dance to make sure it's safe to detach from, overwrite, re-create and re-attach to.

> Good luck for your graph project.  I think you're going to have to
> expend a lot of energy trying to avoid memory leaks if your DSA lives
> as long as the database cluster, since error paths won't automatically
> free any memory you allocated in it.

Yeah, that's going to be hard. You might land up having lots and lots of little DSM segments.

> There is a kind of garbage collection for palloc'd memory and
> also for other resources like file handles, but if you're using a big
> long lived DSA area you have nothing like that.

We need, IMO, a DSA-backed hierarchical MemoryContext system.

We can't use the exact MemoryContext API as-is due to the need for far pointers though :(

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Craig Ringer
Date:
On 20 September 2017 at 17:52, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 20 September 2017 at 16:55, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>> On Wed, Sep 20, 2017 at 6:14 PM, Gaddam Sai Ram
>> <gaddamsairam.n@zohocorp.com> wrote:
>>> Thank you very much! That fixed my issue! :)
>>> I was in an assumption that pinning the area will increase its lifetime but
>>> yeah after taking memory context into consideration its working fine!
>>
>> So far the success rate in confusing people who first try to make
>> long-lived DSA areas and DSM segments is 100%.  Basically, this is all
>> designed to ensure automatic cleanup of resources in short-lived
>> scopes.
>
> 90% ;)
>
> I got it working with no significant issues for a long lived segment used to store a pool of shm_mq pairs used for a sort of "connection listener" bgworker. Though I only used DSM+ToC, not DSA.


By the way, dsa.c really needs a cross-reference to shm_toc.c and vice versa. With a hint as to when each is appropriate. 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Robert Haas
Date:
On Wed, Sep 20, 2017 at 5:54 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> By the way, dsa.c really needs a cross-reference to shm_toc.c and vice
> versa. With a hint as to when each is appropriate.

/me blinks.

Aren't those almost-entirely-unrelated facilities?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 20 September 2017 at 16:55, Thomas Munro <thomas.munro@enterprisedb.com>
> wrote:
>> There is a kind of garbage collection for palloc'd memory and
>> also for other resources like file handles, but if you're using a big
>> long lived DSA area you have nothing like that.

> We need, IMO, a DSA-backed hierarchical MemoryContext system.

Perhaps the ResourceManager subsystem would help here.
        regards, tom lane



Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Thomas Munro
Date:
On Thu, Sep 21, 2017 at 12:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Sep 20, 2017 at 5:54 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> By the way, dsa.c really needs a cross-reference to shm_toc.c and vice
>> versa. With a hint as to when each is appropriate.
>
> /me blinks.
>
> Aren't those almost-entirely-unrelated facilities?

I think I see what Craig means.

1.  A DSM segment works if you know how much space you'll need up
front so that you can size it. shm_toc provides a way to exchange
pointers into it with other backends in the form of shm_toc keys
(perhaps implicitly, in the form of well known keys or a convention
like executor node ID -> shm_toc key).  Examples: Fixed sized state
for parallel-aware executor nodes, and fixed size parallel executor
infrastructure.

2.  A DSA area is good if you don't know how much space you'll need
yet.  dsa_pointer provides a way to exchange pointers into it with
other backends.  Examples: A shared cache, an in-memory database
object like Gaddam Sai Ram's graph index extension, variable sized
state for parallel-aware executor nodes, the shared record typmod
registry stuff.

Perhaps confusingly we also support DSA areas inside DSM segments,
and there are DSM segments inside DSA areas.  We also use DSM segments as
a kind of shared resource cleanup mechanism, and don't yet provide an
equivalent for DSA.  I haven't proposed anything like that because I
feel like there may be a better abstraction of reliable scoped cleanup
waiting to be discovered (as I think Craig was also getting at).
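
As a rough illustration of point 1, the following standalone C sketch models a fixed-size segment with a small table of contents that maps keys to offsets. It is a toy under stated assumptions and not the shm_toc API: ToySegment, toy_toc_allocate and toy_toc_lookup are hypothetical names, and the toy merges "allocate" and "register under a key" into one call for brevity. The idea it shows is that backends exchange keys (or offsets) rather than raw addresses, because the same segment may be mapped at a different address in each process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SEGMENT_SIZE 256
#define TOY_MAX_ENTRIES  4

/* Toy model only: every name here is hypothetical. */
typedef struct ToyTocEntry
{
    uint64_t key;               /* well-known key agreed on by all parties */
    size_t   offset;            /* position of the chunk inside data[] */
} ToyTocEntry;

typedef struct ToySegment
{
    size_t        used;         /* bytes of data[] handed out so far */
    int           nentries;
    ToyTocEntry   entries[TOY_MAX_ENTRIES];
    unsigned char data[TOY_SEGMENT_SIZE];   /* size fixed up front */
} ToySegment;

void
toy_segment_init(ToySegment *seg)
{
    assert(seg != NULL);
    memset(seg, 0, sizeof(*seg));
}

/* Carve out a chunk and register it under key; NULL if it does not fit. */
void *
toy_toc_allocate(ToySegment *seg, uint64_t key, size_t size)
{
    size_t rounded = (size + 7) & ~(size_t) 7;  /* keep chunks 8-byte spaced */
    void  *chunk;

    if (seg->nentries == TOY_MAX_ENTRIES ||
        rounded > TOY_SEGMENT_SIZE - seg->used)
        return NULL;
    seg->entries[seg->nentries].key = key;
    seg->entries[seg->nentries].offset = seg->used;
    seg->nentries++;
    chunk = seg->data + seg->used;
    seg->used += rounded;
    return chunk;
}

/* Translate a key into an address valid for this particular mapping. */
void *
toy_toc_lookup(ToySegment *seg, uint64_t key)
{
    int i;

    for (i = 0; i < seg->nentries; i++)
    {
        if (seg->entries[i].key == key)
            return seg->data + seg->entries[i].offset;
    }
    return NULL;
}
```

Copying the whole ToySegment to another location with memcpy and looking the key up there still finds the stored value, which models a second backend mapping the segment at a different address. A dsa_pointer serves a similar purpose for point 2, where the total size is not known in advance.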

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Craig Ringer
Date:
On 21 September 2017 at 05:50, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Thu, Sep 21, 2017 at 12:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Sep 20, 2017 at 5:54 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>>> By the way, dsa.c really needs a cross-reference to shm_toc.c and vice
>>> versa. With a hint as to when each is appropriate.
>>
>> /me blinks.
>>
>> Aren't those almost-entirely-unrelated facilities?
>
> I think I see what Craig means.
>
> 1.  A DSM segment works if you know how much space you'll need up
> front so that you can size it. shm_toc provides a way to exchange
> pointers into it with other backends in the form of shm_toc keys
> (perhaps implicitly, in the form of well known keys or a convention
> like executor node ID -> shm_toc key).  Examples: Fixed sized state
> for parallel-aware executor nodes, and fixed size parallel executor
> infrastructure.
>
> 2.  A DSA area is good if you don't know how much space you'll need
> yet.  dsa_pointer provides a way to exchange pointers into it with
> other backends.  Examples: A shared cache, an in-memory database
> object like Gaddam Sai Ram's graph index extension, variable sized
> state for parallel-aware executor nodes, the shared record typmod
> registry stuff.
>
> Perhaps confusingly we also support DSA areas inside DSM segments,
> and there are DSM segments inside DSA areas.  We also use DSM segments as
> a kind of shared resource cleanup mechanism, and don't yet provide an
> equivalent for DSA.  I haven't proposed anything like that because I
> feel like there may be a better abstraction of reliable scoped cleanup
> waiting to be discovered (as I think Craig was also getting at).

Well said, and what I would've wanted to say if I could've figured it out well enough to express it.

Hence needing some kind of README or cross reference to help people know which facility/facilities are suitable for their needs... and actually discover them.

(A hint on RequestAddinShmemSpace etc pointing to DSM + DSA would be good too)

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

From
Gaddam Sai Ram
Date:
Hi Thomas,
      Thanks for cautioning us about possible memory leaks (in error cases) with long-lived DSA segments.

      We are in fact following an approach to avoid these DSA memory leaks. Let me explain our implementation; please validate it and correct us in case we have missed anything.

      Implementation:

      Basically, we have to put our index data (index column value vs. ctid), which we get in the aminsert callback function, into memory.

      In the aminsert callback function:
  • We switch to CurTransactionContext.
  • We cache the DMLs of a transaction in a dlist (global per process).
  • Even if different clients work in parallel, it won't be a problem, because every client gets its own dlist in a separate process, with its own CurTransactionContext.
  • We have registered a transaction callback (using RegisterXactCallback()). At the pre-commit event (XACT_EVENT_PRE_COMMIT), we populate all the transaction-specific DMLs (from the dlist) into our in-memory index (DSA), inside a PG_TRY/PG_CATCH block.
  • If we get an error (from dsa_allocate() or something else) while processing the dlist (while populating the in-memory index), we clean up, in the PG_CATCH block, the DSA memory that was allocated/used up to that point.
  • In other error cases, the transaction typically gets aborted and the PRE_COMMIT event is never fired, so we don't touch DSA at all and need not worry about leaks.
  • Subtransactions are handled too, with subtransaction callbacks.
  • CurTransactionContext (and with it the dlist) is automatically cleared after that particular transaction.
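
The buffering scheme in the list above can be modelled with a small standalone C sketch. It is a toy under stated assumptions, not the extension's code: ToyPending stands in for the per-transaction dlist, ToyIndex for the long-lived in-memory index, and toy_pre_commit and toy_abort for the work done at pre-commit and at abort. All names are hypothetical, and a fixed capacity plays the role of an allocation failure.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_INDEX_CAPACITY 4
#define TOY_PENDING_MAX    8

/* Toy model only: every name here is hypothetical. */
typedef struct ToyIndex         /* models the long-lived shared index */
{
    int keys[TOY_INDEX_CAPACITY];
    int count;
} ToyIndex;

typedef struct ToyPending       /* models the per-transaction list */
{
    int keys[TOY_PENDING_MAX];
    int count;
} ToyPending;

/* Called per inserted row: only transaction-local memory is touched. */
bool
toy_buffer_insert(ToyPending *pending, int key)
{
    if (pending->count == TOY_PENDING_MAX)
        return false;
    pending->keys[pending->count++] = key;
    return true;
}

/* Abort path: the shared index was never touched, so just drop the list. */
void
toy_abort(ToyPending *pending)
{
    pending->count = 0;
}

/* Pre-commit path: apply all buffered keys, or none of them. */
bool
toy_pre_commit(ToyIndex *index, ToyPending *pending)
{
    int saved_count = index->count;
    int i;

    assert(index->count <= TOY_INDEX_CAPACITY);
    for (i = 0; i < pending->count; i++)
    {
        if (index->count == TOY_INDEX_CAPACITY)
        {
            index->count = saved_count; /* undo the partial work */
            pending->count = 0;
            return false;
        }
        index->keys[index->count++] = pending->keys[i];
    }
    pending->count = 0;
    return true;
}
```

The design choice this highlights is that the shared structure is modified in exactly one place, so there is exactly one place where partial work must be rolled back.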

I want to know if this approach is good and works well in all cases. Kindly provide your feedback on this.

Regards
G. Sai Ram


---- On Wed, 20 Sep 2017 14:25:43 +0530 Thomas Munro <thomas.munro@enterprisedb.com> wrote ----
