Thread: windows 8 RTM compatibility issue (could not reserve shared memory region for child)

windows 8 RTM compatibility issue (could not reserve shared memory region for child)

From
Dave Vitek
Date:
Hello pgsql-bugs list,

I have attached a patch file that I believe resolves a compatibility
issue between Windows 8 RTM and PostgreSQL.  The impatient might want to
just read the patch; this email is longer than it probably should be.  I
have CC'd Seiko Ishida who expressed an interest in Windows 8
compatibility on this list about a year ago.

We test postgres pretty heavily at my place of work (probably thousands
of DBs created and exercised each day) on a number of platforms.  We've
been doing compatibility testing with the Windows 8 previews and
everything has been working well.  We are using the latest postgres release.

However, last week we upgraded from a preview version to the RTM version
of Windows 8 x64, and it is clear that something changed. Since
upgrading, we have been getting this error message a few times a day.
Still very rare, but it never happened before the upgrade.

LOG:  could not reserve shared memory region (addr=0000000001410000) for child
0000000000000F8C: 487
LOG:  could not fork new process for connection: A blocking operation was
interrupted by a call to WSACancelBlockingCall.


This corresponds to VirtualAllocEx failing with ERROR_INVALID_ADDRESS
inside win32_shmem.c (search for the error message).

Postgres uses a shared memory block to do much of its IPC.  This shared
memory block presumably stores pointers to itself, and so must be
allocated at the same address inside every postgres process.  In order
to maximize the probability that this address will be available in child
processes, the address should be reserved as early as possible in the
lifetime of the child process (before the address space gets polluted).
In order to achieve this goal, the postmaster starts its children in a
suspended state and reserves the address before any code has executed in
the child process.
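
To illustrate why the base address matters, here is a minimal Python
sketch (a model, not PostgreSQL code).  It treats each "process" as
nothing more than a base address for the same segment and contrasts an
absolute pointer with a base-relative offset.  All names and addresses
other than 0x01410000 are hypothetical.

```python
# Illustrative model only: a "process" is just the base address at which
# the shared segment happens to be mapped.  Names are hypothetical.

SEGMENT_SIZE = 0x10000


def in_segment(base: int, addr: int) -> bool:
    """Return True if addr falls inside the segment mapped at base."""
    return base <= addr < base + SEGMENT_SIZE


def resolve_offset(base: int, offset: int) -> int:
    """Turn a base-relative offset into an address valid in this mapping."""
    return base + offset


postmaster_base = 0x01410000   # where the postmaster mapped the segment
child_base = 0x02000000        # a child that could not get the same address

# A reference to an object 0x200 bytes into the segment, stored two ways.
absolute_pointer = postmaster_base + 0x200   # only meaningful at postmaster_base
relative_offset = 0x200                      # meaningful at any base

# The absolute pointer is fine in the postmaster but dangling in the child.
print(in_segment(postmaster_base, absolute_pointer))
print(in_segment(child_base, absolute_pointer))

# The offset resolves inside the segment regardless of the mapping address.
print(in_segment(child_base, resolve_offset(child_base, relative_offset)))
```

If the segment stores absolute pointers, every process has to map it at
the same base; offsets would avoid that requirement at the cost of
touching every place that dereferences into the segment.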

However, there are a bunch of chunks of the virtual address space
already reserved even when the child process is in this suspended
state.  At least some of them are memory mapped images of binaries
(duh).  I believe VirtualAllocEx is failing because something is already
mapped (in the child) to the address the postmaster wants the shared
memory segment to live at.
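
The suspected failure mode is an interval overlap.  The following
Python sketch shows the check under stated assumptions: the region list
is a made-up snapshot of a suspended child's address space, and the
32 MB segment size is only an example.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    base: int
    size: int
    label: str


def overlapping(regions: List[Region], addr: int, size: int) -> List[Region]:
    """Return the regions that intersect the half-open range [addr, addr + size)."""
    end = addr + size
    return [r for r in regions if r.base < end and addr < r.base + r.size]


# Hypothetical snapshot of what is already mapped in a suspended child.
child_map = [
    Region(0x00400000, 0x00200000, "image A"),
    Region(0x01400000, 0x00020000, "mapped file"),
    Region(0x7FFE0000, 0x00001000, "system page"),
]

# The address from the log message, with an example 32 MB segment.
hits = overlapping(child_map, 0x01410000, 32 * 1024 * 1024)
print([r.label for r in hits])
```

Any non-empty result means a reservation at that address would be
expected to fail in that child.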

I wrote a small program that repeatedly starts postgres.exe in suspended
mode and then tries to VirtualAllocEx 0x1410000.  The address is never
blocked on Windows 7, but is blocked 2% of the time on Windows 8.  I
attached windbg to the troublesome postgres process and used "!vadump
-v" to see that there is a file mapped to the contentious address while
postgres is in the suspended state.  I don't know if the failure rate is
this bad for all addresses or just this one, but the possibility of
conflict exists, since the postmaster was willing to use this address in
at least one run.
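
As a rough worked example of what a 2% per-spawn failure rate implies,
the Python sketch below assumes that every spawn is independent and
that the 2% figure applies to each one.  Neither assumption is
established by the measurement above, which covers a single address.

```python
def p_at_least_one_failure(p_fail: float, spawns: int) -> float:
    """Probability that at least one of `spawns` independent attempts fails."""
    return 1.0 - (1.0 - p_fail) ** spawns


def expected_failures(p_fail: float, spawns: int) -> float:
    """Expected number of failed attempts out of `spawns`."""
    return p_fail * spawns


P_FAIL = 0.02   # rate observed by the test program for address 0x1410000

for n in (1, 10, 100):
    print(n, round(p_at_least_one_failure(P_FAIL, n), 4), expected_failures(P_FAIL, n))
```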

So why hasn't this ever happened before?  I'm guessing that ASLR got
better in the latest windows 8 patch, or maybe there's just more stuff
in the virtual address space of a newborn process.

The postmaster originally decides where to place the shared memory
segment by letting Windows (MapViewOfFileEx) choose where to put it.  So
if the postmaster ends up using address 0x1410000, and then the
postgres.exe image (for example) gets mapped to that same address in the
child, you'll end up with the error message above.

I assume Windows changed so that the addresses in use inside a newborn
process can now conflict with the addresses returned by
MapViewOfFileEx(..., NULL).  These sets must have been disjoint in
previous versions of Windows, and postgres was relying on that behavior.

One straightforward "fix" is to specify a hardcoded address to
MapViewOfFileEx instead of NULL.  This address should be carefully
selected such that it is in an area disjoint from the portions of the
address space that are potentially reserved in a newborn process, and
also unlikely to be in use inside the postmaster when it first maps the
shared memory.  This is pretty trivial to do for a particular
version/configuration of Windows.  However, I see no future-proof
solution (besides making the shared segment position independent).  If
the hardcoded address is not available, you can always fall back on the
current behavior.
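
The "preferred address, then fall back" control flow can be sketched in
Python as follows.  This is not the attached patch: `map_view` is a
hypothetical stand-in for a MapViewOfFileEx call, and the preferred
address is an arbitrary example.

```python
from typing import Callable, Optional, Set

# Hypothetical stand-in for the mapping call: takes a requested base address
# (or None to let the OS choose) and returns the mapped address, or None on
# failure.
MapFn = Callable[[Optional[int]], Optional[int]]


def map_segment(map_view: MapFn, preferred: int) -> Optional[int]:
    """Try the preferred address first, then fall back to an OS-chosen one."""
    addr = map_view(preferred)
    if addr is None:
        addr = map_view(None)
    return addr


def make_fake_mapper(busy: Set[int], os_choice: int) -> MapFn:
    """Build a fake mapper in which the addresses in `busy` are unavailable."""
    def fake(requested: Optional[int]) -> Optional[int]:
        if requested is None:
            return os_choice
        return None if requested in busy else requested
    return fake


PREFERRED = 0x1_0000_0000   # hypothetical example address

# Preferred address free: it is used.
print(hex(map_segment(make_fake_mapper(set(), 0x01410000), PREFERRED)))
# Preferred address taken: fall back to whatever the OS picks.
print(hex(map_segment(make_fake_mapper({PREFERRED}, 0x01410000), PREFERRED)))
```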

On 64-bit versions of Windows, processes that do not use more than 4 GB or
so of address space seem to always have a huge hole from about
0x0000000080000000 to 0x0000070000000000 (roughly 2 GB to 7 TB).  Note
that you cannot reserve addresses above 8 TB, so the segment would need
to go somewhere in this hole; above 4 GB is probably preferable.

32-bit Windows 8 also exists.  We haven't been testing on it, and so I
can't confirm that the problem exists there.  Assuming it does, 32-bit
processes are likely to be trickier since address space is more scarce.
In practice, it appears that there is usually a big hole from 0x10000000
to 0x70000000.
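
A candidate address can be checked against the holes quoted above with
a few lines of Python.  The hole boundaries are the experimentally
observed values from this message, not guarantees, and the 64 KB
alignment is an assumption about the usual Windows allocation
granularity.

```python
from typing import Tuple

ALLOCATION_GRANULARITY = 0x10000   # 64 KB; assumed typical value

# Approximate holes quoted in the message; observed, not guaranteed.
HOLE_64 = (0x0000_0000_8000_0000, 0x0000_0700_0000_0000)
HOLE_32 = (0x1000_0000, 0x7000_0000)

MB = 1 << 20
GB = 1 << 30


def fits_in_hole(base: int, size: int, hole: Tuple[int, int]) -> bool:
    """True if [base, base + size) is aligned and lies entirely inside the hole."""
    lo, hi = hole
    aligned = base % ALLOCATION_GRANULARITY == 0
    return aligned and lo <= base and base + size <= hi


print(fits_in_hole(4 * GB, 512 * MB, HOLE_64))        # candidate just above 4 GB
print(fits_in_hole(0x01410000, 512 * MB, HOLE_64))    # address from the log
print(fits_in_hole(0x6000_0000, 512 * MB, HOLE_32))   # runs past the 32-bit hole
```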

There is a security problem with the fix I outline above.  It bypasses
ASLR to a limited degree, since the shared memory would likely end up
always living at the same address.  I am not certain that MapViewOfFile
even tries to be unpredictable, but let's assume it does or will be someday.

This security problem can be addressed by adding a random number to the
hardcoded address.  Interfacing with a suitable entropy source/PRNG
might prove to be a PITA, but there is a way of avoiding that.  We can
invoke MapViewOfFile once with NULL in order to get a "random address"
and then sum the least significant bits of that with our hardcoded base
address to get the preferred address for the shared segment.  This way
we end up with an address that is no less secure than the one currently
returned by MapViewOfFile, insofar as MapViewOfFile doesn't select high
addresses.
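
The address arithmetic described above can be sketched in Python.  The
base address, the number of low bits kept, and the 64 KB alignment are
assumptions chosen for illustration; the attached patch may differ on
each of them.

```python
ALLOCATION_GRANULARITY = 0x10000   # 64 KB; assumed typical value


def preferred_address(random_addr: int, base: int, entropy_bits: int = 32) -> int:
    """Combine a fixed base with the low bits of an OS-chosen address."""
    low_mask = (1 << entropy_bits) - 1
    offset = random_addr & low_mask
    # Round down so the result stays on an allocation-granularity boundary.
    offset &= ~(ALLOCATION_GRANULARITY - 1)
    return base + offset


BASE = 0x1_0000_0000   # hypothetical base above 4 GB

# random_addr stands for the address returned by a throwaway mapping made
# with a NULL base address.
print(hex(preferred_address(0x01410000, BASE)))
print(hex(preferred_address(0x7FF0_1234_5678, BASE)))
```

Keeping fewer low bits narrows the range the segment can land in, which
trades unpredictability for a smaller chance of leaving the hole.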

I've attached a patch that implements the stuff above.  I can share the
code for the program that tests whether an address is reliably available
in a newborn postgres process, if anyone is interested.

- Dave Vitek

Attachments

Re: windows 8 RTM compatibility issue (could not reserve shared memory region for child)

From
Noah Misch
Date:
Hi Dave,

On Tue, Sep 04, 2012 at 11:45:47PM -0400, Dave Vitek wrote:
> LOG:  could not reserve shared memory region (addr=0000000001410000) for child
> 0000000000000F8C: 487
> LOG:  could not fork new process for connection: A blocking operation was
> interrupted by a call to WSACancelBlockingCall.

> So why hasn't this ever happened before?  I'm guessing that ASLR got
> better in the latest windows 8 patch, or maybe there's just more stuff
> in the virtual address space of a newborn process.

> One straightforward "fix" is to specify a hardcoded address to
> MapViewOfFileEx instead of NULL.  This address should be carefully
> selected such that it is in an area disjoint from the portions of the
> address space that are potentially reserved in a newborn process, and
> also unlikely to be in use inside the postmaster when it first maps the
> shared memory.  This is pretty trivial to do for a particular
> version/configuration of Windows.  However, I see no future-proof
> solution (besides making the shared segment position independent).  If
> the hardcoded address is not available, you can always fall back on the
> current behavior.

Given the strong dedication to backward-compatibility in Windows, I would
expect a way to bypass the new ASLR measures.  Some web searching suggests
linking postgres.exe with "/highentropyva:no" and/or "/dynamicbase:no" might
help, but nothing conclusive.  Thoughts?  That would be preferable to relying
on experimentally-derived safe addresses, which could cease to be safe after a
mere Windows update or similar.

> There is a security problem with the fix I outline above.  It bypasses
> ASLR to a limited degree, since the shared memory would likely end up
> always living at the same address.  I am not certain that MapViewOfFile
> even tries to be unpredictable, but let's assume it does or will be
> someday.

I wouldn't worry about it too much.  ASLR is a defense-in-depth measure; it
comes into play when your software already has a flaw and potentially reduces
the impact of that flaw.

> I've attached a patch that implements the stuff above.  I can share the
> code for the program that tests whether an address is reliably available
> in a newborn postgres process, if anyone is interested.

Great detective work.  Seeing that program could be helpful.

Thanks,
nm

--
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com

Re: Re: windows 8 RTM compatibility issue (could not reserve shared memory region for child)

From
Michael Paquier
Date:
On Fri, Jul 5, 2013 at 1:12 AM, Noah Misch <noah@leadboat.com> wrote:
> Hi Dave,
>
> On Tue, Sep 04, 2012 at 11:45:47PM -0400, Dave Vitek wrote:
>> LOG:  could not reserve shared memory region (addr=0000000001410000) for child
>> 0000000000000F8C: 487
>> LOG:  could not fork new process for connection: A blocking operation was
>> interrupted by a call to WSACancelBlockingCall.
>
>> So why hasn't this ever happened before?  I'm guessing that ASLR got
>> better in the latest windows 8 patch, or maybe there's just more stuff
>> in the virtual address space of a newborn process.
>
>> One straightforward "fix" is to specify a hardcoded address to
>> MapViewOfFileEx instead of NULL.  This address should be carefully
>> selected such that it is in an area disjoint from the portions of the
>> address space that are potentially reserved in a newborn process, and
>> also unlikely to be in use inside the postmaster when it first maps the
>> shared memory.  This is pretty trivial to do for a particular
>> version/configuration of Windows.  However, I see no future-proof
>> solution (besides making the shared segment position independent).  If
>> the hardcoded address is not available, you can always fall back on the
>> current behavior.
>
> Given the strong dedication to backward-compatibility in Windows, I would
> expect a way to bypass the new ASLR measures.  Some web searching suggests
> linking postgres.exe with "/highentropyva:no" and/or "/dynamicbase:no" might
> help, but nothing conclusive.  Thoughts?  That would be preferable to relying
> on experimentally-derived safe addresses, which could cease to be safe after a
> mere Windows update or similar.
>
>> There is a security problem with the fix I outline above.  It bypasses
>> ASLR to a limited degree, since the shared memory would likely end up
>> always living at the same address.  I am not certain that MapViewOfFile
>> even tries to be unpredictable, but let's assume it does or will be
>> someday.
>
> I wouldn't worry about it too much.  ASLR is a defense-in-depth measure; it
> comes into play when your software already has a flaw and potentially reduces
> the impact of that flaw.
>
>> I've attached a patch that implements the stuff above.  I can share the
>> code for the program that tests whether an address is reliably available
>> in a newborn postgres process, if anyone is interested.
>
> Great detective work.  Seeing that program could be helpful.

(reviving this old thread)

So, it turns out that it is still possible to hit this issue on at least
Win2k12 boxes (I received some complaints about that) even if
RandomizedBaseAddress is disabled in the build, a change made as a result
of the following thread:
http://www.postgresql.org/message-id/BD0D89EC2438455C9DE0DC94D36912F4@maumau

This has happened on a box that was surely running with more than 125
connections
(https://wiki.postgresql.org/wiki/Running_%26_Installing_PostgreSQL_On_Native_Windows#I_cannot_run_with_more_than_about_125_connections_at_once.2C_despite_having_capable_hardware),
and I am sure that it was not using more than 512MB of shared_buffers.

I am wondering if perhaps we could do better than what we have now
with retry logic in the fork loop, as it seems like a stopgap
to use a non-NULL lpBaseAddress in MapViewOfFileEx to make the address
selection more random, given that this base address selection would be
system-dependent.
Thoughts?
--
Michael
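
Re: windows 8 RTM compatibility issue (could not reserve shared memory region for child)

From
Noah Misch
Date:
On Wed, Jun 24, 2015 at 10:06:00AM +0900, Michael Paquier wrote: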

> > On Tue, Sep 04, 2012 at 11:45:47PM -0400, Dave Vitek wrote:
> >> LOG:  could not reserve shared memory region (addr=0000000001410000) for child
> >> 0000000000000F8C: 487
> >> LOG:  could not fork new process for connection: A blocking operation was
> >> interrupted by a call to WSACancelBlockingCall.

> So, it happens that it is still possible to hit this issue on at least
> Win2k12 boxes (received some complaints about that) even if
> RandomizedBaseAddress is disabled in build, as per a result of the
> following thread:
> http://www.postgresql.org/message-id/BD0D89EC2438455C9DE0DC94D36912F4@maumau

That report led to the RandomizedBaseAddress="FALSE" commit, so the report was
not based on such a build.  If you have received complaints definitively
involving a RandomizedBaseAddress="FALSE" build, that is novel evidence.  If
these complaints involved publicly-available binaries, which exact binaries
(download URL)?  If not, what do you know about how the binaries were built?

> I am wondering if Perhaps we could do better than what we have now
> with a retry logic in the thread fork loop as it seems like a stopover
> to use a non-NULL lpBaseAddress in MapViewOfFileEx to make the address
> selection more random as this base address selection would be
> system-dependent.

I don't understand exactly what you're proposing here.  Are you proposing to
retry backend creation after a child can't reattach to shared memory?  That is
better than a user-facing failure, but let's start with a diligent attempt to
root-cause the complaints you have received.

Re: Re: windows 8 RTM compatibility issue (could not reserve shared memory region for child)

From
Michael Paquier
Date:
On Wed, Jun 24, 2015 at 12:29 PM, Noah Misch <noah@leadboat.com> wrote:
> On Wed, Jun 24, 2015 at 10:06:00AM +0900, Michael Paquier wrote:
>
>> > On Tue, Sep 04, 2012 at 11:45:47PM -0400, Dave Vitek wrote:
>> >> LOG:  could not reserve shared memory region (addr=0000000001410000) for child
>> >> 0000000000000F8C: 487
>> >> LOG:  could not fork new process for connection: A blocking operation was
>> >> interrupted by a call to WSACancelBlockingCall.
>
>> So, it happens that it is still possible to hit this issue on at least
>> Win2k12 boxes (received some complaints about that) even if
>> RandomizedBaseAddress is disabled in build, as per a result of the
>> following thread:
>> http://www.postgresql.org/message-id/BD0D89EC2438455C9DE0DC94D36912F4@maumau
>
> That report led to the RandomizedBaseAddress="FALSE" commit, so the report was
> not based on such a build.  If you have received complaints definitively
> involving a RandomizedBaseAddress="FALSE" build, that is novel evidence.  If
> these complaints involved publicly-available binaries, which exact binaries
> (download URL)?  If not, what do you know about how the binaries were built?

They are not publicly available, but the build is done using the
community Perl scripts with Visual Studio 2008.  There is one slight
difference in the VC spec file: AdditionalOptions includes /DLL to tell
the linker to build DLLs.  I don't think that is related to the failure,
unless I am missing a crucial piece of information about Visual Studio.

By the way, the failure is very similar to the one in this thread and
the one reported by MauMau
(http://www.postgresql.org/message-id/BD0D89EC2438455C9DE0DC94D36912F4@maumau):

2015-06-23 13:00:24.989 PDT 55898d60.1388 0 LOG: could not reserve shared memory region (addr=00000000013C0000) for child 0000000000001868: error code 487
2015-06-23 13:00:24.989 PDT 55898d60.1388 0 LOG: could not fork autovacuum worker process: A blocking operation was interrupted by a call to WSACancelBlockingCall.

This happens periodically, every 10 to 20 minutes, most of the time
with autovacuum workers and sometimes with child backends, leading to
connection failures.  I am assuming that we have a higher chance of
hitting this failure with a high number of concurrent connections and a
high amount of memory used by the system.

>> I am wondering if Perhaps we could do better than what we have now
>> with a retry logic in the thread fork loop as it seems like a stopover
>> to use a non-NULL lpBaseAddress in MapViewOfFileEx to make the address
>> selection more random as this base address selection would be
>> system-dependent.
>
> I don't understand exactly what you're proposing here.  Are you proposing to
> retry backend creation after a child can't reattach to shared memory?

Yes, in the context of Windows, to alleviate the failure for
applications impacted by it, as disabling ASLR does not seem to be
enough in some contexts.
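
A bounded retry of this kind could look roughly like the Python sketch
below.  It only shows the control flow: `spawn_once` is a hypothetical
callable that creates a child and reports whether it managed to
reattach to shared memory, and the limit of three attempts is arbitrary.

```python
from typing import Callable


def spawn_with_retry(spawn_once: Callable[[], bool], max_attempts: int = 3) -> int:
    """Call spawn_once until it succeeds.

    Returns the number of attempts used, or 0 if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if spawn_once():
            return attempt
    return 0


# Fake spawner that fails twice (as if the address were taken) and then works.
outcomes = iter([False, False, True])
print(spawn_with_retry(lambda: next(outcomes)))
```

A retry only helps if the conflicting mapping is not at the same
address in every new child, which is part of what the root-cause
investigation would need to establish.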

> That is
> better than a user-facing failure, but let's start with a diligent attempt to
> root-cause the complaints you have received.

Sure.
--
Michael
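
Re: windows 8 RTM compatibility issue (could not reserve shared memory region for child)

From
Noah Misch
Date:
On Wed, Jun 24, 2015 at 04:03:53PM +0900, Michael Paquier wrote: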
> On Wed, Jun 24, 2015 at 12:29 PM, Noah Misch <noah@leadboat.com> wrote:
> > On Wed, Jun 24, 2015 at 10:06:00AM +0900, Michael Paquier wrote:
> >> > On Tue, Sep 04, 2012 at 11:45:47PM -0400, Dave Vitek wrote:
> >> >> LOG:  could not reserve shared memory region (addr=0000000001410000) for child
> >> >> 0000000000000F8C: 487
> >> >> LOG:  could not fork new process for connection: A blocking operation was
> >> >> interrupted by a call to WSACancelBlockingCall.
> >
> >> So, it happens that it is still possible to hit this issue on at least
> >> Win2k12 boxes (received some complaints about that) even if
> >> RandomizedBaseAddress is disabled in build, as per a result of the
> >> following thread:
> >> http://www.postgresql.org/message-id/BD0D89EC2438455C9DE0DC94D36912F4@maumau
> >
> > That report led to the RandomizedBaseAddress="FALSE" commit, so the report was
> > not based on such a build.  If you have received complaints definitively
> > involving a RandomizedBaseAddress="FALSE" build, that is novel evidence.  If
> > these complaints involved publicly-available binaries, which exact binaries
> > (download URL)?  If not, what do you know about how the binaries were built?
>
> They are not publicly available, but the build is done using the
> community perl scripts with Visual 2008, with a slight difference
> though in the VC spec file, AdditionalOptions includes /DLL to tell
> the linker to build DLLs, but I don't think that it is much related to
> the failure except if I am missing a crucial piece of information
> regarding Visual.

Here are some of the things I would try, then:

- Reproduce it with postgresql.org official binaries.
- Run "dumpbin /headers" on postgres.exe and every dll it links at startup,
  including dependencies.  Compare to the corresponding data from
  postgresql.org official binaries.
- Rebuild with VS2013 and see if the problem persists.
- Identify the allocation(s) overlapping the region needed for shared memory.
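
For the "dumpbin /headers" comparison, a small Python helper could pull
out the fields most relevant to ASLR.  The sample text below is
abbreviated and hypothetical, and the field labels ("image base",
"Dynamic base", "High Entropy Virtual Addresses") are assumed to match
what dumpbin prints; adjust the patterns if the real output differs.

```python
import re

# Abbreviated, hypothetical excerpts in the general shape of dumpbin output.
OURS = """\
       140000000 image base (0000000140000000 to 00000001405F5FFF)
            8160 DLL characteristics
                   High Entropy Virtual Addresses
                   Dynamic base
                   NX compatible
                   Terminal Server Aware
"""
OFFICIAL = """\
       140000000 image base (0000000140000000 to 00000001405F5FFF)
            8100 DLL characteristics
                   NX compatible
                   Terminal Server Aware
"""


def summarize_headers(dumpbin_text: str) -> dict:
    """Extract the image base and ASLR-related flags from dumpbin-style text."""
    m = re.search(r"^\s*([0-9A-Fa-f]+) image base", dumpbin_text, re.MULTILINE)
    return {
        "image_base": int(m.group(1), 16) if m else None,
        "dynamic_base": "Dynamic base" in dumpbin_text,
        "high_entropy_va": "High Entropy Virtual Addresses" in dumpbin_text,
    }


def diff_headers(ours: dict, official: dict) -> dict:
    """Return only the fields that differ, as (ours, official) pairs."""
    return {k: (ours[k], official[k]) for k in ours if ours[k] != official[k]}


print(diff_headers(summarize_headers(OURS), summarize_headers(OFFICIAL)))
```

Running this over postgres.exe and each DLL it loads, for both builds,
would narrow down which binaries were linked differently.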

Re: Re: windows 8 RTM compatibility issue (could not reserve shared memory region for child)

From
Michael Paquier
Date:
On Thu, Jun 25, 2015 at 12:22 PM, Noah Misch <noah@leadboat.com> wrote:
> On Wed, Jun 24, 2015 at 04:03:53PM +0900, Michael Paquier wrote:
>> On Wed, Jun 24, 2015 at 12:29 PM, Noah Misch <noah@leadboat.com> wrote:
>> > On Wed, Jun 24, 2015 at 10:06:00AM +0900, Michael Paquier wrote:
>> >> > On Tue, Sep 04, 2012 at 11:45:47PM -0400, Dave Vitek wrote:
>> >> >> LOG:  could not reserve shared memory region (addr=0000000001410000) for child
>> >> >> 0000000000000F8C: 487
>> >> >> LOG:  could not fork new process for connection: A blocking operation was
>> >> >> interrupted by a call to WSACancelBlockingCall.
>> >
>> >> So, it happens that it is still possible to hit this issue on at least
>> >> Win2k12 boxes (received some complaints about that) even if
>> >> RandomizedBaseAddress is disabled in build, as per a result of the
>> >> following thread:
>> >> http://www.postgresql.org/message-id/BD0D89EC2438455C9DE0DC94D36912F4@maumau
>> >
>> > That report led to the RandomizedBaseAddress="FALSE" commit, so the report was
>> > not based on such a build.  If you have received complaints definitively
>> > involving a RandomizedBaseAddress="FALSE" build, that is novel evidence.  If
>> > these complaints involved publicly-available binaries, which exact binaries
>> > (download URL)?  If not, what do you know about how the binaries were built?
>>
>> They are not publicly available, but the build is done using the
>> community perl scripts with Visual 2008, with a slight difference
>> though in the VC spec file, AdditionalOptions includes /DLL to tell
>> the linker to build DLLs, but I don't think that it is much related to
>> the failure except if I am missing a crucial piece of information
>> regarding Visual.
>
> Here are some of the things I would try, then:
>
> - Reproduce it with postgresql.org official binaries.
> - Run "dumpbin /headers" on postgres.exe and every dll it links at startup,
>   including dependencies.  Compare to the corresponding data from
>   postgresql.org official binaries.
> - Rebuild with VS2013 and see if the problem persists.
> - Identify the allocation(s) overlapping the region needed for shared memory.

I guess that is the way to do it, then.  I will try with a build
compiled directly from the community sources as well as with those
builds, and see whether it is reproducible at least once.  It is possible
that the DLLs loaded by postgres.exe have an effect on it, but
let's see with a configuration under memory pressure...
--
Michael