Re: dynamic shared memory

From Noah Misch
Subject Re: dynamic shared memory
Date
Msg-id 20130901132400.GA100090@tornado.leadboat.com
In response to Re: dynamic shared memory  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: dynamic shared memory  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Sat, Aug 31, 2013 at 08:27:14AM -0400, Robert Haas wrote:
> On Fri, Aug 30, 2013 at 11:45 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> I share your opinion that preferred_address is never going to be
> >> reliable, although FWIW Noah thinks it can be made reliable with a
> >> large-enough hammer.
> >
> > I think we need to have the arguments for that on list then. Those are
> > pretty damn fundamental design decisions.

I somewhat disfavor having a vague "preferred_address" parameter.  mmap()'s
first argument is specified that way, but mmap()'s specification caters to an
open-ended range of implementations and clients.  A PostgreSQL backend
interface can be more rigid.  If we choose to support fixed-address callers,
let those receive either the requested address or an ereport(ERROR).  If the
caller does not care, make no effort to provide a consistent address.  (Better
still, under --enable-cassert, try to force the address to differ across
processes.)
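
As a rough illustration of the "requested address or error" contract, here is a
minimal sketch.  The function name map_at_exactly() is hypothetical, an anonymous
shared mapping stands in for a real dynamic segment, and returning NULL stands
in for ereport(ERROR).  It passes the address as a plain hint and rejects any
other placement rather than silently accepting it:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not a PostgreSQL API.  If "want" is NULL the caller
 * does not care about placement and gets whatever the kernel picks.
 * Otherwise the caller gets exactly "want" or a failure (NULL), never a
 * different address.
 */
static void *
map_at_exactly(void *want, size_t len)
{
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (got == MAP_FAILED)
        return NULL;
    if (want != NULL && got != want)
    {
        /* The kernel ignored the hint; undo the mapping and fail. */
        munmap(got, len);
        errno = EEXIST;
        return NULL;
    }
    return got;
}
```

Without MAP_FIXED the kernel never replaces an existing mapping, so a request
for an occupied range fails cleanly instead of clobbering whatever lives there.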

[quotations reordered]
> >> But even if it isn't reliable, there doesn't seem to be all that much
> >> value in forbidding access to that part of the OS-provided API.  In

That's also valid, though.  Even if no core code exploits the flexibility,
3rd-party code might do so.

> >> the world where it's not reliable, it may still be convenient to map
> >> things at the same address when you can, so that pointers can be
> >> used.  Of course you'd have to have some fallback strategy for when
> >> you don't get the same mapping, and maybe that's painful enough that
> >> there's no point after all.  Or maybe it's worth having one code path
> >> for relativized pointers and another for non-relativized pointers.
> >
> > It seems likely to me that will end up with untested code in that
> > case. Or even unsupported platforms.

I agree.  It would take an exceptional use case to justify such parallel code
paths; I don't expect that to ever happen for core code.

> > I for one cannot see how you even remotely could make that work a) on
> > windows (check the troubles we have to go through to get s_b
> > consistently placed, and that's directly after startup) b) 32bit systems.
> 
> Noah?

The difficulty depends on whether processes other than the segment's creator
will attach anytime or only as they start.  Attachment at startup is enough
for parallel query, but it's not enough for something like lock table
expansion.  I'll focus on the attach-anytime case since it's more general.

On a system supporting MAP_FIXED, implement this by having the postmaster
reserve address space under a PROT_NONE mapping, then carving out from that
mapping for each fixed-address dynamic segment.  The size of the reservation
would be controlled by a GUC; one might set it to several times anticipated
peak usage.  (The overhead of doing that depends on the kernel.)  Windows
permits the same technique with its own primitives.
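
A minimal single-process sketch of that reserve-then-carve technique follows.
All names here (AddrReservation, reserve_address_space, carve_segment,
release_segment) are hypothetical, and an anonymous shared mapping stands in for
a real segment; in the scheme described above, the postmaster would make the
reservation before forking, and each backend would map the segment's backing
object at the same offset.  Offsets and lengths are assumed to be page-aligned.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for the reserved range; not a PostgreSQL type. */
typedef struct AddrReservation
{
    char   *base;   /* start of the PROT_NONE reservation */
    size_t  size;   /* total bytes of address space reserved */
} AddrReservation;

/* Reserve address space without making it accessible.  Returns 0 on success. */
static int
reserve_address_space(AddrReservation *r, size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    r->base = p;
    r->size = size;
    return 0;
}

/*
 * Carve a usable segment out of the reservation at a fixed offset.
 * MAP_FIXED is safe here only because the target range lies entirely
 * inside address space this process reserved for the purpose.
 */
static void *
carve_segment(AddrReservation *r, size_t offset, size_t len)
{
    void *p;

    if (offset > r->size || len > r->size - offset)
        return NULL;            /* outside the reservation */
    p = mmap(r->base + offset, len, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Return a carved range to the reservation by mapping PROT_NONE over it. */
static int
release_segment(AddrReservation *r, size_t offset, size_t len)
{
    void *p;

    if (offset > r->size || len > r->size - offset)
        return -1;
    p = mmap(r->base + offset, len, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return (p == MAP_FAILED) ? -1 : 0;
}
```

Releasing by re-mapping PROT_NONE, rather than calling munmap(), keeps the
range reserved so an unrelated mapping cannot land in it later.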

A system where mmap() accepts only a zero address in practice (HP-UX,
according to Gnulib, although HP docs suggest it has improved over time)
requires a different technique.  For those systems, expand the regular shared
memory segment and carve from that to make "dynamic" segments.  This amounts
to adding ShmemFree() to supplement ShmemAlloc().  If a core platform had to
use this implementation, its disadvantages would be sufficient to discard the
whole idea of reliable fixed addresses.  But I find it acceptable if it's a
crutch for older kernels, rare hardware, etc.
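
To make the "carve from the regular segment" fallback concrete, here is a toy
first-fit allocator with an address-ordered free list over a fixed arena.  It is
only a sketch: the arena is a static buffer rather than the real shared memory
segment, there is no locking, and arena_init/arena_alloc/arena_free are
hypothetical names that merely play the ShmemAlloc()/ShmemFree() roles.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 4096
#define NO_CHUNK   ((size_t) -1)

/* Header placed in front of every chunk, free or allocated. */
typedef struct Chunk
{
    size_t size;    /* total bytes in the chunk, header included */
    size_t next;    /* offset of next free chunk; meaningful only while free */
} Chunk;

/* Stand-in for the preallocated region; offsets are used instead of pointers. */
static union { Chunk first; char bytes[ARENA_SIZE]; } arena;
static size_t free_head = NO_CHUNK;

#define CHUNK_AT(off) ((Chunk *) (arena.bytes + (off)))

static void
arena_init(void)
{
    free_head = 0;
    CHUNK_AT(0)->size = ARENA_SIZE;
    CHUNK_AT(0)->next = NO_CHUNK;
}

/* First-fit allocation; splits a free chunk when the remainder is usable. */
static void *
arena_alloc(size_t n)
{
    size_t  unit = sizeof(Chunk);
    size_t  need;
    size_t *link;

    if (n == 0 || n > ARENA_SIZE)
        return NULL;
    need = (n + sizeof(Chunk) + unit - 1) / unit * unit;

    for (link = &free_head; *link != NO_CHUNK; link = &CHUNK_AT(*link)->next)
    {
        size_t off = *link;
        Chunk *c = CHUNK_AT(off);

        if (c->size < need)
            continue;
        if (c->size - need >= 2 * sizeof(Chunk))
        {
            Chunk *rest = CHUNK_AT(off + need);

            rest->size = c->size - need;
            rest->next = c->next;
            c->size = need;
            *link = off + need;     /* remainder takes the chunk's list slot */
        }
        else
            *link = c->next;        /* hand out the whole chunk */
        return arena.bytes + off + sizeof(Chunk);
    }
    return NULL;
}

/* Reinsert in address order, merging with adjacent free chunks. */
static void
arena_free(void *p)
{
    size_t off = (size_t) ((char *) p - arena.bytes) - sizeof(Chunk);
    Chunk *c = CHUNK_AT(off);
    size_t prev = NO_CHUNK;
    size_t cur = free_head;

    while (cur != NO_CHUNK && cur < off)
    {
        prev = cur;
        cur = CHUNK_AT(cur)->next;
    }

    if (cur != NO_CHUNK && off + c->size == cur)
    {
        c->size += CHUNK_AT(cur)->size;     /* merge with following chunk */
        c->next = CHUNK_AT(cur)->next;
    }
    else
        c->next = cur;

    if (prev == NO_CHUNK)
        free_head = off;
    else if (prev + CHUNK_AT(prev)->size == off)
    {
        CHUNK_AT(prev)->size += c->size;    /* merge with preceding chunk */
        CHUNK_AT(prev)->next = c->next;
    }
    else
        CHUNK_AT(prev)->next = off;
}
```

Because the arena has a fixed size chosen up front, fragmentation and the hard
ceiling on total "dynamic" memory are the obvious costs of this approach.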

I don't foresee fundamental differences on 32-bit.  All the allocation
maximums scale down, but that's the usual story for 32-bit.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com


