Re: POSIX shared memory redux

From Robert Haas
Subject Re: POSIX shared memory redux
Date
Msg-id BANLkTi=X-q8e3tSJV+Jft-_hKuVpq1T+bA@mail.gmail.com
In reply to Re: POSIX shared memory redux  (A.M. <agentm@themactionfaction.com>)
Responses Re: POSIX shared memory redux  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: POSIX shared memory redux  ("A.M." <agentm@themactionfaction.com>)
List pgsql-hackers
On Wed, Apr 13, 2011 at 7:20 AM, A.M. <agentm@themactionfaction.com> wrote:
> The goal of this patch is to eliminate SysV shared memory, not to implement NFS-capable locking which, as you point
> out, is virtually impossible.
>
> As far as I can tell, in the worst case, my patch does not change how postgresql handles the NFS case. SysV shared
> memory won't work across NFS, so that interlock won't catch, so postgresql is left with looking at a lock file with PID
> of process on another machine, so that won't catch either. This patch does not alter the lock file semantics, but merely
> augments the file with file locking.
>
> At least with this patch, there is a chance the lock might work across NFS. In the best case, it can allow for
> shared-storage postgresql failover, which is a new feature.
>
> Furthermore, there is an improvement in shared memory handling in that it is unlinked immediately after creation, so
> only the postmaster and its children have access to it (through file descriptor inheritance). This means shared memory
> cannot be stomped on by any other process.
>
> Considering that possibly working NFS locking is a side-effect of this patch and not its goal and, in the worst
> possible scenario, it doesn't change current behavior, I don't see how this can be a ding against this patch.
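
For readers following along, the unlink-after-create idea in the quoted text can be sketched roughly as below. This is only an illustration, not code from the patch: the function name unlinked_shm_roundtrip and the segment name passed to it are hypothetical, and the child process stands in for a backend forked by the postmaster. On older glibc versions, shm_open() may require linking with -lrt.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a POSIX shared memory segment, unlink its
 * name right away, then check that a forked child can still write to it.
 * Returns the value the child stored (42), or -1 on error.
 * *reopen_failed is set to 1 if opening the segment by name fails after
 * the unlink, which is what we expect.
 */
int unlinked_shm_roundtrip(const char *name, int *reopen_failed)
{
    assert(name != NULL && name[0] == '/');

    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    /* Remove the name immediately; the open descriptor keeps the segment alive. */
    if (shm_unlink(name) != 0) { close(fd); return -1; }

    if (ftruncate(fd, sizeof(int)) != 0) { close(fd); return -1; }

    int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
    if (shared == MAP_FAILED) { close(fd); return -1; }
    *shared = 0;

    /* A process that only knows the name can no longer open the segment. */
    int probe = shm_open(name, O_RDWR, 0);
    *reopen_failed = (probe < 0);
    if (probe >= 0)
        close(probe);

    pid_t pid = fork();
    if (pid < 0) { munmap(shared, sizeof(int)); close(fd); return -1; }
    if (pid == 0) {
        /* Child: the mapping (and fd) were inherited across fork(). */
        *shared = 42;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) { munmap(shared, sizeof(int)); close(fd); return -1; }

    int value = *shared;
    munmap(shared, sizeof(int));
    close(fd);
    return value;
}
```

Calling unlinked_shm_roundtrip("/some_name", &flag) from a small main() should return the value written by the child, with flag set to show that the name was already gone when the second shm_open() was attempted.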

I don't see why we need to get rid of SysV shared memory; needing less
of it seems just as good.

In answer to your off-list question, one of the principal ways I've
seen fcntl() locking fall over and die is when someone removes the
lock file.  You might think that this could be avoided by picking
something important like pg_control as the lock file, but it turns out
that doesn't really work:

http://0pointer.de/blog/projects/locking.html
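
A minimal sketch of the removed-lock-file failure, assuming a local POSIX filesystem: the first process locks a file, the path is unlinked, and a second process that opens the same path gets a brand-new file and locks it without complaint. The function names and the path argument are made up for illustration; this is not PostgreSQL code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to take an exclusive, non-blocking fcntl() lock on the whole file. */
static int try_write_lock(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Hypothetical demo: lock "path" in this process, optionally unlink it,
 * then let a child process (standing in for a second postmaster) try to
 * lock the same path.
 * Returns 1 if the second locker succeeded, 0 if it was refused,
 * -1 on a setup error.
 */
int second_locker_succeeds(const char *path, int remove_between)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (try_write_lock(fd) != 0) { close(fd); return -1; }

    /* Someone "helpfully" removes the lock file while we still hold the lock. */
    if (remove_between && unlink(path) != 0) { close(fd); return -1; }

    pid_t pid = fork();
    if (pid < 0) { close(fd); return -1; }
    if (pid == 0) {
        /* fcntl() locks are per-process and are not inherited across fork(). */
        int fd2 = open(path, O_RDWR | O_CREAT, 0600);
        if (fd2 < 0)
            _exit(2);
        _exit(try_write_lock(fd2) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) { close(fd); return -1; }

    close(fd);
    unlink(path);               /* clean up whichever file now has this name */

    if (!WIFEXITED(status) || WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

With remove_between set to 0 the child is refused, as intended. With remove_between set to 1 the child locks a different inode under the same name, so both processes believe they hold the lock.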

Tom's point is valid too.  Many storage appliances present themselves
as an NFS server, so it's very plausible for the data directory to be
on an NFS server, and there's no guarantee that flock() won't be
broken there.  If our current interlock were also known to be
unreliable, maybe we wouldn't care very much, but AFAICT it's been
extremely robust.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

