Re: POSIX shared memory redux

From: A.M.
Subject: Re: POSIX shared memory redux
Msg-id: 69D09C0C-B16A-4BE6-B360-CFB72ABB83EA@themactionfaction.com
In reply to: Re: POSIX shared memory redux (Florian Weimer <fweimer@bfk.de>)
List: pgsql-hackers

On Apr 14, 2011, at 8:22 AM, Florian Weimer wrote:

> * Tom Lane:
>
>> Well, the fundamental point is that "ignoring NFS" is not the real
>> world.  We can't tell people not to put data directories on NFS,
>> and even if we did tell them not to, they'd still do it.  And NFS
>> locking is not trustworthy, because the remote lock daemon can crash
>> and restart (forgetting everything it ever knew) while your own machine
>> and the postmaster remain blissfully awake.
>
> Is this still the case with NFSv4?  Does the local daemon still keep
> the lock state?

The lock handling has been fixed in NFSv4.

http://nfs.sourceforge.net/
"NFS Version 4 introduces support for byte-range locking and share reservation. Locking in NFS Version 4 is
lease-based, so an NFS Version 4 client must maintain contact with an NFS Version 4 server to continue extending its
open and lock leases."

http://linux.die.net/man/2/flock
"flock(2) does not lock files over NFS. Use fcntl(2) instead: that does work over NFS, given a sufficiently recent
version of Linux and a server which supports locking."

I would need more time to dig up exactly what "sufficiently recent version of Linux" means, but NFSv4 is likely required.
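
For illustration, here is a minimal sketch of taking an fcntl(2)-style lock rather than an flock(2) one. It is not code from the patch: the function name try_lock and the lock-file path are hypothetical, and Python's fcntl.lockf() is used only because it wraps the same fcntl(2) record locks. Whether the lock is actually honored over NFS still depends on the client and server, as quoted above.

```python
import errno
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive fcntl(2) record lock on `path` without blocking.

    Returns the open file descriptor on success, or None if another
    process already holds a conflicting lock. (Hypothetical helper.)
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # lockf() issues fcntl(F_SETLK) here because of LOCK_NB;
        # unlike flock(), these locks can work over NFS.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # lock is held by another process
        raise  # anything else (e.g. ENOLCK) is a real error
    # The lock lasts until this process exits or closes ANY descriptor
    # that refers to the same file, so the caller must keep fd open.
    return fd
```

A caller would keep the returned descriptor open for as long as it wants to hold the lock. The lock belongs to the process, not to the descriptor, which is why closing any descriptor for the same file drops it.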

>
>> None of this is to say that an fcntl lock might not be a useful addition
>> to what we do already.  It is to say that fcntl can't just replace what
>> we do already, because there are real-world failure cases that the
>> current solution handles and fcntl alone wouldn't.
>
> If it requires NFS misbehavior (possibly in an older version), and you
> have to start postmasters on separate nodes (which you normally
> wouldn't do), doesn't this make it increasingly unlikely that it's
> going to be triggered in the wild?

With the patch I offer, it would be possible to use shared storage and fail over PostgreSQL nodes on different machines
over NFS. (The second postmaster blocks and waits for the lock to be released.) Obviously, such a setup isn't as
strong as using replication, but given a sufficiently fail-safe shared storage setup, it could be made reliable.
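
The "blocks and waits" behavior can be sketched as follows. This is only an illustration of the idea under stated assumptions, not the patch itself: run_as_primary, lock_path and serve are hypothetical names, and the sketch assumes the shared storage implements fcntl(2) locks correctly.

```python
import fcntl
import os


def run_as_primary(lock_path, serve):
    """Block until the exclusive lock on `lock_path` is free, then run serve().

    A standby calling this sleeps inside lockf() for as long as another
    process holds the lock; the kernel drops that lock when the holder
    exits or closes the file, and only then does serve() start here.
    (Hypothetical helper; `serve` stands in for the server's main loop.)
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Without LOCK_NB this is fcntl(F_SETLKW): wait for the lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        return serve()
    finally:
        # Closing the descriptor releases the lock for the next waiter.
        os.close(fd)
```

With a lease-based protocol such as NFSv4, a holder that dies or loses contact with the server should eventually lose its lock, at which point the waiting node takes over. How quickly that happens depends on the server's lease handling.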

Cheers,
M



