Re: POSIX shared memory redux

From: A.M.
Subject: Re: POSIX shared memory redux
Msg-id: D9EDACF7-53F1-4355-84F8-2E74CD19D22D@themactionfaction.com
In reply to: Re: POSIX shared memory redux (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: POSIX shared memory redux (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

Hello,

Based on feedback from Tom Lane and Robert Haas, I have amended the POSIX shared memory patch to account for
multiple-postmaster start race conditions (the handling of which is currently based on SysV shared memory checks).

https://github.com/agentm/postgres/tree/posix_shmem


To ensure that no two postmasters can start up in the same data directory, I use fcntl range locking on the data
directory lock file, which also works properly on (properly configured) NFS volumes. Whenever a postmaster or postmaster
child starts, it acquires a read (non-exclusive) lock on the data directory's lock file. When a new postmaster starts,
it queries whether anything would block a write (exclusive) lock on the lock file; the query returns a lock-holding PID
when other postgresql processes are running.

Because POSIX fcntl locking is per-process and released in the kernel when a process ends, as long as there is a single
process running and holding a read lock, no new postmaster can be started for that data directory. Furthermore, the
fcntl syscall allows us to get a live PID for a conflicting lock-holding process, so postgresql startup can print a
live PID when a startup conflict occurs. The contents of the data directory lock file remain the same; however, the PID
stored in the lock file becomes less vital.

The cost of this change is one additional file descriptor open in each postgresql process (for the full life of the
process).

As a gimmick, I also implemented a process-failover feature based on the F_SETLKW fcntl command, which allows a new
postmaster to start up immediately if the running postmaster and all its children exit for any reason. This may also be
useful to queue postmaster startup, which could be controlled by a secondary non-postgresql process, pending some action
(such as in a failover scenario). This feature is controlled via "postgres -b" (for "blocking"), but it is not vital to
the shared memory implementation.

Note that this implementation of the fcntl locking is effectively independent of the shared memory interface, i.e. this
same locking could be used with the existing SysV shared memory scheme.

Is this approach good enough to push to the next CommitFest? I am happy to amend the patch as necessary. Thanks!

Cheers,
M