Discussion: Proposal to add a QNX 6.5 port to PostgreSQL

Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

25 July 2014, 20:17:05

I propose that a QNX 6.5 port be introduced to PostgreSQL.

I am new to PostgreSQL development, so please bear with me.

I have made good progress (with 1 outstanding issue, details below):

- I created a QNX 6.5 port of PostgreSQL 9.3.4 which passes regression tests.
- I merged my changes into 9.4beta2, and with a few minor changes, it passes regression tests.
- QNX support states that QNX 6.5 SP1 binaries run on QNX 6.6 without modification, which I confirmed with a few quick tests.

Summary of changes required for PostgreSQL 9.3.4 on QNX 6.5:

- Typical changes required for any new port (template, configure.in, dynloader, etc.)
- QNX lacks System V shared memory: I created “src/backend/port/posix_shmem.c” which replaces System V calls (shmget, shmat, shmdt, …) with POSIX calls (shm_open, mmap, munmap, shm_unlink)
- QNX lacks sigaction SA_RESTART: I modified “src/include/port.h” to define macros to retry system calls upon EINTR (open, read, write, …) when compiled on QNX
- A few files required addition of #include <sys/select.h> on QNX (for fd_set).

Additional changes required for PostgreSQL 9.4beta2 on QNX 6.5:

- “DSM” changes introduced in 9.4 (R. Haas) required that I make minor updates to my new “posix_shmem.c” code.
- src\include\replication\logical.h: struct LogicalDecodingContext field “write” interferes with my “write” retry macro.  Renaming field “write” to “do_write” solved this problem.

Outstanding Issue #1:

    src/backend/commands/dbcommands.c :: createdb() complains when copying template1 to template0 (apparently a locale issue)

        “FATAL: 22023: new LC_CTYPE (C;collate:POSIX;ctype:POSIX) is incompatible with the LC_CTYPE of the template database (POSIX;messages:C)”

    I would appreciate help from an experienced PostgreSQL hacker to address this.
    I have temporarily disabled this check on QNX (I can live with the assumption/limitation that template0 and template1 contain strictly ASCII).

I can work toward setting up a build farm member should this proposal be accepted.
Your feedback and guidance on next steps is appreciated.

Thank you.

Keith Baker

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

25 July 2014, 22:30:02

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:
> I propose that a QNX 6.5 port be introduced to PostgreSQL.

Hmm ... you're aware that there used to be a QNX port?  We removed it
back in 2006 for lack of interest and maintainers, and AFAIR you're
the first person to show any interest in reintroducing it since then.

I'm a bit concerned about reintroducing something that seems to have so
little usage, especially if the port is going to be as invasive as you
suggest:

> * QNX lacks System V shared memory: I created "src/backend/port/posix_shmem.c" which replaces
> System V calls (shmget, shmat, shmdt, ...) with POSIX calls (shm_open, mmap, munmap, shm_unlink)

This isn't really acceptable for production usage; if it were, we'd have
done it already.  The POSIX APIs lack any way to tell how many processes
are attached to a shmem segment, which is *necessary* functionality for
us (it's a critical part of the interlock against starting multiple
postmasters in one data directory).
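
For readers less familiar with the System V interface, the attach count referred to here is the shm_nattch field returned by shmctl(IPC_STAT). The following is only a minimal sketch of reading it, not PostgreSQL's actual interlock code; the helper name attached_count is hypothetical.

```c
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: return the number of processes currently attached
 * to the System V segment shmid, or -1 if shmctl() fails.
 */
long
attached_count(int shmid)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) < 0)
        return -1;
    return (long) ds.shm_nattch;
}
```

A nonzero count for a segment left over from an earlier postmaster means some process from that earlier run is still attached. The shm_open/mmap interface has no corresponding field to query.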

> * QNX lacks sigaction SA_RESTART: I modified "src/include/port.h" to define macros to retry system calls upon
> EINTR (open, read, write, ...) when compiled on QNX

That's pretty scary too.  For one thing, such macros would affect every
call site whether it's running with SA_RESTART or not.  Do you really
need it?  It looks to me like we just turn off HAVE_POSIX_SIGNALS if
you don't have SA_RESTART.  Maybe that code has bit-rotted by now, but
it did work at one time.
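
To make the EINTR discussion concrete, here is a minimal sketch of what retrying an interrupted call looks like. It is written as an ordinary function rather than a macro named after the system call, which is an assumption of this sketch and not what the proposed patch does; the name read_retry is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: repeat read() for as long as it fails because a
 * signal interrupted it (errno == EINTR).  Any other result is returned
 * to the caller unchanged.
 */
ssize_t
read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do
    {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    return n;
}
```

Because the wrapper has its own name, it only affects call sites that opt in, and it cannot collide with identifiers such as the "write" field of struct LogicalDecodingContext mentioned in the proposal.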
        regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Merlin Moncure

Date:

28 July 2014, 16:19:55

On Fri, Jul 25, 2014 at 3:16 PM, Baker, Keith [OCDUS Non-J&J]
<KBaker9@its.jnj.com> wrote:
> I propose that a QNX 6.5 port be introduced to PostgreSQL.
>
> I am new to PostgreSQL development, so please bear with me.
>
>
>
> I have made good progress (with 1 outstanding issue, details below):
>
> ·         I created a QNX 6.5 port of PostgreSQL 9.3.4 which passes
> regression tests.
>
> ·         I merged my changes into 9.4beta2, and with a few minor changes,
> it passes regression tests.
>
> ·         QNX support states that QNX 6.5 SP1 binaries run on QNX 6.6
> without modification, which I confirmed with a few quick tests.
>
>
>
> Summary of changes required for PostgreSQL 9.3.4 on QNX 6.5:
>
> ·         Typical changes required for any new port (template, configure.in,
> dynloader, etc.)
>
> ·         QNX lacks System V shared memory: I created
> “src/backend/port/posix_shmem.c” which replaces System V calls (shmget,
> shmat, shmdt, …) with POSIX calls (shm_open, mmap, munmap, shm_unlink)
>
> ·         QNX lacks sigaction SA_RESTART: I modified “src/include/port.h” to
> define macros to retry system calls upon EINTR (open,read,write,…) when
> compiled on QNX
>
> ·         A few files required addition of #include <sys/select.h> on QNX
> (for fd_set).
>
>
>
> Additional changes required for PostgreSQL 9.4beta2 on QNX 6.5:
>
> ·         “DSM” changes introduced in 9.4 (R. Haas) required that I make
> minor updates to my new “posix_shmem.c” code.
>
> ·         src\include\replication\logical.h: struct LogicalDecodingContext
> field “write” interferes with my “write” retry macro.  Renaming field
> “write” to “do_write” solved this problem.
>
>
>
> Outstanding Issue #1:
>
> src/backend/commands/dbcommands.c :: createdb() complains when copying
> template1 to template0 (apparently a locale issue)
>
> “FATAL:  22023: new LC_CTYPE (C;collate:POSIX;ctype:POSIX) is incompatible
> with the LC_CTYPE of the template database (POSIX;messages:C)”
>
> I would appreciate help from an experienced PostgreSQL hacker to address
> this.
>
> I have temporarily disabled this check on QNX (I can live with the
> assumption/limitation that template0 and template1 contain strictly ASCII).
>
> I can work toward setting up a build farm member should this proposal be
> accepted.

Maybe step #1 is to get a buildfarm member set up.  Is there any
policy against unsupported environments in the buildfarm? (I hope not)

You're going to have to run it against a git repository containing
your custom patches.  It's a long and uncertain road to getting a new
port (re-) accepted, but demonstrated commitment to support is a
necessary first step. It will also advertise support for the platform.

merlin

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

28 July 2014, 16:22:25

On 2014-07-28 11:19:48 -0500, Merlin Moncure wrote:
> Maybe step #1 is to get a buildfarm member set up.  Is there any
> policy against unsupported environments in the buildfarm? (I hope not)
> 
> You're going to have to run it against a git repository containing
> your custom patches.  It's a long and uncertain road to getting a new
> port (re-) accepted, but demonstrated commitment to support is a
> necessary first step. It will also advertise support for the platform.

I don't think a buildfarm animal that doesn't run the actual upstream
code is a good idea. That'll make it a lot harder to understand what's
going on when something breaks after a commit.  It'd also require the
custom patches being rebased ontop of $branch before every run...

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Merlin Moncure

Date:

28 July 2014, 16:42:04

On Mon, Jul 28, 2014 at 11:22 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-07-28 11:19:48 -0500, Merlin Moncure wrote:
>> Maybe step #1 is to get a buildfarm member set up.  Is there any
>> policy against unsupported environments in the buildfarm? (I hope not)
>>
>> You're going to have to run it against a git repository containing
>> your custom patches.  It's a long and uncertain road to getting a new
>> port (re-) accepted, but demonstrated commitment to support is a
>> necessary first step. It will also advertise support for the platform.
>
> I don't think a buildfarm animal that doesn't run the actual upstream
> code is a good idea. That'll make it a lot harder to understand what's
> going on when something breaks after a commit.  It'd also require the
> custom patches being rebased ontop of $branch before every run...

hm. oh well.  maybe if there was a separate page for custom builds
(basically, an unsupported section).

merlin

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Peter Geoghegan

Date:

29 July 2014, 00:13:35

On Mon, Jul 28, 2014 at 9:41 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> I don't think a buildfarm animal that doesn't run the actual upstream
>> code is a good idea. That'll make it a lot harder to understand what's
>> going on when something breaks after a commit.  It'd also require the
>> custom patches being rebased ontop of $branch before every run...
>
> hm. oh well.  maybe if there was a separate page for custom builds
> (basically, an unsupported section).

I think that's a bad idea. The QNX OS seems to be mostly used in
safety-critical systems; it has a microkernel design. I think it would
be particularly bad to have iffy support for something like that.


-- 
Peter Geoghegan

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Robert Haas

Date:

29 July 2014, 20:18:00

On Fri, Jul 25, 2014 at 6:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> * QNX lacks System V shared memory: I created "src/backend/port/posix_shmem.c" which replaces
>> System V calls (shmget, shmat, shmdt, ...) with POSIX calls (shm_open, mmap, munmap, shm_unlink)
>
> This isn't really acceptable for production usage; if it were, we'd have
> done it already. The POSIX APIs lack any way to tell how many processes
> are attached to a shmem segment, which is *necessary* functionality for
> us (it's a critical part of the interlock against starting multiple
> postmasters in one data directory).

I think it would be good to spend some energy figuring out what to do
about this. The Linux developers, for reasons I have not been able to
understand, appear to hate System V shared memory, and rumors have
circulated here that they would like to get rid of it altogether. And
quite apart from that, even using a few bytes of System V shared
memory is apparently inconvenient for people who run many copies of
PostgreSQL on the same machine or who run in environments where it's
not available, such as FreeBSD jails for which it hasn't been
specifically enabled.[1]

Now, in fairness, all of the alternative systems have their own share
of problems. POSIX shared memory isn't available everywhere, and the
anonymous mmap we're now using doesn't work in EXEC_BACKEND builds,
can't be used for dynamic shared memory, and apparently performs
poorly on BSD systems.[1] In spite of that, I think that having an
option to use POSIX shared memory would make a reasonable number of
PostgreSQL users happier than they are today; and maybe even attract a
few new ones.

In our last discussion on this topic, we talked about using file locks
as a substitute for nattch. You concluded that fcntl was totally
broken for this purpose because of the possibility of some other piece
of code accidentally opening and closing the lock file.[2] lockf
appears to have the same problem, but flock might not, at least on
some systems. The semantics as described in my copy of the Linux man
pages are that a child created by fork() inherits a copy of the
filehandle pointing to the same lock, and that the lock is released
when either ANY process with a copy of that filehandle makes an
explicit unlock request or ALL copies of the filehandle are closed.
That seems like it'd be OK for our purposes, though the Linux guys
seem to think the semantics might be different on other platforms, and
note that it won't work over NFS.
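
As a rough illustration of the flock idea, the sketch below tries to take a non-blocking exclusive lock on a lock file. It assumes the Linux semantics described above (the lock belongs to the open file description, so a descriptor inherited across fork() keeps it held); the helper name try_exclusive_flock is hypothetical and this is not existing PostgreSQL code.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: open (creating if needed) a lock file and try to
 * take an exclusive flock() on it without blocking.  Returns the open
 * descriptor on success; returns -1 if the lock is already held through
 * another open of the file, or if open() itself fails.
 */
int
try_exclusive_flock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) < 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}
```

The caller would keep the returned descriptor open for as long as the lock should be held. The NFS and cross-platform caveats mentioned above still apply to this sketch.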

Another thing that strikes me is that lsof works on just about every
platform I've ever used, and it tells you who has got a certain file
open. Of course it has to use different methods to do that on
different platforms, but at least on Linux, /proc/self/fd/N is a
symlink to the file you've got open, and shared memory segments are
files in /dev/shm. So maybe at least on particular platforms where we
care enough, we could install operating-system-specific code to
provide an interlock using a mechanism of this type. Not sure if that
will fly, but it's a thought.

Yet another idea is to somehow use POSIX semaphores, which are
distinct from POSIX shared memory. semop() has a SEM_UNDO flag which
causes whatever operation you perform to be reversed out at process exit.
So you could have each new postgres process increment the semaphore
value in such a way that it would be decremented on exit, although I'm
not sure how to avoid a race if the postmaster dies before a new child
has a chance to increment the semaphore.

Finally, how about named pipes? Linux says that trying to open a
named pipe for write when there are no readers will return ENXIO, and
attempting to write to an already-open pipe with no remaining readers
will cause SIGPIPE. So: create a permanent named pipe in the data
directory that all PostgreSQL processes keep open. When the
postmaster starts, it opens the pipe for read, then for write, then
closes it for read. It then tries to write to the pipe. If this
fails to result in SIGPIPE, then somebody else has got the thing open;
so the new postmaster should die at once. But if it does get a SIGPIPE
then there are as of that moment no other readers.
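
The ENXIO half of this can be sketched as a small probe. On Linux the ENXIO result applies to a non-blocking open (O_WRONLY | O_NONBLOCK); a blocking open for write would instead wait until a reader appears. The helper name fifo_has_reader is hypothetical, and a single probe like this is not a complete interlock, since two processes can run it at the same moment.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical probe: report whether any process has the FIFO at path
 * open for reading.  Returns 1 if a reader exists, 0 if there is none
 * (open() failed with ENXIO), and -1 on any other error.
 */
int
fifo_has_reader(const char *path)
{
    int fd = open(path, O_WRONLY | O_NONBLOCK);

    if (fd >= 0)
    {
        close(fd);
        return 1;
    }
    return (errno == ENXIO) ? 0 : -1;
}
```

In the scheme described above, every PostgreSQL process would hold the FIFO open for reading, so a result of 1 would mean that some process from an earlier postmaster is still alive.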

I'm not sure if any of this helps QNX or not, but maybe if we figure
out which of these mechanisms (or others) might be acceptable we can
cross-check that against what QNX supports.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] See comments on
http://rhaas.blogspot.com/2012/06/absurd-shared-memory-limits.html
[2] http://www.postgresql.org/message-id/18958.1340764854@sss.pgh.pa.us

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

29 July 2014, 23:06:28

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jul 25, 2014 at 6:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This isn't really acceptable for production usage; if it were, we'd have
>> done it already.  The POSIX APIs lack any way to tell how many processes
>> are attached to a shmem segment, which is *necessary* functionality for
>> us (it's a critical part of the interlock against starting multiple
>> postmasters in one data directory).

> I think it would be good to spend some energy figuring out what to do
> about this.

Well, we've been around on this multiple times before, but if we have
any new ideas, sure ...

> In our last discussion on this topic, we talked about using file locks
> as a substitute for nattch.  You concluded that fcntl was totally
> broken for this purpose because of the possibility of some other piece
> of code accidentally opening and closing the lock file.[2]  lockf
> appears to have the same problem, but flock might not, at least on
> some systems.

My Linux man page for flock says

       flock()  does not lock files over NFS.  Use fcntl(2) instead: that does
       work over NFS, given a sufficiently  recent  version  of  Linux  and  a
       server which supports locking.

which seems like a showstopper problem; we might try to tell people not to
put their databases on NFS, but they're not gonna listen.  It also says

       flock()  and  fcntl(2)  locks  have different semantics with respect to
       forked processes and dup(2).  On systems that implement  flock()  using
       fcntl(2),  the  semantics  of  flock()  will  be  different  from those
       described in this manual page.

which is pretty scary if it's accurate for any still-extant platforms;
we might think we're using flock and still get fcntl behavior.  It's
also of concern that (AFAICS) flock is not in POSIX, which means we
can't even expect that platforms will agree on how it *should* behave.

I also noted that flock does not support atomic downgrade of exclusive
lock to shared lock, which seems like a problem for the lock inheritance
scheme sketched in
http://www.postgresql.org/message-id/18162.1340761845@sss.pgh.pa.us
... but OTOH, it sounds like flock locks are not only inherited through
fork() but even preserved across exec(), which would mean that we don't
need that scheme for file lock inheritance, even with EXEC_BACKEND.
Still, it's not clear to me how we could put much faith in flock.

> Finally, how about named pipes? Linux says that trying to open a
> named pipe for write when there are no readers will return ENXIO, and
> attempting to write to an already-open pipe with no remaining readers
> will cause SIGPIPE.  So: create a permanent named pipe in the data
> directory that all PostgreSQL processes keep open.  When the
> postmaster starts, it opens the pipe for read, then for write, then
> closes it for read.  It then tries to write to the pipe.  If this
> fails to result in SIGPIPE, then somebody else has got the thing open;
> so the new postmaster should die at once.   But if does get a SIGPIPE
> then there are as of that moment no other readers.

Hm.  That particular protocol is broken: two postmasters doing it at the
same time would both pass (because neither has it open for read at the
instant where they try to write).  But we could possibly frob the idea
until it works.  Bigger question is how portable is this behavior?
I see named pipes (fifos) in SUS v2, which is our usual baseline
assumption about what's portable across Unixen, so maybe it would work.
But does NFS support named pipes?
        regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

29 July 2014, 23:23:58

Thank you to all who have responded to this proposal.
PostgreSQL manages to meet all production requirements on Windows without System V shared memory, so I would think this
can be achieved on QNX/Linux.

The old PostgreSQL QNX port ran on the very old "QNX4" (1991), so I understand why it would be of little value today.
Currently, QNX Neutrino 6.5 is well established (and QNX 6.6 is emerging) and that is where a PostgreSQL port would be
well received.

I have attached my current work-in-progress patches for 9.3.4 and 9.4beta2 for the curious.
To minimize risk, I have been careful to ensure my changes affect only QNX builds; existing ports should see
zero impact.
To minimize addition of new files, I have used the "linux" template rather than add qnx6 as a separate port/template.

All regression tests pass on my system, so while not perfect it is at least a reasonable start.
posix_shmem.c is still in need of some cleanup and mitigations to make it "production-strength".

If there are existing tests I can run to ensure the QNX port meets your criteria for robust failure handling in this
area, I would be happy to run them.
If not, perhaps someone can provide a quick list of failure modes to consider.
As-is:
- Starting a second postmaster fails with the message 'FATAL: lock file "postmaster.pid" already exists'
- kill -9 of the postmaster followed by a pg_ctl start seems to go through recovery, although the original shared memory
segments hang out in /dev/shmem until reboot (that could be better).
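
The leftover segments follow from how named POSIX shared memory works: the name persists until something calls shm_unlink(), whether or not any process still has the segment mapped. The sketch below illustrates that behavior; it is not taken from the attached posix_shmem.c, and the helper names are hypothetical.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create (or open) a named POSIX segment of the
 * given size and map it.  Returns the mapping, or NULL on failure.
 */
void *
posix_segment_attach(const char *name, size_t size)
{
    void *addr;
    int   fd = shm_open(name, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t) size) < 0)
    {
        close(fd);
        return NULL;
    }
    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    return (addr == MAP_FAILED) ? NULL : addr;
}

/* Hypothetical helper: 1 if a segment with this name still exists, else 0. */
int
posix_segment_exists(const char *name)
{
    int fd = shm_open(name, O_RDONLY, 0);

    if (fd < 0)
        return 0;
    close(fd);
    return 1;
}
```

A postmaster killed with -9 never gets the chance to call shm_unlink(), so cleanup would have to happen at the next startup, and only once it is certain that no process from the previous run is still using the segment.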

Thanks again and please let me know if I can be of any assistance.

Keith Baker

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, July 29, 2014 7:06 PM
To: Robert Haas
Cc: Baker, Keith [OCDUS Non-J&J]; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jul 25, 2014 at 6:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This isn't really acceptable for production usage; if it were, we'd
>> have done it already.  The POSIX APIs lack any way to tell how many
>> processes are attached to a shmem segment, which is *necessary*
>> functionality for us (it's a critical part of the interlock against
>> starting multiple postmasters in one data directory).

> I think it would be good to spend some energy figuring out what to do
> about this.

Well, we've been around on this multiple times before, but if we have any new ideas, sure ...

> In our last discussion on this topic, we talked about using file locks
> as a substitute for nattch.  You concluded that fcntl was totally
> broken for this purpose because of the possibility of some other piece
> of code accidentally opening and closing the lock file.[2]  lockf
> appears to have the same problem, but flock might not, at least on
> some systems.

My Linux man page for flock says

       flock()  does not lock files over NFS.  Use fcntl(2) instead: that does
       work over NFS, given a sufficiently  recent  version  of  Linux  and  a
       server which supports locking.

which seems like a showstopper problem; we might try to tell people not to put their databases on NFS, but they're not
gonna listen.  It also says

       flock()  and  fcntl(2)  locks  have different semantics with respect to
       forked processes and dup(2).  On systems that implement  flock()  using
       fcntl(2),  the  semantics  of  flock()  will  be  different  from those
       described in this manual page.

which is pretty scary if it's accurate for any still-extant platforms; we might think we're using flock and still get
fcntl behavior.  It's also of concern that (AFAICS) flock is not in POSIX, which means we can't even expect that
platforms will agree on how it *should* behave.

I also noted that flock does not support atomic downgrade of exclusive lock to shared lock, which seems like a problem
for the lock inheritance scheme sketched in http://www.postgresql.org/message-id/18162.1340761845@sss.pgh.pa.us
... but OTOH, it sounds like flock locks are not only inherited through
fork() but even preserved across exec(), which would mean that we don't need that scheme for file lock inheritance,
even with EXEC_BACKEND.
Still, it's not clear to me how we could put much faith in flock.

> Finally, how about named pipes? Linux says that trying to open a named
> pipe for write when there are no readers will return ENXIO, and
> attempting to write to an already-open pipe with no remaining readers
> will cause SIGPIPE.  So: create a permanent named pipe in the data
> directory that all PostgreSQL processes keep open.  When the
> postmaster starts, it opens the pipe for read, then for write, then
> closes it for read.  It then tries to write to the pipe.  If this
> fails to result in SIGPIPE, then somebody else has got the thing open;
> so the new postmaster should die at once.   But if does get a SIGPIPE
> then there are as of that moment no other readers.

Hm.  That particular protocol is broken: two postmasters doing it at the same time would both pass (because neither has
it open for read at the instant where they try to write).  But we could possibly frob the idea until it works.  Bigger
question is how portable is this behavior?
I see named pipes (fifos) in SUS v2, which is our usual baseline assumption about what's portable across Unixen, so
maybe it would work.
But does NFS support named pipes?

            regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

29 July 2014, 23:34:53

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:
> If there are existing tests I can run to ensure the QNX port meets your criteria for robust failure handling in this
> area I would be happy to run them.

> If not, perhaps someone can provide a quick list of failure modes to consider.
> As-is:
> - starting of a second postmaster fails with message 'FATAL: lock file "postmaster.pid" already exists'
> - Kill -9 of postmaster followed by a pg_ctl start seems to go through recovery, although the original shared memory
> segments hang out in /dev/shmem until reboot (that could be better).

Unfortunately, that probably proves it's broken rather than that it works.
The behavior we need is that after kill -9'ing the postmaster, subsequent
postmaster start attempts *fail* until all the original postmaster's child
processes are gone.  Otherwise you end up with two independent sets of
processes scribbling on the same files (and not sharing shmem either).
Kiss consistency goodbye ...

It's possible that all the children automatically exited, especially if
you had only background processes active; but if you had a live regular
session it would not exit just because the parent process died.
        regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Robert Haas

Date:

30 July 2014, 14:43:04

On Tue, Jul 29, 2014 at 7:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think it would be good to spend some energy figuring out what to do
>> about this.
>
> Well, we've been around on this multiple times before, but if we have
> any new ideas, sure ...

Well, I tried to compile a more comprehensive list of possible
techniques in that email than I've seen anyone post before.

> Still, it's not clear to me how we could put much faith in flock.

Yeah, after some more research, I think you're right.  Apparently, as
recently as 2010, the Linux kernel transparently converted flock()
requests to fcntl()-style locks when running on NFS:

http://0pointer.de/blog/projects/locking.html

Maybe someday this will be reliable enough to use, but the odds of it
happening in the next decade don't look good.

>> Finally, how about named pipes? Linux says that trying to open a
>> named pipe for write when there are no readers will return ENXIO, and
>> attempting to write to an already-open pipe with no remaining readers
>> will cause SIGPIPE.  So: create a permanent named pipe in the data
>> directory that all PostgreSQL processes keep open.  When the
>> postmaster starts, it opens the pipe for read, then for write, then
>> closes it for read.  It then tries to write to the pipe.  If this
>> fails to result in SIGPIPE, then somebody else has got the thing open;
>> so the new postmaster should die at once.   But if does get a SIGPIPE
>> then there are as of that moment no other readers.
>
> Hm.  That particular protocol is broken: two postmasters doing it at the
> same time would both pass (because neither has it open for read at the
> instant where they try to write).  But we could possibly frob the idea
> until it works.  Bigger question is how portable is this behavior?
> I see named pipes (fifos) in SUS v2, which is our usual baseline
> assumption about what's portable across Unixen, so maybe it would work.
> But does NFS support named pipes?

Looks iffy, on a quick search.  Sigh.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

30 July 2014, 15:02:26

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jul 29, 2014 at 7:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hm.  That particular protocol is broken: two postmasters doing it at the
>> same time would both pass (because neither has it open for read at the
>> instant where they try to write).  But we could possibly frob the idea
>> until it works.  Bigger question is how portable is this behavior?
>> I see named pipes (fifos) in SUS v2, which is our usual baseline
>> assumption about what's portable across Unixen, so maybe it would work.
>> But does NFS support named pipes?

> Looks iffy, on a quick search.  Sigh.

I poked around, and it seems like a lot of the people who think it's flaky
are imagining that they should be able to use a named pipe on an NFS
server to pass data between two different machines.  That doesn't work,
but it's not what we need, either.  For communication between processes
on the same server, all that's needed is that the filesystem entry looks
like a pipe to the local kernel --- and that's been required NFS
functionality since RFC1813 (v3, in 1995).

So it seems like we could possibly go this route, assuming we can think
of a variant of your proposal that's race-condition-free.  A disadvantage
compared to a true file lock is that it would not protect against people
trying to start postmasters from two different NFS client machines --- but
we don't have protection against that now.  (Maybe we could do this *and*
do a regular file lock to offer some protection against that case, even if
it's not bulletproof?)
        regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

30 July 2014, 22:59:53

Robert and Tom,

Please let me know if either of you are ready to experiment with the "named pipe" idea anytime soon.
If not, I would be happy to take a crack at it, but would appreciate your expert advice to start me down the right path
(files/functions to update, pseudo-code, etc.).

-Keith Baker


Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

30 July 2014, 23:26:54

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:
> Please let me know if either of you are ready to experiment with the "named pipe" idea anytime soon.
> If not, I would be happy to take a crack at it, but would appreciate your expert advice to start me down the right path (files/functions to update, pseudo-code, etc.).

Well, before we start coding anything, the first order of business would
be to think of a bulletproof locking protocol using the available pipe
operations.  Robert's straw man isn't that, but it seems like there might
be one in there.
        regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Robert Haas

Date:

31 July 2014, 16:58:07

On Wed, Jul 30, 2014 at 11:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So it seems like we could possibly go this route, assuming we can think
> of a variant of your proposal that's race-condition-free.  A disadvantage
> compared to a true file lock is that it would not protect against people
> trying to start postmasters from two different NFS client machines --- but
> we don't have protection against that now.  (Maybe we could do this *and*
> do a regular file lock to offer some protection against that case, even if
> it's not bulletproof?)

That's not a bad idea.  By the way, it also wouldn't be too hard to
test at runtime whether or not flock() has first-close semantics.  Not
that we'd want this exact design, but suppose you configure
shmem_interlock=flock in postgresql.conf.  On startup, we test whether
flock is reliable, determine that it is, and proceed accordingly.
Now, you move your database onto an NFS volume and the semantics
change (because, hey, breaking userspace assumptions is fun) and try
to restart your database, and it says FATAL: flock() is broken.
Now you can either move the database back, or set shmem_interlock to
some other value.

Now maybe, as you say, it's best to use multiple locking protocols and
hope that at least one will catch whatever the dangerous situation is.
I'm just trying to point out that we need not blindly assume the
semantics we want are there (or that they are not); we can check.
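
A rough sketch of what such a runtime check could look like, not a proposed patch. The function names are hypothetical. It takes an flock() lock on one descriptor, opens and closes a second descriptor on the same file, and then asks a forked child whether the lock is still in force.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * From a child process, try to take an exclusive flock() on path.
 * Returns 1 if the child got the lock (nobody holds it), 0 if the
 * attempt was refused, -1 on error.
 */
static int flock_is_free(const char *path)
{
    int   status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        int fd = open(path, O_RDWR);

        if (fd < 0)
            _exit(2);
        _exit(flock(fd, LOCK_EX | LOCK_NB) == 0 ? 1 : 0);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return (WEXITSTATUS(status) == 2) ? -1 : WEXITSTATUS(status);
}

/*
 * Returns 1 if a lock held on one descriptor survives the close of an
 * unrelated descriptor for the same file, 0 if it is lost, -1 on error.
 */
int flock_survives_unrelated_close(const char *path)
{
    int held = open(path, O_RDWR);
    int other;
    int is_free;

    if (held < 0)
        return -1;
    if (flock(held, LOCK_EX | LOCK_NB) != 0)
    {
        close(held);
        return -1;
    }
    other = open(path, O_RDONLY);
    if (other < 0)
    {
        close(held);
        return -1;
    }
    close(other);               /* the close whose side effects we test */
    is_free = flock_is_free(path);
    close(held);                /* drop our own lock again */
    if (is_free < 0)
        return -1;
    return is_free ? 0 : 1;
}
```

A startup check along these lines could refuse to use flock() for the interlock whenever the function returns anything other than 1 for a scratch file in the data directory.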

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

31 July 2014, 22:51:04

I will be on vacation until August 11; I look forward to any progress you are able to make.

Since ensuring that there are no orphaned backend processes is vital, could we add a check for getppid() == 1?
The patch below seemed to work on QNX (the first client command after a kill -9 of the postmaster resulted in the exit of its associated server process).
 
diff -rdup postgresql-9.3.5/src/backend/tcop/postgres.c postgresql-9.3.5_qnx/src/backend/tcop/postgres.c
--- postgresql-9.3.5/src/backend/tcop/postgres.c   2014-07-21 15:10:42.000000000 -0400
+++ postgresql-9.3.5_qnx/src/backend/tcop/postgres.c   2014-07-31 18:17:40.000000000 -0400
@@ -3967,6 +3967,14 @@ PostgresMain(int argc, char *argv[],
         */
        firstchar = ReadCommand(&input_message);
 
+#ifndef WIN32
+        /* Check for death of parent */
+        if (getppid() == 1)
+            ereport(FATAL,
+                (errcode(ERRCODE_CRASH_SHUTDOWN),
+                errmsg("Parent server process has exited")));
+#endif
+
        /*
         * (4) disable async signal conditions again.
         */
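
A small variation on the check, offered only as a sketch: comparing against a parent PID remembered at startup avoids assuming that orphans are always adopted by PID 1, which does not hold everywhere (on Linux, for example, a "subreaper" process can adopt them). The function name is hypothetical, and the caller is assumed to have saved getppid() right after being forked.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if this process is no longer a child of original_parent
 * (meaning that parent has exited and we were re-parented), else 0.
 */
int parent_has_changed(pid_t original_parent)
{
    return getppid() != original_parent;
}
```

Like the patch, this is evaluated only at the points where it is called, so a backend in the middle of a long query would not notice anything until it returned to such a point.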
 

Keith Baker 


Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

01 August 2014, 01:51:39

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:
> Since ensuring there are not orphaned back-end processes is vital, could we add a check for getppid() == 1 ?

No.  Or yeah, we could, but that patch would add no security worth
mentioning.  For example, someone could launch a query that runs for
many minutes, and would have plenty of time to conflict with a
subsequently-started postmaster.

Even without that issue, there's no consensus that forcibly making
orphan backends exit would be a good thing.  (Some people would
like to have such an option, but the key word in that sentence is
"option".)
        regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Robert Haas

Date:

04 August 2014, 14:54:32

On Thu, Jul 31, 2014 at 9:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:
>> Since ensuring there are not orphaned back-end processes is vital, could we add a check for getppid() == 1 ?
>
> No.  Or yeah, we could, but that patch would add no security worth
> mentioning.  For example, someone could launch a query that runs for
> many minutes, and would have plenty of time to conflict with a
> subsequently-started postmaster.

True.

> Even without that issue, there's no consensus that forcibly making
> orphan backends exit would be a good thing.  (Some people would
> like to have such an option, but the key word in that sentence is
> "option".)

I believe that multiple people have said multiple times that we should
change the behavior so that orphaned backends exit immediately; I
think you are the only one defending the current behavior.  There are
several problems with the status quo:

1. Most seriously, once the postmaster is gone, there's nobody to
SIGQUIT remaining backends if somebody exits uncleanly.  This means
that a backend running without a postmaster could be running in a
corrupt shared memory segment, which could lead to all sorts of
misbehavior, including possible data corruption.

2. Operationally, orphaned backends prevent the system from being
restarted.  There's no easy, automatic way to kill them, so scripts
that automatically restart the database server if it exits don't work.
Even if letting the remaining backends continue to operate is good,
not being able to accept new connections is bad enough to completely
overshadow it.  In many situations, killing them is a small price to
pay to get the system back on line.

3. Practically, the performance of any remaining backends will be
poor, because processes like the WAL writer and background writer
aren't going to be around to help any more.  I think this will only
get worse over time; certainly, any future parallel query facility
won't work if the postmaster isn't around to fork new children.  And
maybe we'll have other utility processes over time, too.  But in any
case the situation isn't great right now, either.

Now, I don't say that any of this is a reason not to have a strong
shared memory interlock, but I'm quite unconvinced that the current
behavior should even be optional, let alone the default.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Josh Berkus

Date:

05 August 2014, 00:15:38

On 08/04/2014 07:54 AM, Robert Haas wrote:
> 1. Most seriously, once the postmaster is gone, there's nobody to
> SIGQUIT remaining backends if somebody exits uncleanly.  This means
> that a backend running without a postmaster could be running in a
> corrupt shared memory segment, which could lead to all sorts of
> misbehavior, including possible data corruption.

I've seen this in the field.

> 2. Operationally, orphaned backends prevent the system from being
> restarted.  There's no easy, automatic way to kill them, so scripts
> that automatically restart the database server if it exits don't work.

I've also seen this in the field.

> Now, I don't say that any of this is a reason not to have a strong
> shared memory interlock, but I'm quite unconvinced that the current
> behavior should even be optional, let alone the default.

I always assumed that the current behavior existed because we *couldn't*
fix it, not because anybody wanted it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

09 August 2014, 16:29:25

On 2014-08-04 10:54:25 -0400, Robert Haas wrote:
> On Thu, Jul 31, 2014 at 9:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Even without that issue, there's no consensus that forcibly making
> > orphan backends exit would be a good thing.  (Some people would
> > like to have such an option, but the key word in that sentence is
> > "option".)
> 
> I believe that multiple people have said multiple times that we should
> change the behavior so that orphaned backends exit immediately; I
> think you are the only one defending the current behavior.  There are
> several problems with the status quo:

+1. I think the current behaviour is a seriously bad idea.

Greetings,

Andres Freund

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

09 August 2014, 18:04:09

On 2014-08-09 14:00:49 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-08-04 10:54:25 -0400, Robert Haas wrote:
> >> I believe that multiple people have said multiple times that we should
> >> change the behavior so that orphaned backends exit immediately; I
> >> think you are the only one defending the current behavior.  There are
> >> several problems with the status quo:
> 
> > +1. I think the current behaviour is a seriously bad idea.
> 
> I don't think it's anywhere near as black-and-white as you guys claim.
> What it comes down to is whether allowing existing transactions/sessions
> to finish is more important than allowing new sessions to start.
> Depending on the application, either could be more important.

Nah. The current behaviour circumvents security measures we normally
consider absolutely essential. If the postmaster died some bad shit went
on. The likelihood of hitting corner case bugs where it's important that
we react to a segfault/panic with a restart/crash replay is rather high.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

09 August 2014, 18:05:30

Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-08-04 10:54:25 -0400, Robert Haas wrote:
>> I believe that multiple people have said multiple times that we should
>> change the behavior so that orphaned backends exit immediately; I
>> think you are the only one defending the current behavior.  There are
>> several problems with the status quo:

> +1. I think the current behaviour is a seriously bad idea.

I don't think it's anywhere near as black-and-white as you guys claim.
What it comes down to is whether allowing existing transactions/sessions
to finish is more important than allowing new sessions to start.
Depending on the application, either could be more important.

Ideally we'd have some way to configure the behavior appropriately for
a given installation; but short of that, it's unclear to me that
unilaterally changing the system's bias is something our users would
thank us for.  I've not noticed a large groundswell of complaints about
it (though this may just reflect that we've made the postmaster pretty
darn robust, so that the case seldom comes up).
        regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

09 August 2014, 18:09:48

Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-08-09 14:00:49 -0400, Tom Lane wrote:
>> I don't think it's anywhere near as black-and-white as you guys claim.
>> What it comes down to is whether allowing existing transactions/sessions
>> to finish is more important than allowing new sessions to start.
>> Depending on the application, either could be more important.

> Nah. The current behaviour circumvents security measures we normally
> consider absolutely essential. If the postmaster died some bad shit went
> on. The likelihood of hitting corner case bugs where it's important that
> we react to a segfault/panic with a restart/crash replay is rather high.

What's your point?  Once a new postmaster starts, it *will* do a crash
restart, because certainly no shutdown checkpoint ever happened.  The
only issue here is what grace period existing orphaned backends are given
to finish their work --- and it's not possible for the answer to that
to be "zero", so you don't get to assume that nothing happens in
backend-land after the instant of postmaster crash.
        regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

09 August 2014, 18:16:06

On 2014-08-09 14:09:36 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-08-09 14:00:49 -0400, Tom Lane wrote:
> >> I don't think it's anywhere near as black-and-white as you guys claim.
> >> What it comes down to is whether allowing existing transactions/sessions
> >> to finish is more important than allowing new sessions to start.
> >> Depending on the application, either could be more important.
> 
> > Nah. The current behaviour circumvents security measures we normally
> > consider absolutely essential. If the postmaster died some bad shit went
> > on. The likelihood of hitting corner case bugs where it's important that
> > we react to a segfault/panic with a restart/crash replay is rather high.
> 
> What's your point?  Once a new postmaster starts, it *will* do a crash
> restart, because certainly no shutdown checkpoint ever happened.

That's not saying much. For one, there can be online checkpoints in that
time. So it's certainly not guaranteed (or even all that likely) that
all the WAL since the incident is replayed.  For another, it can be
*hours* before all the backends finish.

IIRC we'll continue to happily write WAL and everything after postmaster
(and possibly some backends, corrupting shmem) have crashed. The
bgwriter, checkpointer, backends will continue to write dirty buffers to
disk. We'll IIRC continue to write checkpoints.  That's simply not
things we should be doing after postmaster crashed if we can avoid at
all.

> The
> only issue here is what grace period existing orphaned backends are given
> to finish their work --- and it's not possible for the answer to that
> to be "zero", so you don't get to assume that nothing happens in
> backend-land after the instant of postmaster crash.

Sure. But I don't think a window in the range of seconds comes close to
being the same as a window that easily can be hours.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Noah Misch

Date:

10 August 2014, 22:36:37

[Due for a new subject line?]

On Sat, Aug 09, 2014 at 08:16:01PM +0200, Andres Freund wrote:
> On 2014-08-09 14:09:36 -0400, Tom Lane wrote:
> > Andres Freund <andres@2ndquadrant.com> writes:
> > > On 2014-08-09 14:00:49 -0400, Tom Lane wrote:
> > >> I don't think it's anywhere near as black-and-white as you guys claim.
> > >> What it comes down to is whether allowing existing transactions/sessions
> > >> to finish is more important than allowing new sessions to start.
> > >> Depending on the application, either could be more important.
> > 
> > > Nah. The current behaviour circumvents security measures we normally
> > > consider absolutely essential. If the postmaster died some bad shit went
> > > on. The likelihood of hitting corner case bugs where it's important that
> > > we react to a segfault/panic with a restart/crash replay is rather high.
> > 
> > What's your point?  Once a new postmaster starts, it *will* do a crash
> > restart, because certainly no shutdown checkpoint ever happened.
> 
> That's not saying much. For one, there can be online checkpoints in that
> time. So it's certainly not guaranteed (or even all that likely) that
> all the WAL since the incident is replayed.  For another, it can be
> *hours* before all the backends finish.
> 
> IIRC we'll continue to happily write WAL and everything after postmaster
> (and possibly some backends, corrupting shmem) have crashed. The
> bgwriter, checkpointer, backends will continue to write dirty buffers to
> disk. We'll IIRC continue to write checkpoints.   That's simply not
> things we should be doing after postmaster crashed if we can avoid at
> all.

The basic support processes, including the checkpointer, exit promptly upon
detecting a postmaster exit.  Checkpoints cease.  Your central point still
stands.  WAL protects data integrity only to the extent that we stop writing
it after shared memory ceases to be trustworthy.  Crash recovery of WAL
written based on corrupt buffers just reproduces the corruption.
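
For readers unfamiliar with how a child can notice its parent's exit without polling getppid(), one common technique is a pipe whose write end is held only by the parent: once every write end is closed, a read on the other end reports end-of-file. The sketch below illustrates that general technique with hypothetical function names; PostgreSQL's actual detection code may differ in detail.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Check the read end of a "liveness" pipe on which nothing is ever written.
 * read_fd must be non-blocking.
 * Returns 1 while some process still holds the write end open,
 * 0 once every write end is closed (the parent is gone),
 * -1 on unexpected data or any other error.
 */
int pipe_writer_alive(int read_fd)
{
    char    c;
    ssize_t n = read(read_fd, &c, 1);

    if (n == 0)
        return 0;               /* EOF: no writer left */
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
        return 1;               /* nothing to read, writer still there */
    return -1;
}
```

In a forked child, the write end must be closed right after fork(); otherwise the child keeps the pipe alive itself and never sees end-of-file.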

> > The
> > only issue here is what grace period existing orphaned backends are given
> > to finish their work --- and it's not possible for the answer to that
> > to be "zero", so you don't get to assume that nothing happens in
> > backend-land after the instant of postmaster crash.

Our grace period for active backends after unclean exit of one of their peers
is low, milliseconds to seconds.  Our grace period for active backends after
unclean exit of the postmaster is unconstrained.  At least one of those
policies has to be wrong.  Like Andres and Robert, I pick the second one.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Stephen Frost

Date:

10 August 2014, 23:11:48

* Noah Misch (noah@leadboat.com) wrote:
> [Due for a new subject line?]

Probably.

> Our grace period for active backends after unclean exit of one of their peers
> is low, milliseconds to seconds.  Our grace period for active backends after
> unclean exit of the postmaster is unconstrained.  At least one of those
> policies has to be wrong.  Like Andres and Robert, I pick the second one.

Ditto for me.  The postmaster going away really is a bad sign and the
confusion due to leftover processes is terrible for our users.
Thanks,
    Stephen

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Kevin Grittner

Date:

11 August 2014, 14:40:22

Stephen Frost <sfrost@snowman.net> wrote:

>> Our grace period for active backends after unclean exit of one
>> of their peers is low, milliseconds to seconds.  Our grace
>> period for active backends after unclean exit of the postmaster
>> is unconstrained.  At least one of those policies has to be
>> wrong. Like Andres and Robert, I pick the second one.
>
> Ditto for me.

+1

In fact, I would say that is slightly understated.  The grace
period for active backends after unclean exit of one of their peers
is low, milliseconds to seconds, *unless the postmaster has also
crashed* -- in which case it is unconstrained.  Why is the crash of
a backend less serious if the postmaster has also crashed?
Certainly it can't be considered to be surprising that if the
postmaster is crashing that other backends might be also crashing
around the same time?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

11 August 2014, 15:35:51

On 2014-08-10 18:36:18 -0400, Noah Misch wrote:
> [Due for a new subject line?]
> 
> On Sat, Aug 09, 2014 at 08:16:01PM +0200, Andres Freund wrote:
> > On 2014-08-09 14:09:36 -0400, Tom Lane wrote:
> > > Andres Freund <andres@2ndquadrant.com> writes:
> > > > On 2014-08-09 14:00:49 -0400, Tom Lane wrote:
> > > >> I don't think it's anywhere near as black-and-white as you guys claim.
> > > >> What it comes down to is whether allowing existing transactions/sessions
> > > >> to finish is more important than allowing new sessions to start.
> > > >> Depending on the application, either could be more important.
> > > 
> > > > Nah. The current behaviour circumvents security measures we normally
> > > > consider absolutely essential. If the postmaster died some bad shit went
> > > > on. The likelihood of hitting corner case bugs where it's important that
> > > > we react to a segfault/panic with a restart/crash replay is rather high.
> > > 
> > > What's your point?  Once a new postmaster starts, it *will* do a crash
> > > restart, because certainly no shutdown checkpoint ever happened.
> > 
> > That's not saying much. For one, there can be online checkpoints in that
> > time. So it's certainly not guaranteed (or even all that likely) that
> > all the WAL since the incident is replayed.  For another, it can be
> > *hours* before all the backends finish.
> > 
> > IIRC we'll continue to happily write WAL and everything after postmaster
> > (and possibly some backends, corrupting shmem) have crashed. The
> > bgwriter, checkpointer, backends will continue to write dirty buffers to
> > disk. We'll IIRC continue to write checkpoints.   That's simply not
> > things we should be doing after postmaster crashed if we can avoid at
> > all.
> 
> The basic support processes, including the checkpointer, exit promptly upon
> detecting a postmaster exit.  Checkpoints cease.

Only after finishing an 'in process' checkpoint though afaics. And only
if no new checkpoint has been requested since. The latter because we
don't even test for postmaster death if a latch has been set... I think
it's similar for the bgwriter and such.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Robert Haas

Date:

11 August 2014, 16:24:56

On Sat, Aug 9, 2014 at 2:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> +1. I think the current behaviour is a seriously bad idea.
>
> I don't think it's anywhere near as black-and-white as you guys claim.
> What it comes down to is whether allowing existing transactions/sessions
> to finish is more important than allowing new sessions to start.
> Depending on the application, either could be more important.

It's partly about that, and I think the answer is that being able to
start new sessions is almost always more important; but it's also
about the fact that the postmaster provides essential
protections against data corruption, and running without those
protections is a bad idea.  If it's not a bad idea, then why do we
need those protections ever?  Why have we put so much effort into
bullet-proofing them over the years?

I mean, we could simply regard the unexpected end of a backend as
being something that is "probably OK" and we'd usually be right; after
all, a backend wouldn't crap out without releasing a critical spinlock
very often.  A lot of users would probably be very happy to be
liberated from the tyranny of a server-wide restart every time a
backend crashes, and 90% of the time nothing bad would happen.  But
clearly this is insanity, because every now and then something would
go terribly wrong and there would be no automated way for the system
to recover, and on even rarer occasions your data would get eaten.
That is why it is right to think that the service provided by the
postmaster is essential, not nice-to-have.

> Ideally we'd have some way to configure the behavior appropriately for
> a given installation; but short of that, it's unclear to me that
> unilaterally changing the system's bias is something our users would
> thank us for.  I've not noticed a large groundswell of complaints about
> it (though this may just reflect that we've made the postmaster pretty
> darn robust, so that the case seldom comes up).

I do think that's a large part of it.  The postmaster doesn't get
killed very often, and when it does, things are often messed up to a
degree where the user's just going to reboot anyway.  But I've
encountered customers who managed to corrupt their database because
backends didn't exit when the postmaster died, because it turns out
that removing postmaster.pid defeats the shared memory interlocks that
normally prevent starting a new postmaster, and the customer did that.
And I've personally experienced at least one protracted outage that
resulted from orphaned backends preventing 'pg_ctl restart' from
working.  If the postmaster weren't so reliable, I'm sure these kinds
of problems would be a lot more common.

But the fact that they're uncommon doesn't mean that the current
behavior is the best one, and I'm convinced that it isn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

13 August 2014, 22:39:02

Robert and Tom,

I assume you guys are working on other priorities, so I did some locking experiments on QNX.

I know fcntl() locking has downsides, but I think it deserves a second look:
- it is POSIX, so should be fairly consistent across platforms (at least more consistent than lockf and flock)
- the "accidental" open/close lock release can be easily avoided (simply don't add new code which touches the new, unique lock file)
- don't know if it will work on NFS, but that is not a priority for me (is that really a requirement for a QNX port?)

Existing System V shared memory locking can be left in place for all existing platforms (so nothing lost), while fcntl()-style locking could be limited to platforms which lack System V shared memory (like QNX).

Experimental patch is attached, but logic is basically this:
a. postmaster obtains exclusive lock on data dir file "postmaster.fcntl" (or FATAL)
b. postmaster then downgrades to shared lock (or FATAL)
c. all other backend processes obtain shared lock on this file (or FATAL)
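
Steps a to c can be sketched roughly as follows. This is an illustration written for this summary, not the attached patch; all function names are hypothetical, and the lock file is assumed to be opened O_RDWR. The last function imitates the orphan-backend test: a child holds the shared lock while the caller tries to start up as a postmaster would.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set a non-blocking fcntl() lock of the given type on the whole file. */
static int set_whole_file_lock(int fd, short lock_type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = lock_type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/* Steps a and b: exclusive lock first, then downgrade to shared. */
int postmaster_take_interlock(int fd)
{
    if (set_whole_file_lock(fd, F_WRLCK) != 0)
        return -1;
    return set_whole_file_lock(fd, F_RDLCK);
}

/* Step c: every other process takes a shared lock. */
int backend_take_interlock(int fd)
{
    return set_whole_file_lock(fd, F_RDLCK);
}

/*
 * Returns 1 if a live "backend" child blocks postmaster startup,
 * 0 if startup was wrongly allowed, -1 if the experiment could not run.
 */
int restart_blocked_by_backend(const char *lock_path)
{
    int   ready[2], done[2], fd, blocked;
    char  byte = 0;
    pid_t pid;

    if (pipe(ready) != 0 || pipe(done) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        close(ready[0]);
        close(done[1]);
        fd = open(lock_path, O_RDWR);
        if (fd < 0 || backend_take_interlock(fd) != 0)
            _exit(1);
        if (write(ready[1], &byte, 1) != 1)     /* tell parent we hold it */
            _exit(1);
        if (read(done[0], &byte, 1) < 0)        /* wait for parent's EOF */
            _exit(1);
        _exit(0);
    }
    close(ready[1]);
    close(done[0]);
    if (read(ready[0], &byte, 1) != 1)
        blocked = -1;                           /* child failed to lock */
    else
    {
        fd = open(lock_path, O_RDWR);
        blocked = (fd < 0) ? -1 : (postmaster_take_interlock(fd) != 0);
        if (fd >= 0)
            close(fd);
    }
    close(ready[0]);
    close(done[1]);                             /* lets the child exit */
    waitpid(pid, NULL, 0);
    return blocked;
}
```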

A quick test on QNX 6.5 appeared to behave well (orphan backends left behind after kill -9 of postmaster held their locks, thus database restart was prevented as desired).
Let me know if there are other test scenarios to consider.

Thanks!

-Keith Baker



Attachments

fcntl_lock_20140813.patch

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

13 August 2014, 23:05:08

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:
> I assume you guys are working on other priorities, so I did some locking experiments on QNX.

> I know fcntl() locking has downsides, but I think it deserves a second look:
> - it is POSIX, so should be fairly consistent across platforms (at least more consistent than lockf and flock)
> - the "accidental" open/close lock release can be easily avoided (simply don't add new code which touches the new, unique lock file)

I guess you didn't read the previous discussion.  Asserting that it's
"easy to avoid" an accidental unlock doesn't make it true.  In the case of
a PG backend, we have to expect that people will run random code inside,
say, plperlu or plpythonu functions.  And it doesn't seem unlikely that
someone might scan the entire PGDATA directory tree as part of, for
example, a backup or archiving operation.  If we had full control of
everything that ever happens in a PG backend process then *maybe* we could
have adequate confidence that we'd never lose the lock, but we don't.
        regards, tom lane
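
The hazard Tom describes can be reproduced in a few lines of C. This is a minimal sketch assuming POSIX fcntl() record-lock semantics, where closing any descriptor for a file drops every lock the process holds on it. The helper names (lock_whole_file, locked_by_parent, careless_scan) are made up for illustration and do not come from any patch in this thread.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Take an exclusive, non-blocking fcntl() lock on the whole file. */
int
lock_whole_file(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Ask, from a forked child, whether the parent still holds a lock on path.
 * fcntl() locks are not inherited across fork(), so the child sees the
 * parent's lock as a conflicting one.
 * Returns 1 if locked, 0 if not locked, -1 on error.
 */
int
locked_by_parent(const char *path)
{
    int   status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        struct flock fl = {0};
        int fd = open(path, O_RDWR);

        if (fd < 0)
            _exit(2);
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        if (fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    status = WEXITSTATUS(status);
    return (status == 2) ? -1 : status;
}

/*
 * Open and close the file through a second descriptor, the way a backup
 * script or a plperlu function scanning PGDATA might do without knowing
 * anything about the lock.
 */
void
careless_scan(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd >= 0)
        close(fd);              /* silently drops this process's lock */
}
```

Calling lock_whole_file() on an open descriptor, then careless_scan() on the same path, leaves the original descriptor open but no longer locked; locked_by_parent() reports the difference.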

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

August 14, 2014, 00:20:29

Tom,

I appreciate your patience and explanation. (I am new to PostgreSQL hacking.  I have read many old posts but not all of it sticks, sorry).
I know QNX support is not high on your TODO list, so I am trying to keep the effort moving without being a distraction.

Couldn't backend "random code" corrupt any file in the PGDATA dir?
Perhaps the new fcntl lock file could be kept outside the PGDATA directory tree to make the likelihood of backend "random code" interference remote.
This could be present and used only on systems without System V shared memory (QNX), leaving existing platforms
unaffected.

I know this falls short of perfect, but perhaps is good enough to get the QNX port off the ground.
I would rather have a QNX port with reasonable restrictions than no port at all.

Also, I will try to experiment with named pipe locking as Robert had suggested.
Thanks again for your feedback, I really do appreciate it.

-Keith Baker


Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

August 14, 2014, 16:08:52

Tom and Robert,

I tried a combination of PIPE lock and file lock (fcntl) as Tom had suggested.
Attached experimental patch has this logic...

Postmaster :
- get exclusive fcntl lock (to guard against race condition in PIPE-based lock)
- check PIPE for any existing readers
- open PIPE for read

All other backends:
- get shared fcntl lock
- open PIPE for read

Your feedback is appreciated.
Thanks.

-Keith Baker


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Wednesday, July 30, 2014 11:02 AM
> To: Robert Haas
> Cc: Baker, Keith [OCDUS Non-J&J]; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
>
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Tue, Jul 29, 2014 at 7:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Hm.  That particular protocol is broken: two postmasters doing it at
> >> the same time would both pass (because neither has it open for read
> >> at the instant where they try to write).  But we could possibly frob
> >> the idea until it works.  Bigger question is how portable is this behavior?
> >> I see named pipes (fifos) in SUS v2, which is our usual baseline
> >> assumption about what's portable across Unixen, so maybe it would work.
> >> But does NFS support named pipes?
>
> > Looks iffy, on a quick search.  Sigh.
>
> I poked around, and it seems like a lot of the people who think it's flaky are
> imagining that they should be able to use a named pipe on an NFS server to
> pass data between two different machines.  That doesn't work, but it's not
> what we need, either.  For communication between processes on the same
> server, all that's needed is that the filesystem entry looks like a pipe to the
> local kernel --- and that's been required NFS functionality since RFC1813 (v3,
> in 1995).
>
> So it seems like we could possibly go this route, assuming we can think of a
> variant of your proposal that's race-condition-free.  A disadvantage
> compared to a true file lock is that it would not protect against people trying
> to start postmasters from two different NFS client machines --- but we don't
> have protection against that now.  (Maybe we could do this *and* do a
> regular file lock to offer some protection against that case, even if it's not
> bulletproof?)
>
>             regards, tom lane

Attachments

locking_20140814.patch

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Robert Haas

Date:

August 15, 2014, 13:40:18

On Thu, Aug 14, 2014 at 12:08 PM, Baker, Keith [OCDUS Non-J&J]
<KBaker9@its.jnj.com> wrote:
> I tried a combination of PIPE lock and file lock (fcntl) as Tom had suggested.
> Attached experimental patch has this logic...
>
> Postmaster :
> - get exclusive fcntl lock (to guard against race condition in PIPE-based lock)
> - check PIPE for any existing readers
> - open PIPE for read
>
> All other backends:
> - get shared fcntl lock
> - open PIPE for read

Hmm.  This seems like it might almost work.  But I don't see why the
other backends need to care about fcntl() at all.  How about this
locking protocol:

Postmaster:
1. Acquire an exclusive lock on some file in the data directory, maybe
the control file, using fcntl().
2. Open the named pipe for read.
3. Open the named pipe for write.
4. Close the named pipe for read.
5. Install a signal handler for SIGPIPE which sets a global variable.
6. Try to write to the pipe.
7. Check that the variable is set; if not, FATAL.
8. Revert SIGPIPE handler.
9. Close the named pipe for write.
10. Open the named pipe for read.
11. Release the fcntl() lock acquired in step 1.

Regular backends don't need to do anything special, except that they
need to make sure that the file descriptor opened in step 10 gets
inherited by the right set of processes.  That means that the
close-on-exec flag should be turned on in the postmaster; except in
EXEC_BACKEND builds, where it should be turned off but then turned on
again by child processes before they do anything that might fork.

It's impossible for two postmasters to start up at the same time
because the fcntl() lock acquired at step 1 will block any
newly-arriving postmaster until step 11 is complete.  The
first-to-close semantics of fcntl() aren't a problem for this purpose
because we only execute a very limited amount of code over which we
have full control while holding the lock.  By the time the postmaster
that gets the lock first completes step 10, any later-arriving
postmaster is guaranteed to fall out at step 7 while that postmaster
or any children who inherit the pipe descriptor remain alive.  No
process holds any resource that will survive its exit, so cleanup is
fully automatic.

This seems solid to me, but watch somebody find a problem with it...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Tom Lane

Date:

August 15, 2014, 16:02:22

Robert Haas <robertmhaas@gmail.com> writes:
> How about this locking protocol:

> Postmaster:
> 1. Acquire an exclusive lock on some file in the data directory, maybe
> the control file, using fcntl().
> 2. Open the named pipe for read.
> 3. Open the named pipe for write.
> 4. Close the named pipe for read.
> 5. Install a signal handler for SIGPIPE which sets a global variable.
> 6. Try to write to the pipe.
> 7. Check that the variable is set; if not, FATAL.
> 8. Revert SIGPIPE handler.
> 9. Close the named pipe for write.
> 10. Open the named pipe for read.
> 11. Release the fcntl() lock acquired in step 1.

Hm, this seems like it would work.  A couple other thoughts:

* I think 5..8 are overly complex: we can just set SIGPIPE to SIG_IGN
(which is its usual setting in the postmaster already) and check for
EPIPE from the write().

* There might be some benefit to swapping steps 9 and 10; at the
very least, this would eliminate the need to use O_NONBLOCK while
re-opening for read.

* We talked about combining this technique with a plain file lock
so that we would have belt-and-suspenders protection, in particular
something that would have a chance of working across NFS clients.
This would suggest leaving the fcntl lock in place, ie, don't do
step 11, and also that the file-to-be-locked *not* have any other
purpose (which would only increase the risk of losing the lock
through careless open/close).
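
The first two points can be sketched as below, assuming a POSIX system. The function name fifo_probe_readers and its keep_fd out-parameter are hypothetical. SIGPIPE is ignored so that write() reports EPIPE directly, and the FIFO is re-opened for read before the write end is closed, so the second read-side open needs no O_NONBLOCK.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 0 if nobody else has the FIFO open for read; in that case *keep_fd
 * receives a read descriptor for the caller to hold.
 * Returns 1 if a reader exists (or may exist), -1 on error.
 */
int
fifo_probe_readers(const char *fifo_path, int *keep_fd)
{
    char    byte = 0;
    ssize_t n;
    int     rd, wr, saved_errno;

    /* With SIGPIPE ignored, a write with no readers fails with EPIPE. */
    signal(SIGPIPE, SIG_IGN);

    if (mkfifo(fifo_path, 0600) < 0 && errno != EEXIST)
        return -1;

    /* Be a reader for a moment so that the write-side open succeeds. */
    rd = open(fifo_path, O_RDONLY | O_NONBLOCK);
    if (rd < 0)
        return -1;
    wr = open(fifo_path, O_WRONLY | O_NONBLOCK);
    close(rd);
    if (wr < 0)
        return -1;

    n = write(wr, &byte, 1);
    saved_errno = errno;

    if (n < 0 && saved_errno == EPIPE)
    {
        /*
         * No reader anywhere.  Open for read first (step 10), then close the
         * write end (step 9); the open cannot block because we are a writer.
         */
        *keep_fd = open(fifo_path, O_RDONLY);
        close(wr);
        return (*keep_fd < 0) ? -1 : 0;
    }

    /* The write succeeded, or failed some other way: assume a reader. */
    close(wr);
    return 1;
}
```

Treating every outcome other than EPIPE as "a reader may exist" is a conservative choice made for this sketch; it refuses to start rather than risk a false "all clear".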

> Regular backends don't need to do anything special, except that they
> need to make sure that the file descriptor opened in step 10 gets
> inherited by the right set of processes.  That means that the
> close-on-exec flag should be turned on in the postmaster; except in
> EXEC_BACKEND builds, where it should be turned off but then turned on
> again by child processes before they do anything that might fork.

Meh.  Do we really want to allow a new postmaster to start if there
are any processes remaining that were launched by backends?  I'd
be inclined to just suppress close-on-exec, period.
        regards, tom lane

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Robert Haas

Date:

August 15, 2014, 18:16:15

On Fri, Aug 15, 2014 at 12:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> * I think 5..8 are overly complex: we can just set SIGPIPE to SIG_IGN
> (which is its usual setting in the postmaster already) and check for
> EPIPE from the write().

wfm.

> * There might be some benefit to swapping steps 9 and 10; at the
> very least, this would eliminate the need to use O_NONBLOCK while
> re-opening for read.

Also wfm.

> * We talked about combining this technique with a plain file lock
> so that we would have belt-and-suspenders protection, in particular
> something that would have a chance of working across NFS clients.
> This would suggest leaving the fcntl lock in place, ie, don't do
> step 11, and also that the file-to-be-locked *not* have any other
> purpose (which would only increase the risk of losing the lock
> through careless open/close).

I'd be afraid that a secondary mechanism that mostly-but-not-really
works could do more harm by allowing us to miss bugs in the primary,
pipe-based locking mechanism than the good it would accomplish.

>> Regular backends don't need to do anything special, except that they
>> need to make sure that the file descriptor opened in step 10 gets
>> inherited by the right set of processes.  That means that the
>> close-on-exec flag should be turned on in the postmaster; except in
>> EXEC_BACKEND builds, where it should be turned off but then turned on
>> again by child processes before they do anything that might fork.
>
> Meh.  Do we really want to allow a new postmaster to start if there
> are any processes remaining that were launched by backends?  I'd
> be inclined to just suppress close-on-exec, period.

Seems like a pretty weird and artificial restriction.  Anything that
has done exec() will not be connected to shared memory, so it really
doesn't matter whether it's still alive or not.  People can and do
write extensions that launch processes from PostgreSQL backends via
fork()+exec(), and we've taken pains in the past not to break such
cases.  I don't see a reason to impose now (for no
data-integrity-related reason) the rule that any such processes must
not be daemons.
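
For reference, toggling the close-on-exec flag under discussion is a two-call fcntl() idiom. The helper name set_close_on_exec is hypothetical; this is a sketch of the mechanism only, not of where PostgreSQL would call it.

```c
#include <fcntl.h>

/*
 * Turn FD_CLOEXEC on (on != 0) or off (on == 0) for fd, preserving any
 * other descriptor flags.  Returns 0 on success, -1 on error.
 */
int
set_close_on_exec(int fd, int on)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    if (on)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;
    return (fcntl(fd, F_SETFD, flags) < 0) ? -1 : 0;
}
```

With the flag on, a program started through fork()+exec() does not inherit the FIFO descriptor, so a long-lived daemon launched by an extension would not keep a new postmaster from starting.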

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Noah Misch

Date:

August 16, 2014, 07:28:59

Nice algorithm.

On Fri, Aug 15, 2014 at 02:16:08PM -0400, Robert Haas wrote:
> On Fri, Aug 15, 2014 at 12:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > * We talked about combining this technique with a plain file lock
> > so that we would have belt-and-suspenders protection, in particular
> > something that would have a chance of working across NFS clients.
> > This would suggest leaving the fcntl lock in place, ie, don't do
> > step 11, and also that the file-to-be-locked *not* have any other
> > purpose (which would only increase the risk of losing the lock
> > through careless open/close).
> 
> I'd be afraid that a secondary mechanism that mostly-but-not-really
> works could do more harm by allowing us to miss bugs in the primary,
> pipe-based locking mechanism than the good it would accomplish.

Users do corrupt their NFS- and GFS2-hosted databases today.  I would rather
have each process hold only an fcntl() lock than hold only the FIFO file
descriptor.  There's no such dichotomy, so let's have both.

> > Meh.  Do we really want to allow a new postmaster to start if there
> > are any processes remaining that were launched by backends?  I'd
> > be inclined to just suppress close-on-exec, period.
> 
> Seems like a pretty weird and artificial restriction.  Anything that
> has done exec() will not be connected to shared memory, so it really
> doesn't matter whether it's still alive or not.  People can and do
> write extensions that launch processes from PostgreSQL backends via
> fork()+exec(), and we've taken pains in the past not to break such
> cases.  I don't see a reason to impose now (for no
> data-integrity-related reason) the rule that any such processes must
> not be daemons.

+1

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Robert Haas

Date:

August 18, 2014, 13:01:33

On Sat, Aug 16, 2014 at 3:28 AM, Noah Misch <noah@leadboat.com> wrote:
> Nice algorithm.

Thanks.

>> I'd be afraid that a secondary mechanism that mostly-but-not-really
>> works could do more harm by allowing us to miss bugs in the primary,
>> pipe-based locking mechanism than the good it would accomplish.
>
> Users do corrupt their NFS- and GFS2-hosted databases today.  I would rather
> have each process hold only an fcntl() lock than hold only the FIFO file
> descriptor.  There's no such dichotomy, so let's have both.

Meh.  We can do that, but I think that will provide us with only the
it-works-until-it-doesn't level of protection.  Granted, that's more
than zero, but does anyone advocate wearing seatbelts for the first 60
minutes you're in the car and then taking them off after that?  I
think that with a sufficiently long-running server the chances of the
lock somehow getting released approach certainty.  But I'm not going
to fight this one tooth and nail.

A bigger question in my view is what to do with the existing
mechanism.  The main advantage of making a change like this is that we
could finally dispense with System V shared memory completely.  But we
risk encountering systems where the battle-tested System V mechanism
works and this new one either fails to work at all (server won't
start) or fails to work as desired (interlock broken).  So it's
tempting to think we should have a GUC or control-file setting to
control which mechanism gets used.  Of course for QNX, the actual
subject of this thread, System V won't be an option, but other people
might like a big red button they can push if the new code turns out to
be less than we're hoping.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Noah Misch

Date:

August 18, 2014, 14:00:00

On Mon, Aug 18, 2014 at 09:01:20AM -0400, Robert Haas wrote:
> On Sat, Aug 16, 2014 at 3:28 AM, Noah Misch <noah@leadboat.com> wrote:
> >> I'd be afraid that a secondary mechanism that mostly-but-not-really
> >> works could do more harm by allowing us to miss bugs in the primary,
> >> pipe-based locking mechanism than the good it would accomplish.
> >
> > Users do corrupt their NFS- and GFS2-hosted databases today.  I would rather
> > have each process hold only an fcntl() lock than hold only the FIFO file
> > descriptor.  There's no such dichotomy, so let's have both.
> 
> Meh.  We can do that, but I think that will provide us with only the
> it-works-until-it-doesn't level of protection.  Granted, that's more
> than zero, but does anyone advocate wearing seatbelts for the first 60
> minutes you're in the car and then taking them off after that?  I
> think that with a sufficiently long-running server the chances of the
> lock somehow getting released approach certainty.  But I'm not going
> to fight this one tooth and nail.

In case it wasn't clear, I advocate both using the FIFO defense and holding
fcntl locks throughout the life of every PostgreSQL process having a shared
memory attachment.  I grant that this raises the chance of a shortcoming in
one mechanism remaining undiscovered.  However, we already know that each by
itself has limitations.  I don't like the prospect of accepting a known hole
to help discover unknown holes.

We could have the would-be new postmaster, when it hits a fcntl lock conflict,
proceed with the FIFO check anyway.  If the FIFO check says "go" after the
fcntl check said "stop", emit a message about the apparent bug.  (That's
oversimplified; it needs looping to account for the case of the old postmaster
exiting concurrently.)
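
The cross-check can be written down as a small decision function. This sketch makes assumptions: the enum and function names are hypothetical, a start is assumed to be refused whenever either check objects, and the retry loop for a concurrently exiting old postmaster is left out.

```c
#include <stdbool.h>

typedef enum
{
    INTERLOCK_GO,               /* neither check objects: safe to start */
    INTERLOCK_STOP,             /* the FIFO still has a reader */
    INTERLOCK_STOP_FIFO_MISSED  /* only fcntl() objected: report apparent bug */
} InterlockVerdict;

/*
 * Combine the two independent checks.  fcntl_conflict is true when the
 * lock attempt hit a conflicting lock; fifo_has_reader is true when the
 * FIFO probe found another reader.
 */
InterlockVerdict
interlock_verdict(bool fcntl_conflict, bool fifo_has_reader)
{
    if (fifo_has_reader)
        return INTERLOCK_STOP;
    if (fcntl_conflict)
        return INTERLOCK_STOP_FIFO_MISSED;
    return INTERLOCK_GO;
}
```

The third verdict is the case singled out above: the FIFO check says "go" after the fcntl check said "stop", which is where a message about the apparent bug would be emitted.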

> A bigger question in my view is what to do with the existing
> mechanism.  The main advantage of making a change like this is that we
> could finally dispense with System V shared memory completely.  But we
> risk encountering systems where the battle-tested System V mechanism
> works and this new one either fails to work at all (server won't
> start) or fails to work as desired (interlock broken).  So it's
> tempting to think we should have a GUC or control-file setting to
> control which mechanism gets used.  Of course for QNX, the actual
> subject of this thread, System V won't be an option, but other people
> might like a big red button they can push if the new code turns out to
> be less than we're hoping.

A GUC sounds fine to me, as would using the sysv interlock unconditionally for
a couple more releases before removing it.

Thanks,
nm

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

August 18, 2014, 15:03:11

Robert, Tom, and others,

Glad to see good discussion and progress on the locking topic!

My proof of concept code (steps a through e below) avoided any reading or writing to the pipe (and associated handling of SIGPIPE), it just relied on postmaster open of PIPE with ENXIO to indicate all is clear.

Trying to keep things simple, I created 1 function for fcntl locks, 1 function for PIPE locks, and a wrapper that called both in sequence (wrapper is called by the Backend mains).

I agree that "d." could be omitted, but I thought it better to be conservative and have all processes obtain fcntl and PIPE locks.
Is there a gap that a-e does not cover? (Sorry, not clear to me).
   Postmaster:
   a. get exclusive fcntl lock (to guard against race condition in PIPE-based lock)
   b. check PIPE for any existing readers
   +    fd_write = open(DIRECTORY_LOCK_PIPE, O_WRONLY | O_NONBLOCK);
   +    if (!((fd_write < 0) && (errno == ENXIO))) ereport(FATAL,
   +    if (fd_write > -1) close(fd_write);
   c. open PIPE for read
   +    fd_read = open(DIRECTORY_LOCK_PIPE, O_RDONLY | O_NONBLOCK);

   All other backends:
   d. get shared fcntl lock
   e. open PIPE for read
   +    fd_read = open(DIRECTORY_LOCK_PIPE, O_RDONLY | O_NONBLOCK);
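
Step b relies on the POSIX rule that a non-blocking, write-only open of a FIFO fails with ENXIO when no process has it open for reading. A standalone sketch of that probe follows; the function name fifo_has_reader is hypothetical and this is not the code from the attached patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if some process has the FIFO open for read, 0 if none does
 * (the open failed with ENXIO), and -1 on any other error.
 * No data is ever written to the FIFO.
 */
int
fifo_has_reader(const char *fifo_path)
{
    int fd = open(fifo_path, O_WRONLY | O_NONBLOCK);

    if (fd >= 0)
    {
        close(fd);
        return 1;
    }
    return (errno == ENXIO) ? 0 : -1;
}
```

Unlike the write-and-check-EPIPE variant, this probe leaves nothing in the pipe and does not involve SIGPIPE at all.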

Just my 2 cents, I am happy with whatever solution you find agreeable.

My assumptions:
1. Platforms without System V shared memory (QNX) would use POSIX shared memory and file-based (fcntl+pipe) locks.
2. Existing platforms would continue to rely on System V shared memory and its proven locking by default (perhaps with an option to use all POSIX shared memory and file-based locks instead, at your discretion).

Robert, assuming an algorithm choice is agreed upon in the near future, would you be the logical choice to implement the change?

I am happy to help, especially with any QNX-specific aspects, but don't want to step on anyone's toes.

Thanks.

Keith Baker


Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Robert Haas

Date:

August 20, 2014, 16:25:39

On Mon, Aug 18, 2014 at 11:02 AM, Baker, Keith [OCDUS Non-J&J]
<KBaker9@its.jnj.com> wrote:
> My proof of concept code (steps a through e below) avoided any reading or writing to the pipe (and associated handling of SIGPIPE), it just relied on postmaster open of PIPE with ENXIO to indicate all is clear.

I'm not following.

> Robert, Assuming an algorithm choice is agreed upon in the near future, would you be the logical choice to implement the change?
> I am happy to help, especially with any QNX-specific aspects, but don't want to step on anyone's toes.

I'm unlikely to have time to work on this in the immediate future, but
I may be able to help review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

August 20, 2014, 19:10:15

Robert and Tom,

Sorry for any confusion, I will try to clarify.

Here is progression of events as I recall them:
- My initial QNX 6.5 port proposal lacked a robust replacement for the existing System V shared memory locking mechanism, a show stopper.
- Robert proposed a nice set of possible alternatives for locking (to enable an all POSIX shared memory solution for future platforms).
- Tom and Robert seemed to agree that a combination of file-based locking plus pipe-based locking should be sufficiently robust on platforms without Sys V shared memory (e.g., QNX).
- I coded a proof-of-concept patch (fcntl + PIPE) which appeared to work on QNX (steps a through e).
- Robert countered with an 11 step algorithm (all in the postmaster).
- Tom suggested elimination of steps 5, 6, 7, 8, and 11 (and swapping the order of 9 and 10).

I was just taking a step back to ask what gaps existed in the proof-of-concept patch (steps a through e).
Is there a scenario it fails to cover, prompting the seemingly more complex 11 step algorithm (which added writing data to the pipe and handling of SIGPIPE)?

I am willing to attempt coding of the set of changes for a QNX port (option for new locking and all POSIX shared memory, plus a few minor QNX-specific tweaks), provided you and Tom are satisfied that the show stoppers have been sufficiently addressed.

Please let me know if more discussion is required, or if it would be reasonable for me (or someone else of your choosing) to work on the coding effort (perhaps targeted for 9.5?)
If on the other hand it has been decided that a QNX port is not in the cards, I would like to know (I hope that is not the case given the progress made, but no point in wasting anyone's time).

Thanks again for your time, effort, patience, and coaching.

Keith Baker



Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Alvaro Herrera

Date:

August 20, 2014, 20:15:59

Baker, Keith [OCDUS Non-J&J] wrote:

> Please let me know if more discussion is required, or if it would be
> reasonable for me (or someone else of your choosing) to work on the
> coding effort (perhaps targeted for 9.5?)
> If on the other hand it has been decided that a QNX port is not in the
> cards, I would like to know (I hope that is not the case given the
> progress made, but no point in wasting anyone's time).

As I recall, other than the postmaster startup interlock, the other
major missing item you mentioned is SA_RESTART.  That could well turn
out to be a showstopper, so I suggest you study that in more depth.

Are there other major items missing?  Did you have to use
configure --disable-spinlocks for instance?

What's your compiler, and what are the underlying hardware platforms you
want to support?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

August 20, 2014, 21:21:48

Alvaro,

Thanks for your interest and questions.
At this point I have created a proof-of-concept QNX 6.5 port which appears to work on the surface (passes regression tests), but needs to be deemed "production-quality".

To work around the lack of SA_RESTART, I added QNX-specific retry macros to port.h.
With these macros in place "make check" runs cleanly (fails in many places without them).
   +#if defined(__QNX__)
   +/* QNX does not support sigaction SA_RESTART. We must retry interrupted calls (EINTR) */
   +/* Helper macros, used to build our retry macros */
   +#define PG_RETRY_EINTR3(exp,val,type) ({ type _tmp_rc; do _tmp_rc = (exp); while (_tmp_rc == (val) && errno == EINTR); _tmp_rc; })
   +#define PG_RETRY_EINTR(exp) PG_RETRY_EINTR3(exp,-1L,long int)
   +#define PG_RETRY_EINTR_FILE(exp) PG_RETRY_EINTR3(exp,NULL,FILE *)
   +/* override calls known to return EINTR when interrupted */
   +#define close(a) PG_RETRY_EINTR(close(a))
   +#define fclose(a) PG_RETRY_EINTR(fclose(a))
   +#define fdopen(a,b) PG_RETRY_EINTR_FILE(fdopen(a,b))
   +#define fopen(a,b) PG_RETRY_EINTR_FILE(fopen(a,b))
   +#define freopen(a,b,c) PG_RETRY_EINTR_FILE(freopen(a,b,c))
   +#define fseek(a,b,c) PG_RETRY_EINTR(fseek(a,b,c))
   +#define fseeko(a,b,c) PG_RETRY_EINTR(fseeko(a,b,c))
   +#define ftruncate(a,b) PG_RETRY_EINTR(ftruncate(a,b))
   +#define lseek(a,b,c) PG_RETRY_EINTR(lseek(a,b,c))
   +#define open(a,b,...) ({ int _tmp_rc; do _tmp_rc = open(a,b,##__VA_ARGS__); while (_tmp_rc == (-1) && errno == EINTR); _tmp_rc; })
   +#define shm_open(a,b,c) PG_RETRY_EINTR(shm_open(a,b,c))
   +#define stat(a,b) PG_RETRY_EINTR(stat(a,b))
   +#define unlink(a) PG_RETRY_EINTR(unlink(a))
   ... (Macros for read and write are similar but slightly longer, so I omit them here)...
   +#endif   /* __QNX__ */
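
The read and write macros are omitted above, so what follows is not the patch's code. It is a sketch of the same retry idea for read(), written as an ordinary function; the name read_retry_eintr is hypothetical. A function avoids the `({ ... })` statement expressions used in the macros, which are a GNU C extension.

```c
#include <errno.h>
#include <unistd.h>

/*
 * read() that starts over when a signal interrupts it before any data has
 * been transferred.  Any other result, including a short read, is returned
 * to the caller unchanged.
 */
ssize_t
read_retry_eintr(int fd, void *buf, size_t len)
{
    ssize_t n;

    do
    {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    return n;
}
```

The same loop shape applies to write(). Whether it is safe for close() depends on the platform, since on some systems the descriptor is already released when close() reports EINTR.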

Here is what I used for configure, I am open to suggestions:
   ./configure --without-readline --disable-thread-safety

I am targeting QNX 6.5 on x86, using gcc 4.4.2.

Also, I have an issue to work out for locale support, but expect I can solve that.

Keith Baker

> -----Original Message-----
> From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
> Sent: Wednesday, August 20, 2014 4:16 PM
> To: Baker, Keith [OCDUS Non-J&J]
> Cc: Robert Haas; Tom Lane; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
>
> Baker, Keith [OCDUS Non-J&J] wrote:
>
> > Please let me know if more discussion is required, or if it would be
> > reasonable for me (or someone else of your choosing) to work on the
> > coding effort (perhaps targeted for 9.5?) If on the other hand it has
> > been decided that a QNX port is not in the cards, I would like to know
> > (I hope that is not the case given the progress made, but no point in
> > wasting anyone's time).
>
> As I recall, other than the postmaster startup interlock, the other major
> missing item you mentioned is SA_RESTART.  That could well turn out to be a
> showstopper, so I suggest you study that in more depth.
>
> Are there other major items missing?  Did you have to use configure --
> disable-spinlocks for instance?
>
> What's your compiler, and what are the underlying hardware platforms you
> want to support?
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
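
To make the mechanics of the retry macros in the message above concrete, here is a minimal, self-contained sketch. It copies the two helper macros from the posted patch and applies them to `flaky_call`, a hypothetical stand-in (not part of the patch, PostgreSQL, or QNX) that fails with EINTR twice before succeeding. The `({ ... })` statement-expression syntax is a GCC extension, so this assumes gcc or a compatible compiler.

```c
#include <errno.h>

/* Helper macros as posted in the patch (GCC statement expressions). */
#define PG_RETRY_EINTR3(exp,val,type) \
	({ type _tmp_rc; do _tmp_rc = (exp); while (_tmp_rc == (val) && errno == EINTR); _tmp_rc; })
#define PG_RETRY_EINTR(exp) PG_RETRY_EINTR3(exp,-1L,long int)

/* Hypothetical stand-in for an interruptible call: fails twice, then succeeds. */
static int	flaky_attempts = 0;

static long
flaky_call(void)
{
	if (++flaky_attempts < 3)
	{
		errno = EINTR;			/* pretend a signal interrupted the call */
		return -1;
	}
	return 42;
}

/* Call site: the macro hides the retries from the caller. */
static long
call_with_retry(void)
{
	return PG_RETRY_EINTR(flaky_call());
}
```

Note that the loop re-evaluates the whole expression on every attempt, so an argument with side effects would run once per retry.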

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

August 20, 2014, 23:25:09

Hi,

On 2014-08-20 21:21:41 +0000, Baker, Keith [OCDUS Non-J&J] wrote:
> To work around lack of SA_RESTART, I added QNX-specific retry macros to port.h
> With these macros in place "make check" runs cleanly (fails in many place without them).
> 
>     +#if defined(__QNX__)
>     +/* QNX does not support sigaction SA_RESTART. We must retry interrupted calls (EINTR) */
>     +/* Helper macros, used to build our retry macros */
>     +#define PG_RETRY_EINTR3(exp,val,type) ({ type _tmp_rc; do _tmp_rc = (exp); while (_tmp_rc == (val) && errno == EINTR); _tmp_rc; })
>     +#define PG_RETRY_EINTR(exp) PG_RETRY_EINTR3(exp,-1L,long int)
>     +#define PG_RETRY_EINTR_FILE(exp) PG_RETRY_EINTR3(exp,NULL,FILE *)
>     +/* override calls known to return EINTR when interrupted */
>     +#define close(a) PG_RETRY_EINTR(close(a))
>     +#define fclose(a) PG_RETRY_EINTR(fclose(a))
>     +#define fdopen(a,b) PG_RETRY_EINTR_FILE(fdopen(a,b))
>     +#define fopen(a,b) PG_RETRY_EINTR_FILE(fopen(a,b))
>     +#define freopen(a,b,c) PG_RETRY_EINTR_FILE(freopen(a,b,c))
>     +#define fseek(a,b,c) PG_RETRY_EINTR(fseek(a,b,c))
>     +#define fseeko(a,b,c) PG_RETRY_EINTR(fseeko(a,b,c))
>     +#define ftruncate(a,b) PG_RETRY_EINTR(ftruncate(a,b))
>     +#define lseek(a,b,c) PG_RETRY_EINTR(lseek(a,b,c))
>     +#define open(a,b,...) ({ int _tmp_rc; do _tmp_rc = open(a,b,##__VA_ARGS__); while (_tmp_rc == (-1) && errno == EINTR); _tmp_rc; })
>     +#define shm_open(a,b,c) PG_RETRY_EINTR(shm_open(a,b,c))
>     +#define stat(a,b) PG_RETRY_EINTR(stat(a,b))
>     +#define unlink(a) PG_RETRY_EINTR(unlink(a))
>     ... (Macros for read and write are similar but slightly longer, so I omit them here)...
>     +#endif    /* __QNX__ */

I think this is a horrible way to go and unlikely to succeed. You're
surely going to miss calls and it's going to need to be maintained
continuously. We'll miss adding things which will then only break under
load. Which most people won't be able to generate under QNX.

The only reasonable way to fake kernel SA_RESTART support is to do so
in $platform's libc, in the syscall wrapper.

> Here is what I used for configure, I am open to suggestions:
>     ./configure --without-readline --disable-thread-safety

Why is the --disable-thread-safety needed?

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
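
For comparison with the macro approach, the retry loop can also live in one function per call, which is roughly the shape a libc-level shim or separate wrapper library would take. The following is only a sketch for read() on a POSIX system; the name `read_restart` is hypothetical and does not come from QNX or PostgreSQL.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: behaves like read(), but retries when the call
 * fails with EINTR, i.e. it was interrupted by a signal before any data
 * was transferred.
 */
ssize_t
read_restart(int fd, void *buf, size_t count)
{
	ssize_t		rc;

	do
	{
		rc = read(fd, buf, count);
	} while (rc == -1 && errno == EINTR);

	return rc;
}
```

Unlike kernel-level SA_RESTART, which is requested per signal via sigaction(), this sketch retries no matter which handler interrupted the call; a faithful emulation would have to take the interrupting handler's flags into account.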

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

August 20, 2014, 23:33:46

On 2014-07-25 18:29:53 -0400, Tom Lane wrote:
> > *         QNX lacks sigaction SA_RESTART: I modified "src/include/port.h" to define macros to retry system calls upon EINTR (open,read,write,...) when compiled on QNX
> 
> That's pretty scary too.  For one thing, such macros would affect every
> call site whether it's running with SA_RESTART or not.  Do you really
> need it?  It looks to me like we just turn off HAVE_POSIX_SIGNALS if
> you don't have SA_RESTART.  Maybe that code has bit-rotted by now, but
> it did work at one time.

I have pretty much no trust that we're maintaining
!HAVE_POSIX_SIGNALS. And none that we have the capability of doing so. I
seriously doubt there are any !HAVE_POSIX_SIGNALS buildfarm animals, and
873ab97219caabeb2f7b390268a4fe01e2b7518c makes it pretty darn unlikely
that we have much chance of finding such mistakes during development.

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

August 21, 2014, 15:25:52

Hello Andres,

Thanks for your response.

About SA_RESTART:
------------------------
I would like to offer you a different perspective which may alter your current opinion.
I believe the port.h QNX macro replacement for SA_RESTART is still a reasonable solution on QNX for these reasons:

First, I think it is better to adapt PostgreSQL to suit the platform than to adapt the platform to suit PostgreSQL.
Changing default behavior of libc on QNX to suit PostgreSQL may break other applications which rely on the current
behavior of libc.

Yes, I could forget to add a port.h macro for a given interruptible primitive, but I could likewise forget to update
the wrapper for that call in a custom libc.
I requested that QNX support provide me a list of interruptible primitives, but I was able to identify many by
searching through the QNX help.
Definition of a new interruptible primitive is a rare event, so once a solid list of macros is in place for QNX, it
should need very little maintenance.
If you have any specific calls you believe are missing from my list of macros, I would be happy to add them.

port.h is included in c.h, which is in postgres.h, so the QNX macros should be effective for all QNX PostgreSQL
compiles.
If it were not, no one could rely on any port.h features on any platform.

Testing so far has demonstrated that the macro fixes are effective on QNX.  Repeated runs of the regression tests run
cleanly.
More testing will be required to boost the confidence and expose any gaps, but the foundation appears to be solid.

The first release on any platform has risk of defects, which can be corrected once identified.
I would expect that a first release on any platform would include a warning or disclaimer stating that it is a new port.

Lastly, the QNX-specific section added to port.h appears to solve the SA_RESTART issue for QNX, while having no impact
on compiles of existing platforms.

About configure:
--------------------
"./configure" barked at 2 things on QNX, and it advised using both "--without-readline --disable-thread-safety".
I can investigate further, but I have been focusing on the bigger issues first.

I hope the explanations above address your main concerns.
Again, thanks for your response!

Keith Baker

> -----Original Message-----
> From: Andres Freund [mailto:andres@2ndquadrant.com]
> Sent: Wednesday, August 20, 2014 7:25 PM
> To: Baker, Keith [OCDUS Non-J&J]
> Cc: Alvaro Herrera; Robert Haas; Tom Lane; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
>
> Hi,
>
> On 2014-08-20 21:21:41 +0000, Baker, Keith [OCDUS Non-J&J] wrote:
> > To work around lack of SA_RESTART, I added QNX-specific retry macros
> > to port.h With these macros in place "make check" runs cleanly (fails in
> many place without them).
> >
> >     +#if defined(__QNX__)
> >     +/* QNX does not support sigaction SA_RESTART. We must retry
> interrupted calls (EINTR) */
> >     +/* Helper macros, used to build our retry macros */
> >     +#define PG_RETRY_EINTR3(exp,val,type) ({ type _tmp_rc; do _tmp_rc =
> (exp); while (_tmp_rc == (val) && errno == EINTR); _tmp_rc; })
> >     +#define PG_RETRY_EINTR(exp) PG_RETRY_EINTR3(exp,-1L,long int)
> >     +#define PG_RETRY_EINTR_FILE(exp) PG_RETRY_EINTR3(exp,NULL,FILE
> *)
> >     +/* override calls known to return EINTR when interrupted */
> >     +#define close(a) PG_RETRY_EINTR(close(a))
> >     +#define fclose(a) PG_RETRY_EINTR(fclose(a))
> >     +#define fdopen(a,b) PG_RETRY_EINTR_FILE(fdopen(a,b))
> >     +#define fopen(a,b) PG_RETRY_EINTR_FILE(fopen(a,b))
> >     +#define freopen(a,b,c) PG_RETRY_EINTR_FILE(freopen(a,b,c))
> >     +#define fseek(a,b,c) PG_RETRY_EINTR(fseek(a,b,c))
> >     +#define fseeko(a,b,c) PG_RETRY_EINTR(fseeko(a,b,c))
> >     +#define ftruncate(a,b) PG_RETRY_EINTR(ftruncate(a,b))
> >     +#define lseek(a,b,c) PG_RETRY_EINTR(lseek(a,b,c))
> >     +#define open(a,b,...) ({ int _tmp_rc; do _tmp_rc =
> open(a,b,##__VA_ARGS__); while (_tmp_rc == (-1) && errno == EINTR);
> _tmp_rc; })
> >     +#define shm_open(a,b,c) PG_RETRY_EINTR(shm_open(a,b,c))
> >     +#define stat(a,b) PG_RETRY_EINTR(stat(a,b))
> >     +#define unlink(a) PG_RETRY_EINTR(unlink(a))
> >     ... (Macros for read and write are similar but slightly longer, so I omit
> them here)...
> >     +#endif    /* __QNX__ */
>
> I think this is a horrible way to go and unlikely to succeed. You're surely going
> to miss calls and it's going to need to be maintained continuously. We'll miss
> adding things which will then only break under load. Which most poeple
> won't be able to generate under qnx.
>
> The only reasonably way to fake kernel SA_RESTART support is doing so is in
> $platform's libc. In the syscall wrapper.
>
> > Here is what I used for configure, I am open to suggestions:
> >     ./configure --without-readline --disable-thread-safety
>
> Why is the --disable-thread-safety needed?
>
> Greetings,
>
> Andres Freund
>
> --
>  Andres Freund                       http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Alvaro Herrera

Date:

August 21, 2014, 17:29:04

Baker, Keith [OCDUS Non-J&J] wrote:

> About configure:
> --------------------
> "./configure" barked at 2 things on QNX, and it advised using both "--without-readline --disable-thread-safety".
> I can investigate further, but I have been focusing on the bigger issues first.

I don't think thread-safety is of great concern.  The backend is not
multithreaded, and neither are the utilities (I think the only exception
is pgbench, and even there it is optional).  The only problem, as I
recall, would be that libpq would not lock things correctly when used in
a multithreaded program.  I think you will need to solve this
eventually, but it doesn't look as critical as the others.

I was asking specifically about spinlocks because if you have to use
that switch, it means our spinlock implementation doesn't cover your
platform, and you would need to add something to support native
spinlocks.  Since you're using gcc on x86, I assume your port is
choosing an already existing, working implementation.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
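
As background for the spinlock question above, a test-and-set spinlock can be sketched with GCC's `__sync` builtins. This is an illustration only and is not PostgreSQL's actual spinlock code; the names `demo_slock_t`, `demo_tas`, `demo_lock`, and `demo_unlock` are hypothetical, and the sketch assumes gcc on a platform where these builtins are supported.

```c
/* Hypothetical lock type: 0 means free, nonzero means held. */
typedef volatile int demo_slock_t;

/* Atomically set the lock word; returns 0 if we acquired the lock. */
static int
demo_tas(demo_slock_t *lock)
{
	return __sync_lock_test_and_set(lock, 1);
}

/* Release the lock (writes 0 with release semantics). */
static void
demo_unlock(demo_slock_t *lock)
{
	__sync_lock_release(lock);
}

/* Busy-wait until the lock is acquired. */
static void
demo_lock(demo_slock_t *lock)
{
	while (demo_tas(lock))
		;						/* spin */
}
```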

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Noah Misch

Date:

August 22, 2014, 05:36:47

On Thu, Aug 21, 2014 at 01:33:38AM +0200, Andres Freund wrote:
> On 2014-07-25 18:29:53 -0400, Tom Lane wrote:
> > > *         QNX lacks sigaction SA_RESTART: I modified "src/include/port.h" to define macros to retry system calls upon EINTR (open,read,write,...) when compiled on QNX
> > 
> > That's pretty scary too.  For one thing, such macros would affect every
> > call site whether it's running with SA_RESTART or not.  Do you really
> > need it?  It looks to me like we just turn off HAVE_POSIX_SIGNALS if
> > you don't have SA_RESTART.  Maybe that code has bit-rotted by now, but
> > it did work at one time.
> 
> I have pretty much no trust that we're maintaining
> !HAVE_POSIX_SIGNAL. And none that we have that capability of doing so. I
> seriously doubt there's any !HAVE_POSIX_SIGNAL animals and
> 873ab97219caabeb2f7b390268a4fe01e2b7518c makes it pretty darn unlikely
> that we have much chance of finding such mistakes during development.

I bet it's fine for its intended target, namely BSD-style signal() in which
SA_RESTART-like behavior is implicit.  See the src/port/pqsignal.c header
comment.  PostgreSQL has no support for V7-style/QNX-style signal().
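
For readers unfamiliar with the flag under discussion, the following sketch shows what requesting restart semantics through sigaction() looks like on a POSIX platform that provides SA_RESTART. It is not a copy of src/port/pqsignal.c; `demo_signal_restart`, `demo_sigfunc`, and `demo_noop_handler` are hypothetical names used only for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>

typedef void (*demo_sigfunc) (int);

/*
 * Hypothetical helper: install func as the handler for signo and ask the
 * kernel to restart interrupted system calls instead of failing with EINTR.
 * Returns the previous handler, or SIG_ERR on failure.
 */
demo_sigfunc
demo_signal_restart(int signo, demo_sigfunc func)
{
	struct sigaction act;
	struct sigaction oact;

	act.sa_handler = func;
	sigemptyset(&act.sa_mask);
	act.sa_flags = SA_RESTART;

	if (sigaction(signo, &act, &oact) < 0)
		return SIG_ERR;
	return oact.sa_handler;
}

/* Trivial handler used to exercise the helper. */
static void
demo_noop_handler(int signo)
{
	(void) signo;
}
```

On a platform without SA_RESTART, as the thread reports for QNX 6.5, this request is not available, which is why interrupted calls surface as EINTR there.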

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

August 22, 2014, 07:34:53

On 2014-08-22 01:36:37 -0400, Noah Misch wrote:
> On Thu, Aug 21, 2014 at 01:33:38AM +0200, Andres Freund wrote:
> > On 2014-07-25 18:29:53 -0400, Tom Lane wrote:
> > > > *         QNX lacks sigaction SA_RESTART: I modified "src/include/port.h" to define macros to retry system calls upon EINTR (open,read,write,...) when compiled on QNX
> > > 
> > > That's pretty scary too.  For one thing, such macros would affect every
> > > call site whether it's running with SA_RESTART or not.  Do you really
> > > need it?  It looks to me like we just turn off HAVE_POSIX_SIGNALS if
> > > you don't have SA_RESTART.  Maybe that code has bit-rotted by now, but
> > > it did work at one time.
> > 
> > I have pretty much no trust that we're maintaining
> > !HAVE_POSIX_SIGNAL. And none that we have that capability of doing so. I
> > seriously doubt there's any !HAVE_POSIX_SIGNAL animals and
> > 873ab97219caabeb2f7b390268a4fe01e2b7518c makes it pretty darn unlikely
> > that we have much chance of finding such mistakes during development.
> 
> I bet it's fine for its intended target, namely BSD-style signal() in which
> SA_RESTART-like behavior is implicit.  See the src/port/pqsignal.c header
> comment.  PostgreSQL has no support for V7-style/QNX-style signal().

That might be true - although I'm not sure it actually still works - but
my point is that I can't see Tom's suggestion of relying on
!HAVE_POSIX_SIGNALS working out for QNX.

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Noah Misch

Date:

August 22, 2014, 13:45:07

On Fri, Aug 22, 2014 at 09:34:42AM +0200, Andres Freund wrote:
> On 2014-08-22 01:36:37 -0400, Noah Misch wrote:
> > On Thu, Aug 21, 2014 at 01:33:38AM +0200, Andres Freund wrote:
> > > On 2014-07-25 18:29:53 -0400, Tom Lane wrote:
> > > > > *         QNX lacks sigaction SA_RESTART: I modified "src/include/port.h" to define macros to retry system calls upon EINTR (open,read,write,...) when compiled on QNX
> > > > 
> > > > That's pretty scary too.  For one thing, such macros would affect every
> > > > call site whether it's running with SA_RESTART or not.  Do you really
> > > > need it?  It looks to me like we just turn off HAVE_POSIX_SIGNALS if
> > > > you don't have SA_RESTART.  Maybe that code has bit-rotted by now, but
> > > > it did work at one time.
> > > 
> > > I have pretty much no trust that we're maintaining
> > > !HAVE_POSIX_SIGNAL. And none that we have that capability of doing so. I
> > > seriously doubt there's any !HAVE_POSIX_SIGNAL animals and
> > > 873ab97219caabeb2f7b390268a4fe01e2b7518c makes it pretty darn unlikely
> > > that we have much chance of finding such mistakes during development.
> > 
> > I bet it's fine for its intended target, namely BSD-style signal() in which
> > SA_RESTART-like behavior is implicit.  See the src/port/pqsignal.c header
> > comment.  PostgreSQL has no support for V7-style/QNX-style signal().
> 
> That might be true - although I'm not sure it actually still works - but
> my point is that I can't see Tom's suggestion on relying on
> !HAVE_POSIX_SIGNALS for QNX work out.

True.

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

August 22, 2014, 14:27:58

Hi,

On 2014-08-21 15:25:44 +0000, Baker, Keith [OCDUS Non-J&J] wrote:
> About SA_RESTART:
> ------------------------
> I would like to offer you a different perspective which may alter your current opinion.
> I believe the port.h QNX macro replacement for SA_RESTART is still a reasonable solution on QNX for these reasons:
> 
> First, I think it is better to adapt PostgreSQL to suit the platform
> than to adapt the platform to suit PostgreSQL.

Well. That might be somewhat true for a popular platform. Which QNX
really isn't. I personally don't believe your approach to be likely to
end up with a correct and maintainable port.

> Changing default behavior of libc on QNX to suit PostgreSQL may break
> other applications which rely on the current behavior of libc.

I don't see how *adding* SA_RESTART support which would only be used
when SA_RESTART is being passed to sigaction(), would do that.

> Yes, I could forget to add a port.h macro for a given interruptible
> primitive, but I could likewise forget to update the wrapper for that
> call in a custom libc.

> I requested that QNX support provide me a list of interruptible
> primitives, but I was able to identify many by searching through the
> QNX help.  Definition of a new interruptible primitive is a rare
> event, so once a solid list of macros is in place for QNX, it should
> need very little maintenance.  If you have any specific calls you
> believe are missing from my list of macros, I would be happy to add
> them.

I have no idea whether there are any other ones - I don't have access to
a QNX machine, and I don't personally want any. The problem is that we
might want to start using new syscalls or QNX might introduce new
interruptible signals. Problems caused by missed interruptible syscalls
won't show during low-load testing like pg_regress. They'll show up
during production usage.

> port.h is included in c.h, which is in postgres.h, so the QNX macros
> should be effective for all QNX PostgreSQL compiles.  If it were not,
> no one could reply on any port.h features on any platform.

Yea, that's not a concern I have.

> The first release on any platform has risk of defects, which can be
> corrected once identified.  I would expect that a first release on any
> platform would include a warning or disclaimer stating that it is new
> port.
> 
> Lastly, the QNX-specific section added to port.h appears to solve the
> SA_RESTART issue for QNX, while having no impact on compiles of
> existing platforms.

My problem is that it's an ugly hack for a niche platform that will need to
be maintained for a long while into the future. I don't have a problem
adding support for not that frequently used platforms if the support is
very localized, but that's definitely not the case here.

> About configure:
> --------------------
> "./configure" barked at 2 things on QNX, and it advised using both
> "--without-readline --disable-thread-safety".  I can investigate
> further, but I have been focusing on the bigger issues first.

Yea, those aren't really critical. It'd be interesting to know why
the thread safety test fails - quite possibly it's just the configure
test for pthreads not being very good.

Greetings,

Andres Freund

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Alvaro Herrera

Date:

August 22, 2014, 14:42:05

Andres Freund wrote:
> Hi,
> 
> On 2014-08-21 15:25:44 +0000, Baker, Keith [OCDUS Non-J&J] wrote:
> > About SA_RESTART:
> > ------------------------
> > I would like to offer you a different perspective which may alter your current opinion.
> > I believe the port.h QNX macro replacement for SA_RESTART is still a reasonable solution on QNX for these reasons:
> > 
> > First, I think it is better to adapt PostgreSQL to suit the platform
> > than to adapt the platform to suit PostgreSQL.
> 
> Well. That might be somewhat true for a popular platform. Which QNX
> really isn't. I personally don't believe your approach to be likely to
> end up with a correct and maintainable port.
> 
> > Changing default behavior of libc on QNX to suit PostgreSQL may break
> > other applications which rely on the current behavior of libc.
> 
> I don't see how *adding* SA_RESTART support which would only be used
> when SA_RESTART is being passed to sigaction(), would do that.

I guess the important question here is how much traction does Keith have
with the QNX development group.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

"Baker, Keith [OCDUS Non-J&J]"

Date:

August 22, 2014, 15:14:38

I am reaching out to our QNX support contacts today, and I will let you know how they respond.

Keith Baker

> -----Original Message-----
> From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
> Sent: Friday, August 22, 2014 10:42 AM
> To: Andres Freund
> Cc: Baker, Keith [OCDUS Non-J&J]; Robert Haas; Tom Lane; pgsql-
> hackers@postgresql.org
> Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
>
> Andres Freund wrote:
> > Hi,
> >
> > On 2014-08-21 15:25:44 +0000, Baker, Keith [OCDUS Non-J&J] wrote:
> > > About SA_RESTART:
> > > ------------------------
> > > I would like to offer you a different perspective which may alter your
> current opinion.
> > > I believe the port.h QNX macro replacement for SA_RESTART is still a
> reasonable solution on QNX for these reasons:
> > >
> > > First, I think it is better to adapt PostgreSQL to suit the platform
> > > than to adapt the platform to suit PostgreSQL.
> >
> > Well. That might be somewhat true for a popular platform. Which QNX
> > really isn't. I personally don't believe your approach to be likely to
> > end up with a correct and maintainable port.
> >
> > > Changing default behavior of libc on QNX to suit PostgreSQL may
> > > break other applications which rely on the current behavior of libc.
> >
> > I don't see how *adding* SA_RESTART support which would only be used
> > when SA_RESTART is being passed to sigaction(), would do that.
>
> I guess the important question here is how much traction does Keith have
> with the QNX development group.
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal to add a QNX 6.5 port to PostgreSQL

From

Andres Freund

Date:

August 22, 2014, 15:32:14

On 2014-08-22 10:41:55 -0400, Alvaro Herrera wrote:
> Andres Freund wrote:
> > Hi,
> > 
> > On 2014-08-21 15:25:44 +0000, Baker, Keith [OCDUS Non-J&J] wrote:
> > > About SA_RESTART:
> > > ------------------------
> > > I would like to offer you a different perspective which may alter your current opinion.
> > > > I believe the port.h QNX macro replacement for SA_RESTART is still a reasonable solution on QNX for these reasons:
> > > 
> > > First, I think it is better to adapt PostgreSQL to suit the platform
> > > than to adapt the platform to suit PostgreSQL.
> > 
> > Well. That might be somewhat true for a popular platform. Which QNX
> > really isn't. I personally don't believe your approach to be likely to
> > end up with a correct and maintainable port.
> > 
> > > Changing default behavior of libc on QNX to suit PostgreSQL may break
> > > other applications which rely on the current behavior of libc.
> > 
> > I don't see how *adding* SA_RESTART support which would only be used
> > when SA_RESTART is being passed to sigaction(), would do that.
> 
> I guess the important question here is how much traction does Keith have
> with the QNX development group.

If you search for SA_RESTART and QNX there's a fair number of bugs
cropping up where it leads to problems... I think a large amount of open
source software essentially relies on it these days.

Note that it doesn't necessarily need to be implemented inside QNX. It
could very well be a wrapper library that you would optionally link
against. That'd benefit more users than just postgres on QNX.

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
