Re: Escaping from blocked send() reprised.

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: Escaping from blocked send() reprised.
Дата
Msg-id 20150112003753.GB2722746@tornado.leadboat.com
обсуждение исходный текст
Ответ на Re: Escaping from blocked send() reprised.  (Andres Freund <andres@2ndquadrant.com>)
Ответы Re: Escaping from blocked send() reprised.  (Andres Freund <andres@2ndquadrant.com>)
Список pgsql-hackers
On Mon, Jan 12, 2015 at 12:40:50AM +0100, Andres Freund wrote:
> On 2015-01-11 16:36:07 -0500, Noah Misch wrote:
> > On Sat, Jan 10, 2015 at 03:25:42AM +0100, Andres Freund wrote:
> > > 0001-Allow-latches-to-wait-for-socket-writability-without.patch
> > >      Imo pretty close to commit and can be committed independently.
> > 
> > The key open question is whether all platforms of interest can reliably detect
> > end-of-file when poll()ing or select()ing for write only.  Older GNU/Linux
> > select() cannot; see attached test program.
> 
> Yuck. By my reading that's a violation of posix.

Agreed.

> I did test it a bit, and I didn't see problems, but that obviously
> doesn't say much about old versions.
> 
> Afaics we interestingly don't have any poll-less buildfarm animals that
> use unix_latch.c...

More likely is that some system will have a poll() exhibiting the same bug,
possibly via poll() being a wrapper around select().  Systems without poll()
are hard to find; the gnulib manual lists only Windows, BeOS and HP NonStop
OS.  HP NonStop OS is the one possibly of interest here.  I have never
personally seen a machine running it.

I recommend either (a) taking no action or (b) adding a regression test
verifying WaitLatchOrSocket() conformance in this scenario.  Then we can
decide what more to do if failure evidence emerges.

> > We use poll() there anyway, so the bug in that configuration does not
> > affect PostgreSQL.  Is it a bellwether of similar bugs in other
> > implementations, bugs that will affect PostgreSQL?
> 
> Hm. I can think of two stopgap measures we could add:
> 
> 1) If we're using select() and WL_SOCKET_WRITEABLE is set without
>    _READABLE, add a timeout of Min(1s, Max(passed_timeout, 1s)). As the
>    time spent waiting only for writable normally shouldn't be very long,
>    that shouldn't be noticeably bad for power usage.
> 2) Add a SIGPIPE handler that just does a SetLatch(MyLach).

I'm having trouble visualizing those proposed measures in detail, but I trust
that a decent workaround would emerge.

> > > +            if (pfds[0].revents & (POLLHUP | POLLERR | POLLNVAL))
> > > +            {
> > > +                /* EOF/error condition */
> > > +                if (wakeEvents & WL_SOCKET_READABLE)
> > > +                    result |= WL_SOCKET_READABLE;
> > > +                if (wakeEvents & WL_SOCKET_WRITEABLE)
> > > +                    result |= WL_SOCKET_WRITEABLE;
> > > +            }
> > 
> > With some poll() implementations (e.g. OS X), this can wrongly report
> > WL_SOCKET_WRITEABLE if the peer used shutdown(SHUT_WR).  I tentatively think
> > that's acceptable.  libpq does not use shutdown(), and other client interfaces
> > would do so at their own risk.  Should we worry about hostile clients creating
> > a denial-of-service by causing a server send() to block unexpectedly?
> > Probably not; a user able to send arbitrary TCP traffic to the postmaster port
> > can already achieve that.
> 
> Yea, this doesn't seem particularly concerning.
> 
> a) They can just stop consuming writes and use noticeable amounts of
> memory by doing output intensive queries. That uses significant os
> resources and is much harder to detect - today.

If there's anything to worry about here (unlikely), it would be with respect
to not-yet-authenticated connections only.

> b) does accepting WL_SOCKET_WRITEABLE without _READABLE change anything
> here? We already allow _WRITABLE...

Today's code translates POLLHUP to WL_SOCKET_READABLE; it must see POLLOUT to
set WL_SOCKET_WRITEABLE.  Your patch changes that.

> What happens if you write/send() in
> that state, btw?

write() reports EAGAIN.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Using 128-bit integers for sum, avg and statistics aggregates
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Escaping from blocked send() reprised.