Thread: [BUGS] BUG #14720: getsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol

[BUGS] BUG #14720: getsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol

From
lizenko79@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      14720
Logged by:          Andrey Lizenko
Email address:      lizenko79@gmail.com
PostgreSQL version: 9.6.3
Operating system:   Solaris 11.3
Description:

I've got the following message running PostgreSQL 9.6.3 on Solaris 11.3
(both latest stable).

> getsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol


Unfortunately, I cannot reproduce it with the libpq C code examples, but I
can at least see it when using pgAdmin 3, pgAdmin 4, and the Zabbix
monitoring extension libzbxpgsql.

The getsockopt manual mentions only SO_KEEPALIVE.

Regards,
Andrey Lizenko




--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14720: getsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol

From
Alvaro Herrera
Date:
lizenko79@gmail.com wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      14720
> Logged by:          Andrey Lizenko
> Email address:      lizenko79@gmail.com
> PostgreSQL version: 9.6.3
> Operating system:   Solaris 11.3
> Description:        
> 
> I've got the following message running PostgreSQL 9.6.3 on Solaris 11.3
> (both latest stable).
> 
> > getsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol
> 
> Unfortunately, I cannot reproduce it with the libpq C code examples, but I
> can at least see it when using pgAdmin 3, pgAdmin 4, and the Zabbix
> monitoring extension libzbxpgsql.
> 
> The getsockopt manual mentions only SO_KEEPALIVE.

It sounds like your system defines the TCP_KEEPALIVE symbol at compile
time but the kernel doesn't know it; maybe the package was compiled in a
system where the kernel does support that option, and you're running it
in one that doesn't?
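
One way to check for this kind of mismatch is to ask the running kernel directly. The following is a minimal sketch, not PostgreSQL code: tcp_option_errno is a hypothetical helper that calls getsockopt() at the IPPROTO_TCP level for an option that was visible at compile time and returns the resulting errno. If the kernel does not implement the option, the expected result is ENOPROTOOPT, which is, as far as I know, the errno behind the "Option not supported by protocol" text on Solaris.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of PostgreSQL): query a TCP-level socket
 * option and report what the running kernel says about it.
 * Returns 0 if the kernel accepted the option, otherwise the errno value.
 */
int
tcp_option_errno(int sock, int optname)
{
    int         value = 0;
    socklen_t   size = sizeof(value);

    if (getsockopt(sock, IPPROTO_TCP, optname, (char *) &value, &size) < 0)
        return errno;
    return 0;
}
```

Calling this on an open TCP socket with the option in question and comparing the result against ENOPROTOOPT separates "the header defines it but the kernel rejects it" from unrelated failures such as EBADF.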

Are you getting the message on the client side or the server side?  If the
latter, you should just set tcp_keepalives_idle to 0 in postgresql.conf.
If the former, I think the only option is to fix the libpq compile.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> lizenko79@gmail.com wrote:
>> I've got the following message running PostgreSQL 9.6.3 on Solaris 11.3
>> (both latest stable).
>> getsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol

> It sounds like your system defines the TCP_KEEPALIVE symbol at compile
> time but the kernel doesn't know it; maybe the package was compiled in a
> system where the kernel does support that option, and you're running it
> in one that doesn't?

Actually, I find the same error in the logs for our Solaris buildfarm
members.  So apparently that's been going on since day one, and we
hadn't noticed it, though I now find that it's been reported before:
https://www.postgresql.org/message-id/CAJgtxT6QL0_Gt+TkSDw=q1=YVJkT73FoSrtStcu5Hy+-SXn8rw@mail.gmail.com

Some googling turned up the tcp(7P) man page for Solaris 11:
https://docs.oracle.com/cd/E36784_01/html/E36884/tcp-7p.html#REFMAN7tcp-7p

and it says this:
 SunOS supports the keep-alive mechanism described in RFC 1122. It is enabled using the socket option SO_KEEPALIVE.
When enabled, the first keep-alive probe is sent out after a TCP is idle for two hours. If the peer does not respond to
the probe within eight minutes, the TCP connection is aborted. You can alter the interval for sending out the first
probe using the socket option TCP_KEEPALIVE_THRESHOLD. The option value is an unsigned integer in milliseconds. The
system default is controlled by the TCP ndd parameter tcp_keepalive_interval. The minimum value is ten seconds. The
maximum is ten days, while the default is two hours. If you receive no response to the probe, you can use the
TCP_KEEPALIVE_ABORT_THRESHOLD socket option to change the time threshold for aborting a TCP connection. The option value
is an unsigned integer in milliseconds. The value zero indicates that TCP should never time out and abort the connection
when probing. The system default is controlled by the TCP ndd parameter tcp_keepalive_abort_interval. The default is
eight minutes.

So apparently, Linux's TCP_KEEPIDLE corresponds to Solaris'
TCP_KEEPALIVE_THRESHOLD.  TCP_KEEPINTVL and TCP_KEEPCNT seem to have no
direct equivalent, although TCP_KEEPALIVE_ABORT_THRESHOLD would correspond
to their product.
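
To make the "product" remark concrete, here is a small sketch of the arithmetic. It assumes that TCP_KEEPINTVL is given in seconds and TCP_KEEPCNT is a probe count, as on Linux, and that the Solaris option takes milliseconds, as the man page excerpt above states. The function name keepalive_abort_threshold_ms is made up for illustration, and the correspondence is only approximate.

```c
#include <assert.h>

/*
 * Illustration only (not part of PostgreSQL): approximate a Solaris-style
 * abort threshold in milliseconds from a Linux-style probe interval in
 * seconds and a probe count.
 */
long long
keepalive_abort_threshold_ms(int interval_secs, int probe_count)
{
    /* Negative inputs make no sense for either setting */
    assert(interval_secs >= 0 && probe_count >= 0);
    return (long long) interval_secs * (long long) probe_count * 1000LL;
}
```

For instance, an interval of 60 seconds with 8 probes gives 480000 ms, which matches the eight-minute default abort interval quoted above.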

I suggest that we ought to expand the keepalive code to know about this
synonym.
        regards, tom lane



I wrote:
> So apparently, Linux's TCP_KEEPIDLE corresponds to Solaris'
> TCP_KEEPALIVE_THRESHOLD.  TCP_KEEPINTVL and TCP_KEEPCNT seem to have no
> direct equivalent, although TCP_KEEPALIVE_ABORT_THRESHOLD would correspond
> to their product.

> I suggest that we ought to expand the keepalive code to know about this
> synonym.

Concretely, something like the attached.  I have no way to test this
locally, so I'm thinking of just pushing it and seeing what the buildfarm
says.

            regards, tom lane

diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 261e9be..c62f7e9 100644
*** a/src/backend/libpq/pqcomm.c
--- b/src/backend/libpq/pqcomm.c
*************** pq_setkeepaliveswin32(Port *port, int id
*** 1676,1682 ****
  int
  pq_getkeepalivesidle(Port *port)
  {
! #if defined(TCP_KEEPIDLE) || defined(TCP_KEEPALIVE) || defined(WIN32)
      if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
          return 0;

--- 1676,1682 ----
  int
  pq_getkeepalivesidle(Port *port)
  {
! #if defined(TCP_KEEPIDLE) || defined(TCP_KEEPALIVE_THRESHOLD) || defined(TCP_KEEPALIVE) || defined(WIN32)
      if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
          return 0;

*************** pq_getkeepalivesidle(Port *port)
*** 1688,1694 ****
  #ifndef WIN32
          ACCEPT_TYPE_ARG3 size = sizeof(port->default_keepalives_idle);

! #ifdef TCP_KEEPIDLE
          if (getsockopt(port->sock, IPPROTO_TCP, TCP_KEEPIDLE,
                         (char *) &port->default_keepalives_idle,
                         &size) < 0)
--- 1688,1695 ----
  #ifndef WIN32
          ACCEPT_TYPE_ARG3 size = sizeof(port->default_keepalives_idle);

! #if defined(TCP_KEEPIDLE)
!         /* TCP_KEEPIDLE is the name of this option on Linux and *BSD */
          if (getsockopt(port->sock, IPPROTO_TCP, TCP_KEEPIDLE,
                         (char *) &port->default_keepalives_idle,
                         &size) < 0)
*************** pq_getkeepalivesidle(Port *port)
*** 1696,1702 ****
              elog(LOG, "getsockopt(TCP_KEEPIDLE) failed: %m");
              port->default_keepalives_idle = -1; /* don't know */
          }
! #else
          if (getsockopt(port->sock, IPPROTO_TCP, TCP_KEEPALIVE,
                         (char *) &port->default_keepalives_idle,
                         &size) < 0)
--- 1697,1713 ----
              elog(LOG, "getsockopt(TCP_KEEPIDLE) failed: %m");
              port->default_keepalives_idle = -1; /* don't know */
          }
! #elif defined(TCP_KEEPALIVE_THRESHOLD)
!         /* TCP_KEEPALIVE_THRESHOLD is the name of this option on Solaris */
!         if (getsockopt(port->sock, IPPROTO_TCP, TCP_KEEPALIVE_THRESHOLD,
!                        (char *) &port->default_keepalives_idle,
!                        &size) < 0)
!         {
!             elog(LOG, "getsockopt(TCP_KEEPALIVE_THRESHOLD) failed: %m");
!             port->default_keepalives_idle = -1; /* don't know */
!         }
! #else                            /* must have TCP_KEEPALIVE */
!         /* TCP_KEEPALIVE is the name of this option on macOS */
          if (getsockopt(port->sock, IPPROTO_TCP, TCP_KEEPALIVE,
                         (char *) &port->default_keepalives_idle,
                         &size) < 0)
*************** pq_getkeepalivesidle(Port *port)
*** 1704,1710 ****
              elog(LOG, "getsockopt(TCP_KEEPALIVE) failed: %m");
              port->default_keepalives_idle = -1; /* don't know */
          }
! #endif                            /* TCP_KEEPIDLE */
  #else                            /* WIN32 */
          /* We can't get the defaults on Windows, so return "don't know" */
          port->default_keepalives_idle = -1;
--- 1715,1721 ----
              elog(LOG, "getsockopt(TCP_KEEPALIVE) failed: %m");
              port->default_keepalives_idle = -1; /* don't know */
          }
! #endif                            /* KEEPIDLE/KEEPALIVE_THRESHOLD/KEEPALIVE */
  #else                            /* WIN32 */
          /* We can't get the defaults on Windows, so return "don't know" */
          port->default_keepalives_idle = -1;
*************** pq_setkeepalivesidle(int idle, Port *por
*** 1723,1729 ****
      if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
          return STATUS_OK;

! #if defined(TCP_KEEPIDLE) || defined(TCP_KEEPALIVE) || defined(SIO_KEEPALIVE_VALS)
      if (idle == port->keepalives_idle)
          return STATUS_OK;

--- 1734,1741 ----
      if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
          return STATUS_OK;

! /* check SIO_KEEPALIVE_VALS here, not just WIN32, as some toolchains lack it */
! #if defined(TCP_KEEPIDLE) || defined(TCP_KEEPALIVE_THRESHOLD) || defined(TCP_KEEPALIVE) || defined(SIO_KEEPALIVE_VALS)
      if (idle == port->keepalives_idle)
          return STATUS_OK;

*************** pq_setkeepalivesidle(int idle, Port *por
*** 1742,1755 ****
      if (idle == 0)
          idle = port->default_keepalives_idle;

! #ifdef TCP_KEEPIDLE
      if (setsockopt(port->sock, IPPROTO_TCP, TCP_KEEPIDLE,
                     (char *) &idle, sizeof(idle)) < 0)
      {
          elog(LOG, "setsockopt(TCP_KEEPIDLE) failed: %m");
          return STATUS_ERROR;
      }
! #else
      if (setsockopt(port->sock, IPPROTO_TCP, TCP_KEEPALIVE,
                     (char *) &idle, sizeof(idle)) < 0)
      {
--- 1754,1777 ----
      if (idle == 0)
          idle = port->default_keepalives_idle;

! #if defined(TCP_KEEPIDLE)
!     /* TCP_KEEPIDLE is the name of this option on Linux and *BSD */
      if (setsockopt(port->sock, IPPROTO_TCP, TCP_KEEPIDLE,
                     (char *) &idle, sizeof(idle)) < 0)
      {
          elog(LOG, "setsockopt(TCP_KEEPIDLE) failed: %m");
          return STATUS_ERROR;
      }
! #elif defined(TCP_KEEPALIVE_THRESHOLD)
!     /* TCP_KEEPALIVE_THRESHOLD is the name of this option on Solaris */
!     if (setsockopt(port->sock, IPPROTO_TCP, TCP_KEEPALIVE_THRESHOLD,
!                    (char *) &idle, sizeof(idle)) < 0)
!     {
!         elog(LOG, "setsockopt(TCP_KEEPALIVE_THRESHOLD) failed: %m");
!         return STATUS_ERROR;
!     }
! #else                            /* must have TCP_KEEPALIVE */
!     /* TCP_KEEPALIVE is the name of this option on macOS */
      if (setsockopt(port->sock, IPPROTO_TCP, TCP_KEEPALIVE,
                     (char *) &idle, sizeof(idle)) < 0)
      {
*************** pq_setkeepalivesidle(int idle, Port *por
*** 1762,1768 ****
  #else                            /* WIN32 */
      return pq_setkeepaliveswin32(port, idle, port->keepalives_interval);
  #endif
! #else                            /* TCP_KEEPIDLE || SIO_KEEPALIVE_VALS */
      if (idle != 0)
      {
          elog(LOG, "setting the keepalive idle time is not supported");
--- 1784,1790 ----
  #else                            /* WIN32 */
      return pq_setkeepaliveswin32(port, idle, port->keepalives_interval);
  #endif
! #else                            /* no way to set it */
      if (idle != 0)
      {
          elog(LOG, "setting the keepalive idle time is not supported");
*************** pq_setkeepalivesinterval(int interval, P
*** 1812,1818 ****
      if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
          return STATUS_OK;

! #if defined(TCP_KEEPINTVL) || defined (SIO_KEEPALIVE_VALS)
      if (interval == port->keepalives_interval)
          return STATUS_OK;

--- 1834,1840 ----
      if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
          return STATUS_OK;

! #if defined(TCP_KEEPINTVL) || defined(SIO_KEEPALIVE_VALS)
      if (interval == port->keepalives_interval)
          return STATUS_OK;

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 764e960..e32c42b 100644
*** a/src/interfaces/libpq/fe-connect.c
--- b/src/interfaces/libpq/fe-connect.c
*************** setKeepalivesIdle(PGconn *conn)
*** 1470,1476 ****
      if (idle < 0)
          idle = 0;

! #ifdef TCP_KEEPIDLE
      if (setsockopt(conn->sock, IPPROTO_TCP, TCP_KEEPIDLE,
                     (char *) &idle, sizeof(idle)) < 0)
      {
--- 1470,1477 ----
      if (idle < 0)
          idle = 0;

! #if defined(TCP_KEEPIDLE)
!     /* TCP_KEEPIDLE is the name of this option on Linux and *BSD */
      if (setsockopt(conn->sock, IPPROTO_TCP, TCP_KEEPIDLE,
                     (char *) &idle, sizeof(idle)) < 0)
      {
*************** setKeepalivesIdle(PGconn *conn)
*** 1481,1489 ****
                            SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
          return 0;
      }
! #else
! #ifdef TCP_KEEPALIVE
!     /* macOS uses TCP_KEEPALIVE rather than TCP_KEEPIDLE */
      if (setsockopt(conn->sock, IPPROTO_TCP, TCP_KEEPALIVE,
                     (char *) &idle, sizeof(idle)) < 0)
      {
--- 1482,1501 ----
                            SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
          return 0;
      }
! #elif defined(TCP_KEEPALIVE_THRESHOLD)
!     /* TCP_KEEPALIVE_THRESHOLD is the name of this option on Solaris */
!     if (setsockopt(conn->sock, IPPROTO_TCP, TCP_KEEPALIVE_THRESHOLD,
!                    (char *) &idle, sizeof(idle)) < 0)
!     {
!         char        sebuf[256];
!
!         appendPQExpBuffer(&conn->errorMessage,
!                           libpq_gettext("setsockopt(TCP_KEEPALIVE_THRESHOLD) failed: %s\n"),
!                           SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
!         return 0;
!     }
! #elif defined(TCP_KEEPALIVE)
!     /* TCP_KEEPALIVE is the name of this option on macOS */
      if (setsockopt(conn->sock, IPPROTO_TCP, TCP_KEEPALIVE,
                     (char *) &idle, sizeof(idle)) < 0)
      {
*************** setKeepalivesIdle(PGconn *conn)
*** 1495,1501 ****
          return 0;
      }
  #endif
- #endif

      return 1;
  }
--- 1507,1512 ----
*************** setKeepalivesCount(PGconn *conn)
*** 1562,1568 ****

      return 1;
  }
! #else                            /* Win32 */
  #ifdef SIO_KEEPALIVE_VALS
  /*
   * Enable keepalives and set the keepalive values on Win32,
--- 1573,1579 ----

      return 1;
  }
! #else                            /* WIN32 */
  #ifdef SIO_KEEPALIVE_VALS
  /*
   * Enable keepalives and set the keepalive values on Win32,


Re: [BUGS] BUG #14720: getsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol

From
Michael Paquier
Date:
On Wed, Jun 28, 2017 at 7:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Concretely, something like the attached.  I have no way to test this
> locally, so I'm thinking of just pushing it and seeing what the buildfarm
> says.

! #if defined(TCP_KEEPIDLE)
!     /* TCP_KEEPIDLE is the name of this option on Linux and *BSD */
      if (setsockopt(port->sock, IPPROTO_TCP, TCP_KEEPIDLE,
                     (char *) &idle, sizeof(idle)) < 0)
      {
          elog(LOG, "setsockopt(TCP_KEEPIDLE) failed: %m");
          return STATUS_ERROR;
      }
! #elif defined(TCP_KEEPALIVE_THRESHOLD)

What about defining a PG_TCP_KEEPALIVE instead?

Side note: Windows has something with a different set of options:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740476(v=vs.85).aspx
-- 
Michael



Michael Paquier <michael.paquier@gmail.com> writes:
> What about defining a PG_TCP_KEEPALIVE instead?

I thought about that, but it would complicate constructing the elog
messages, so I didn't bother.  It might be worth working harder if
we ever grow any more alternatives.

> Side note: Windows has something with a different set of options:
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms740476(v=vs.85).aspx

Yeah, the Windows part of that code is a real mess.  But it works
as far as I've heard.
        regards, tom lane



I wrote:
> Concretely, something like the attached.  I have no way to test this
> locally, so I'm thinking of just pushing it and seeing what the buildfarm
> says.

So that didn't work: castoroides is still showing

[5953a7e1.1fff:13] LOG:  getsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol
[5953a7e1.1fff:14] STATEMENT:  select name, setting from pg_settings where name like 'enable%';

which implies that TCP_KEEPALIVE_THRESHOLD doesn't exist on Solaris 10.
Evidently, the logic here needs to be along the lines of

#if defined(TCP_KEEPIDLE)
...
#elif defined(TCP_KEEPALIVE_THRESHOLD)
...
#elif defined(TCP_KEEPALIVE) && defined(__darwin__)
...

Or we could make the last test be !defined(__solaris__), but I'm not
sure that's better.  Anybody have an opinion?

As long as I have to touch this code again anyway, I'm also going to
look into Michael's thought of trying to reduce code duplication.
I was unhappy yesterday about how to handle the error messages,
but we could do it like this:

#if defined(TCP_KEEPIDLE)
#define PG_TCP_KEEPALIVE TCP_KEEPIDLE
#define PG_TCP_KEEPALIVE_STR "TCP_KEEPIDLE"
#elif ...

#ifdef PG_TCP_KEEPALIVE
    if (setsockopt(port->sock, IPPROTO_TCP, PG_TCP_KEEPALIVE,
                   (char *) &idle, sizeof(idle)) < 0)
    {
        elog(LOG, "setsockopt(%s) failed: %m", PG_TCP_KEEPALIVE_STR);

which doesn't seem too painful.
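
Put together, the two ideas in this message (the three-way option test and the paired name/string macros) might look like the standalone sketch below. It is an illustration under assumptions, not the committed code: set_keepalive_idle is a hypothetical function, fprintf() with strerror() stands in for elog() and %m, and __APPLE__ is tested instead of __darwin__ because, as far as I know, the latter is supplied by PostgreSQL's own port header rather than by the compiler.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Pick the platform's name for "idle time before the first keepalive". */
#if defined(TCP_KEEPIDLE)
#define PG_TCP_KEEPALIVE TCP_KEEPIDLE
#define PG_TCP_KEEPALIVE_STR "TCP_KEEPIDLE"
#elif defined(TCP_KEEPALIVE_THRESHOLD)
#define PG_TCP_KEEPALIVE TCP_KEEPALIVE_THRESHOLD
#define PG_TCP_KEEPALIVE_STR "TCP_KEEPALIVE_THRESHOLD"
#elif defined(TCP_KEEPALIVE) && defined(__APPLE__)
/* Assumption: __APPLE__ replaces __darwin__ outside the PostgreSQL tree */
#define PG_TCP_KEEPALIVE TCP_KEEPALIVE
#define PG_TCP_KEEPALIVE_STR "TCP_KEEPALIVE"
#endif

/*
 * Hypothetical stand-in for pq_setkeepalivesidle(): returns 0 on success,
 * -1 on failure or when a nonzero setting cannot be applied.
 */
int
set_keepalive_idle(int sock, int idle)
{
#ifdef PG_TCP_KEEPALIVE
    if (setsockopt(sock, IPPROTO_TCP, PG_TCP_KEEPALIVE,
                   (char *) &idle, sizeof(idle)) < 0)
    {
        /* One message format serves every platform variant */
        fprintf(stderr, "setsockopt(%s) failed: %s\n",
                PG_TCP_KEEPALIVE_STR, strerror(errno));
        return -1;
    }
    return 0;
#else
    /* No usable option on this platform: only "leave the default" works */
    if (idle != 0)
    {
        fprintf(stderr, "setting the keepalive idle time is not supported\n");
        return -1;
    }
    return 0;
#endif
}
```

The sketch passes the value through unchanged, as the fragment above does. The Solaris man page quoted earlier in the thread gives milliseconds for TCP_KEEPALIVE_THRESHOLD, so a real implementation would have to decide how to handle units on that branch.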
        regards, tom lane

