Re: Connection problem under extreme load.

From: Jeffery Collins
Subject: Re: Connection problem under extreme load.
Msg-id: 39818331.653B0DD9@onyx-technologies.com
In reply to: Connection problem under extreme load.  (Jeffery Collins <collins@onyx-technologies.com>)
List: pgsql-general

Tom Lane wrote:

> Interesting.  I *think* (not totally sure) that 'Connection refused'
> here implies that the kernel rejected the connection before the
> postmaster ever had a chance to do anything with it.  The most likely
> reason would probably be that the maximum connection backlog was
> exceeded.  On my system (HPUX) man listen(2) sez
>
>      int listen(int s, int backlog);
>
>      ...
>
>      backlog defines the desirable queue length for pending connections.
>      The actual queue length may be greater than the specified backlog.  If
>      a connection request arrives when the queue is full, the client will
>      receive an ETIMEDOUT error.
>
>      backlog is limited to the range of 0 to SOMAXCONN, which is defined in
>      <sys/socket.h>.  SOMAXCONN is currently set to 20.  If any other value
>      is specified, the system automatically assigns the closest value
>      within the range.  A backlog of 0 specifies only 1 pending connection
>      is allowed at any given time.
>
> ETIMEDOUT is not the error you are getting, but that could be a platform
> difference.  In fact the nearest BSD system I have access to says that
> "the client will receive an error with an indication of ECONNREFUSED".
> The same box defines SOMAXCONN as 5, which seems a tad low :-(
>
> So, it would seem your options are
>   (a) recompile your kernel with larger SOMAXCONN, or
>   (b) figure out why the postmaster isn't responding faster.
>
> Offhand, the only performance problem I know of in the postmaster is
> that it does IDENT checks serially --- if you specify ident checks in
> pg_hba.conf, the postmaster will wait for a response from the ident
> server before processing more connection requests.  So if you're using
> IDENT authentication you might want to consider some other answer, or
> else fix that code and send in a patch.
>
> If that's not it, please poke into it further and let us know what you
> find out.
>
>                         regards, tom lane

I think you are correct.  The listen man page on my machine (Sun Solaris)
says:

     If a connection request arrives with the queue full, the client will
     receive an error with an indication of ECONNREFUSED...

SOMAXCONN is also 5 here, which IS a tad low.
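
For illustration, here is a rough Python model of the clamping rule quoted from the HP-UX man page above (a backlog outside 0..SOMAXCONN gets the closest value within the range). The function name effective_backlog is made up for this sketch, and socket.SOMAXCONN is only the constant Python was built with, which may not match the limit a given kernel actually enforces.

```python
import socket


def effective_backlog(requested, somaxconn=socket.SOMAXCONN):
    """Model the man page rule: clamp the requested backlog to 0..SOMAXCONN.

    Hypothetical helper for illustration only; the real clamping happens
    inside the kernel when listen(2) is called.
    """
    return max(0, min(requested, somaxconn))


if __name__ == "__main__":
    # With SOMAXCONN = 5, asking for a backlog of 100 still yields 5.
    print(effective_backlog(100, somaxconn=5))
    print("SOMAXCONN seen by this Python build:", socket.SOMAXCONN)
```

So a server that asks for a large backlog on a box with SOMAXCONN of 5 would, under this rule, still be limited to roughly 5 pending connections.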

Unfortunately, I can't rebuild the kernel, so that is not an option.

As to why the postmaster was not responding faster, I think it was because of
the load on the machine.  The load was so heavy, and there were so many
connection requests at the same time, I am not surprised that it could not
keep up.  My test was probably not a realistic load.

I think my best option is to retry the connection when this happens.  I do
wish my kernel would return a different failure, because there really is no
way to distinguish a legitimate ECONNREFUSED (i.e. the server really isn't
listening) from a backlog-queue-full situation.
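
A minimal sketch of the retry idea, in Python for brevity. Everything in it is hypothetical: the name connect_with_retry, the attempt count, and the backoff delays are placeholders, not anything taken from libpq or the postmaster. It retries only on ECONNREFUSED and gives up after a fixed number of attempts, so a server that really isn't listening still surfaces as an error, just later.

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, base_delay=0.1,
                       connect=socket.create_connection, sleep=time.sleep):
    """Open a TCP connection, retrying only when the connection is refused.

    The delay doubles after each refused attempt (base_delay, 2*base_delay,
    4*base_delay, ...). The connect and sleep parameters exist only so the
    function can be exercised without a real network.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return connect((host, port))
        except ConnectionRefusedError:
            # ECONNREFUSED: either the backlog queue was full or nothing
            # is listening. The two cases cannot be told apart here.
            if attempt == attempts - 1:
                raise  # out of attempts: report the refusal to the caller
            sleep(base_delay * (2 ** attempt))
```

Calling connect_with_retry("localhost", 5432) would return a connected socket or re-raise the last ConnectionRefusedError. The cap on attempts is the only defence against the ambiguity: after enough refusals, the caller assumes the server really is down.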

Once again, thank you very much,
Jeff


