Re: [GENERAL] UDP buffer drops / statistics collector

From Tim Kane
Subject Re: [GENERAL] UDP buffer drops / statistics collector
Date
Msg-id CADVWZZK8eyAC9c=ha0tTbRQ9u0x5+L4aQdDp2hWsBLWzppbUkA@mail.gmail.com
In response to [GENERAL] UDP buffer drops / statistics collector  (Tim Kane <tim.kane@gmail.com>)
Responses Re: [GENERAL] UDP buffer drops / statistics collector  (Tim Kane <tim.kane@gmail.com>)
List pgsql-general
Okay, so I've run strace on the collector process during a buffer drop event.
I can see evidence of a recvfrom loop pulling in a maximum of 142 kB.

While I had already increased rmem_max, it would appear the kernel is not honoring it.
rmem_default is set to 124 kB, which would explain the above read maxing out just slightly beyond this (presuming a ring buffer filling up behind the read).

I'm going to try increasing rmem_default and see if it has any positive effect (and then investigate why the kernel doesn't seem to take rmem_max into account).
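
For anyone wanting to check what a socket actually gets, here is a minimal Python sketch (not taken from the PostgreSQL source). It relies on standard Linux behaviour: a new socket starts with a receive buffer of rmem_default, and rmem_max only caps what a process may request via setsockopt(SO_RCVBUF), so raising rmem_max alone does nothing for a program that never makes that request. Whether the statistics collector makes such a request is not established in this thread. The function name probe_rcvbuf and the 4 MiB figure are made up for illustration.

```python
import socket


def probe_rcvbuf(requested_bytes=4 * 1024 * 1024):
    """Return (default, granted) SO_RCVBUF sizes for a fresh UDP socket.

    probe_rcvbuf is a hypothetical helper, not part of any library.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Size the kernel assigned at creation (rmem_default on Linux).
        default_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        # Ask for more; on Linux the request is capped at rmem_max, and the
        # kernel doubles the stored value to allow for bookkeeping overhead.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        granted_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    finally:
        sock.close()
    return default_size, granted_size


if __name__ == "__main__":
    default_size, granted_size = probe_rcvbuf()
    print("default SO_RCVBUF:", default_size)
    print("after requesting 4 MiB:", granted_size)
```

If the second number stays well below the request, the cap from rmem_max is what limits it; if the first number is what the collector's socket is using, then rmem_default is the setting that matters.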





On Tue, Apr 18, 2017 at 8:05 AM Tim Kane <tim.kane@gmail.com> wrote:
Hi all,

I'm seeing sporadic (but frequent) UDP buffer drops on a host, which so far I've not been able to resolve.

The drops originate from postgres processes, and as far as I know the only UDP traffic generated by postgres should be consumed by the statistics collector, but for whatever reason it's failing to read the packets quickly enough.

Interestingly, I'm seeing these drops occur even when the system is idle, but every 15 minutes or so (not consistently enough to isolate any particular activity) we'll see on the order of 90 packets dropped at a time.
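
The message does not say how the drops were counted. One way to watch for them on Linux, sketched here as an assumption rather than the method actually used, is to sample the system-wide UDP counters in /proc/net/snmp and compare RcvbufErrors between samples. The helper names below are hypothetical.

```python
def parse_udp_counters(snmp_text):
    """Parse the 'Udp:' header and value lines of /proc/net/snmp into a dict.

    The file lists each protocol twice: one line of counter names, then one
    line of values in the same order.
    """
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp section found")
    names, values = udp_lines[0][1:], udp_lines[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def read_udp_counters(path="/proc/net/snmp"):
    """Read the live counters (Linux only)."""
    with open(path) as f:
        return parse_udp_counters(f.read())
```

Calling read_udp_counters() twice, a few minutes apart, and subtracting the RcvbufErrors values gives the number of datagrams dropped in that interval because a receive buffer was full. These counters are system-wide, so they do not by themselves show which socket dropped the packets.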

I'm running 9.6.2, but the issue was previously occurring on 9.2.4 (on the same hardware).


If it's relevant, there are two instances of postgres running (and consequently two instances of the stats collector process), though one of those instances is most definitely idle for most of the day.

In an effort to resolve the problem, I've doubled the UDP recv buffer sizes on the host, but it seems to have had no effect.

cat /proc/sys/net/core/rmem_max
1677216

The following parameters are configured:

track_activities on
track_counts     on
track_functions  none
track_io_timing  off


There are approximately 80-100 connections at any given time.

It seems that the issue started a few weeks ago, around the time of a reboot on the given host, but it's difficult to know what (if anything) has changed, or why :-/


Incidentally, the documentation doesn't seem to mention UDP at all.  I'm going to use this as an opportunity to dive into the source, but perhaps it's worth improving the documentation around this?

My next step is to try disabling track_activities and track_counts to see whether that improves matters, but I wouldn't expect these to generate enough data to flood the UDP buffers :-/

Any ideas?


