Обсуждение: kqueue

Поиск

Список

Период

Сортировка

Re: kqueue

От

Robert Haas

Дата:

21 апреля 2016 г., 18:15:58

On Tue, Mar 29, 2016 at 7:53 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On the WaitEventSet thread I posted a small patch to add kqueue
> support[1].  Since then I peeked at how some other software[2]
> interacts with kqueue and discovered that there are platforms
> including NetBSD where kevent.udata is an intptr_t instead of a void
> *.  Here's a version which should compile there.  Would any NetBSD
> user be interested in testing this?  (An alternative would be to make
> configure to test for this with some kind of AC_COMPILE_IFELSE
> incantation but the steamroller cast is simpler.)

Did you code this up blind or do you have a NetBSD machine yourself?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: kqueue

От

Andres Freund

Дата:

21 апреля 2016 г., 18:23:02

On 2016-04-21 14:15:53 -0400, Robert Haas wrote:
> On Tue, Mar 29, 2016 at 7:53 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On the WaitEventSet thread I posted a small patch to add kqueue
> > support[1].  Since then I peeked at how some other software[2]
> > interacts with kqueue and discovered that there are platforms
> > including NetBSD where kevent.udata is an intptr_t instead of a void
> > *.  Here's a version which should compile there.  Would any NetBSD
> > user be interested in testing this?  (An alternative would be to make
> > configure to test for this with some kind of AC_COMPILE_IFELSE
> > incantation but the steamroller cast is simpler.)
> 
> Did you code this up blind or do you have a NetBSD machine yourself?

RMT, what do you think, should we try to get this into 9.6? It's
feasible that the performance problem 98a64d0bd713c addressed is also
present on free/netbsd.

- Andres

Re: kqueue

От

Robert Haas

Дата:

21 апреля 2016 г., 18:25:11

On Thu, Apr 21, 2016 at 2:22 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-04-21 14:15:53 -0400, Robert Haas wrote:
>> On Tue, Mar 29, 2016 at 7:53 PM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>> > On the WaitEventSet thread I posted a small patch to add kqueue
>> > support[1].  Since then I peeked at how some other software[2]
>> > interacts with kqueue and discovered that there are platforms
>> > including NetBSD where kevent.udata is an intptr_t instead of a void
>> > *.  Here's a version which should compile there.  Would any NetBSD
>> > user be interested in testing this?  (An alternative would be to make
>> > configure to test for this with some kind of AC_COMPILE_IFELSE
>> > incantation but the steamroller cast is simpler.)
>>
>> Did you code this up blind or do you have a NetBSD machine yourself?
>
> RMT, what do you think, should we try to get this into 9.6? It's
> feasible that the performance problem 98a64d0bd713c addressed is also
> present on free/netbsd.

My personal opinion is that it would be a reasonable thing to do if
somebody can demonstrate that it actually solves a real problem.
Absent that, I don't think we should rush it in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: kqueue

От

Alvaro Herrera

Дата:

21 апреля 2016 г., 19:31:30

Robert Haas wrote:
> On Thu, Apr 21, 2016 at 2:22 PM, Andres Freund <andres@anarazel.de> wrote:
> > On 2016-04-21 14:15:53 -0400, Robert Haas wrote:
> >> On Tue, Mar 29, 2016 at 7:53 PM, Thomas Munro
> >> <thomas.munro@enterprisedb.com> wrote:
> >> > On the WaitEventSet thread I posted a small patch to add kqueue
> >> > support[1].  Since then I peeked at how some other software[2]
> >> > interacts with kqueue and discovered that there are platforms
> >> > including NetBSD where kevent.udata is an intptr_t instead of a void
> >> > *.  Here's a version which should compile there.  Would any NetBSD
> >> > user be interested in testing this?  (An alternative would be to make
> >> > configure to test for this with some kind of AC_COMPILE_IFELSE
> >> > incantation but the steamroller cast is simpler.)
> >>
> >> Did you code this up blind or do you have a NetBSD machine yourself?
> >
> > RMT, what do you think, should we try to get this into 9.6? It's
> > feasible that the performance problem 98a64d0bd713c addressed is also
> > present on free/netbsd.
> 
> My personal opinion is that it would be a reasonable thing to do if
> somebody can demonstrate that it actually solves a real problem.
> Absent that, I don't think we should rush it in.

My first question is whether there are platforms that use kqueue on
which the WaitEventSet stuff proves to be a bottleneck.  I vaguely
recall that MacOS X in particular doesn't scale terribly well for other
reasons, and I don't know if anybody runs *BSD in large machines.

On the other hand, there's plenty of hackers running their laptops on
MacOS X these days, so presumably any platform dependent problem would
be discovered quickly enough.  As for NetBSD, it seems mostly a fringe
platform, doesn't it?  We would discover serious dependency problems
quickly enough on the buildfarm ... except that the only netbsd
buildfarm member hasn't reported in over two weeks.

Am I mistaken in any of these points?

(Our coverage of the BSD platforms leaves much to be desired FWIW.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: kqueue

От

Robert Haas

Дата:

21 апреля 2016 г., 19:54:00

On Thu, Apr 21, 2016 at 3:31 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> On Thu, Apr 21, 2016 at 2:22 PM, Andres Freund <andres@anarazel.de> wrote:
>> > On 2016-04-21 14:15:53 -0400, Robert Haas wrote:
>> >> On Tue, Mar 29, 2016 at 7:53 PM, Thomas Munro
>> >> <thomas.munro@enterprisedb.com> wrote:
>> >> > On the WaitEventSet thread I posted a small patch to add kqueue
>> >> > support[1].  Since then I peeked at how some other software[2]
>> >> > interacts with kqueue and discovered that there are platforms
>> >> > including NetBSD where kevent.udata is an intptr_t instead of a void
>> >> > *.  Here's a version which should compile there.  Would any NetBSD
>> >> > user be interested in testing this?  (An alternative would be to make
>> >> > configure to test for this with some kind of AC_COMPILE_IFELSE
>> >> > incantation but the steamroller cast is simpler.)
>> >>
>> >> Did you code this up blind or do you have a NetBSD machine yourself?
>> >
>> > RMT, what do you think, should we try to get this into 9.6? It's
>> > feasible that the performance problem 98a64d0bd713c addressed is also
>> > present on free/netbsd.
>>
>> My personal opinion is that it would be a reasonable thing to do if
>> somebody can demonstrate that it actually solves a real problem.
>> Absent that, I don't think we should rush it in.
>
> My first question is whether there are platforms that use kqueue on
> which the WaitEventSet stuff proves to be a bottleneck.  I vaguely
> recall that MacOS X in particular doesn't scale terribly well for other
> reasons, and I don't know if anybody runs *BSD in large machines.
>
> On the other hand, there's plenty of hackers running their laptops on
> MacOS X these days, so presumably any platform dependent problem would
> be discovered quickly enough.  As for NetBSD, it seems mostly a fringe
> platform, doesn't it?  We would discover serious dependency problems
> quickly enough on the buildfarm ... except that the only netbsd
> buildfarm member hasn't reported in over two weeks.
>
> Am I mistaken in any of these points?
>
> (Our coverage of the BSD platforms leaves much to be desired FWIW.)

My impression is that the Linux problem only manifested itself on
large machines.  I might be wrong about that.  But if that's true,
then we might not see regressions on other platforms just because
people aren't running those operating systems on big enough hardware.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: kqueue

От

Andres Freund

Дата:

22 апреля 2016 г., 04:49:33

On 2016-04-21 14:25:06 -0400, Robert Haas wrote:
> On Thu, Apr 21, 2016 at 2:22 PM, Andres Freund <andres@anarazel.de> wrote:
> > On 2016-04-21 14:15:53 -0400, Robert Haas wrote:
> >> On Tue, Mar 29, 2016 at 7:53 PM, Thomas Munro
> >> <thomas.munro@enterprisedb.com> wrote:
> >> > On the WaitEventSet thread I posted a small patch to add kqueue
> >> > support[1].  Since then I peeked at how some other software[2]
> >> > interacts with kqueue and discovered that there are platforms
> >> > including NetBSD where kevent.udata is an intptr_t instead of a void
> >> > *.  Here's a version which should compile there.  Would any NetBSD
> >> > user be interested in testing this?  (An alternative would be to make
> >> > configure to test for this with some kind of AC_COMPILE_IFELSE
> >> > incantation but the steamroller cast is simpler.)
> >>
> >> Did you code this up blind or do you have a NetBSD machine yourself?
> >
> > RMT, what do you think, should we try to get this into 9.6? It's
> > feasible that the performance problem 98a64d0bd713c addressed is also
> > present on free/netbsd.
> 
> My personal opinion is that it would be a reasonable thing to do if
> somebody can demonstrate that it actually solves a real problem.
> Absent that, I don't think we should rush it in.

On linux you needed a 2 socket machine to demonstrate the problem, but
both old ones (my 2009 workstation) and new ones were sufficient. I'd be
surprised if the situation on freebsd is any better, except that you
might hit another scalability bottleneck earlier.

I doubt there's many real postgres instances operating on bigger
hardware on freebsd, with sufficient throughput to show the problem. So
I think the argument for including is more along trying to be "nice" to
more niche-y OSs.

I really don't have any opinion either way.

- Andres

Re: kqueue

От

Thomas Munro

Дата:

22 апреля 2016 г., 08:39:33

On Fri, Apr 22, 2016 at 12:21 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-04-21 14:25:06 -0400, Robert Haas wrote:
>> On Thu, Apr 21, 2016 at 2:22 PM, Andres Freund <andres@anarazel.de> wrote:
>> > On 2016-04-21 14:15:53 -0400, Robert Haas wrote:
>> >> On Tue, Mar 29, 2016 at 7:53 PM, Thomas Munro
>> >> <thomas.munro@enterprisedb.com> wrote:
>> >> > On the WaitEventSet thread I posted a small patch to add kqueue
>> >> > support[1].  Since then I peeked at how some other software[2]
>> >> > interacts with kqueue and discovered that there are platforms
>> >> > including NetBSD where kevent.udata is an intptr_t instead of a void
>> >> > *.  Here's a version which should compile there.  Would any NetBSD
>> >> > user be interested in testing this?  (An alternative would be to make
>> >> > configure to test for this with some kind of AC_COMPILE_IFELSE
>> >> > incantation but the steamroller cast is simpler.)
>> >>
>> >> Did you code this up blind or do you have a NetBSD machine yourself?
>> >
>> > RMT, what do you think, should we try to get this into 9.6? It's
>> > feasible that the performance problem 98a64d0bd713c addressed is also
>> > present on free/netbsd.
>>
>> My personal opinion is that it would be a reasonable thing to do if
>> somebody can demonstrate that it actually solves a real problem.
>> Absent that, I don't think we should rush it in.
>
> On linux you needed a 2 socket machine to demonstrate the problem, but
> both old ones (my 2009 workstation) and new ones were sufficient. I'd be
> surprised if the situation on freebsd is any better, except that you
> might hit another scalability bottleneck earlier.
>
> I doubt there's many real postgres instances operating on bigger
> hardware on freebsd, with sufficient throughput to show the problem. So
> I think the argument for including is more along trying to be "nice" to
> more niche-y OSs.

What has BSD ever done for us?!  (Joke...)

I vote to leave this patch in the next commitfest where it is, and
reconsider if someone shows up with a relevant problem report on large
systems.  I can't see any measurable performance difference on a 4
core laptop running FreeBSD 10.3.  Maybe kqueue will make more
difference even on smaller systems in future releases if we start
using big wait sets for distributed/asynchronous work, in-core
pooling/admission control etc.

Here's a new version of the patch that fixes some stupid bugs.  I have
run regression tests and some basic sanity checks on OSX 10.11.4,
FreeBSD 10.3, NetBSD 7.0 and OpenBSD 5.8.  There is still room to make
an improvement that would drop the syscall from AddWaitEventToSet and
ModifyWaitEvent, compressing wait set modifications and waiting into a
single syscall (kqueue's claimed advantage over the competition).

While doing that I discovered that unpatched master doesn't actually
build on recent NetBSD systems because our static function strtoi
clashes with a non-standard libc function of the same name[1] declared
in inttypes.h.  Maybe we should rename it, like in the attached?

[1] http://netbsd.gw.com/cgi-bin/man-cgi?strtoi++NetBSD-current

--
Thomas Munro
http://www.enterprisedb.com

On Wed, Sep 14, 2016 at 9:09 AM, Matteo Beccati <php@beccati.com> wrote:

Hi,

On 14/09/2016 00:06, Tom Lane wrote:
I'm inclined to think the kqueue patch is worth applying just on the
grounds that it makes things better on OS X and doesn't seem to hurt
on FreeBSD. Whether anyone would ever get to the point of seeing
intra-kernel contention on these platforms is hard to predict, but
we'd be ahead of the curve if so.

It would be good for someone else to reproduce my results though.
For one thing, 5%-ish is not that far above the noise level; maybe
what I'm measuring here is just good luck from relocation of critical
loops into more cache-line-friendly locations.

FWIW, I've tested HEAD vs patch on a 2-cpu low end NetBSD 7.0 i386 machine.

HEAD: 1890/1935/1889 tps
kqueue: 1905/1957/1932 tps

no weird surprises, and basically no differences either.

Cheers
--
Matteo Beccati

Development & Consulting - http://www.beccati.com/

Thomas Munro brought up in #postgresql on freenode needing someone to test a patch on a larger FreeBSD server. I've got a pretty decent machine (3.1Ghz Quad Core Xeon E3-1220V3, 16GB ECC RAM, ZFS mirror on WD Red HDD) so offered to give it a try.

Bench setup was:
pgbench -i -s 100 -d postgres

I ran this against 96rc1 instead of HEAD like most of the others in this thread seem to have done. Not sure if that makes a difference and can re-run if needed.
With higher concurrency, this seems to cause decreased performance. You can tell which of the runs is the kqueue patch by looking at the path to pgbench.

SINGLE PROCESS
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1_kqueue/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1547387
latency average: 0.039 ms
tps = 25789.750236 (including connections establishing)
tps = 25791.018293 (excluding connections establishing)
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1_kqueue/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1549442
latency average: 0.039 ms
tps = 25823.981255 (including connections establishing)
tps = 25825.189871 (excluding connections establishing)
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1_kqueue/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1547936
latency average: 0.039 ms
tps = 25798.572583 (including connections establishing)
tps = 25799.917170 (excluding connections establishing)

[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1520722
latency average: 0.039 ms
tps = 25343.122533 (including connections establishing)
tps = 25344.357116 (excluding connections establishing)
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S postgres -p 5496~
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1549282
latency average: 0.039 ms
tps = 25821.107595 (including connections establishing)
tps = 25822.407310 (excluding connections establishing)
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S postgres -p 5496~
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1541907
latency average: 0.039 ms
tps = 25698.025983 (including connections establishing)
tps = 25699.270663 (excluding connections establishing)

FOUR
/home/keith/pgsql96rc1_kqueue/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 4282185
latency average: 0.056 ms
tps = 71369.146931 (including connections establishing)
tps = 71372.646243 (excluding connections establishing)
[keith@corpus ~/postgresql-9.6rc1_kqueue]$ /home/keith/pgsql96rc1_kqueue/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 4777596
latency average: 0.050 ms
tps = 79625.214521 (including connections establishing)
tps = 79629.800123 (excluding connections establishing)
[keith@corpus ~/postgresql-9.6rc1_kqueue]$ /home/keith/pgsql96rc1_kqueue/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 4809132
latency average: 0.050 ms
tps = 80151.803249 (including connections establishing)
tps = 80155.903203 (excluding connections establishing)

/home/keith/pgsql96rc1/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 5114286
latency average: 0.047 ms
tps = 85236.858383 (including connections establishing)
tps = 85241.847800 (excluding connections establishing)
/home/keith/pgsql96rc1/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 5600194
latency average: 0.043 ms
tps = 93335.508864 (including connections establishing)
tps = 93340.970416 (excluding connections establishing)
/home/keith/pgsql96rc1/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 5606962
latency average: 0.043 ms
tps = 93447.905764 (including connections establishing)
tps = 93454.077142 (excluding connections establishing)

SIXTY-FOUR
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1_kqueue/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4084213
latency average: 0.940 ms
tps = 67633.476871 (including connections establishing)
tps = 67751.865998 (excluding connections establishing)
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1_kqueue/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4119994
latency average: 0.932 ms
tps = 68474.847365 (including connections establishing)
tps = 68540.221835 (excluding connections establishing)
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1_kqueue/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4068071
latency average: 0.944 ms
tps = 67192.603129 (including connections establishing)
tps = 67254.760177 (excluding connections establishing)

[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4281302
latency average: 0.897 ms
tps = 70147.847337 (including connections establishing)
tps = 70389.283564 (excluding connections establishing)
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4573114
latency average: 0.840 ms
tps = 74848.884475 (including connections establishing)
tps = 75102.862539 (excluding connections establishing)
[keith@corpus /tank/pgdata]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S postgres -p 5496
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4341447
latency average: 0.884 ms
tps = 72350.152281 (including connections establishing)
tps = 72421.831179 (excluding connections establishing)

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
http://www.keithf4.com

Re: kqueue

От

Thomas Munro

Дата:

14 сентября 2016 г., 23:04:22

On Thu, Sep 15, 2016 at 10:48 AM, Keith Fiske <keith@omniti.com> wrote:
> Thomas Munro brought up in #postgresql on freenode needing someone to test a
> patch on a larger FreeBSD server. I've got a pretty decent machine (3.1Ghz
> Quad Core Xeon E3-1220V3, 16GB ECC RAM, ZFS mirror on WD Red HDD) so offered
> to give it a try.
>
> Bench setup was:
> pgbench -i -s 100 -d postgres
>
> I ran this against 96rc1 instead of HEAD like most of the others in this
> thread seem to have done. Not sure if that makes a difference and can re-run
> if needed.
> With higher concurrency, this seems to cause decreased performance. You can
> tell which of the runs is the kqueue patch by looking at the path to
> pgbench.

Thanks Keith.  So to summarise, you saw no change with 1 client, but
with 4 clients you saw a significant drop in performance (~93K TPS ->
~80K TPS), and a smaller drop for 64 clients (~72 TPS -> ~68K TPS).
These results seem to be a nail in the coffin for this patch for now.

Thanks to everyone who tested.  I might be back in a later commitfest
if I can figure out why and how to fix it.

-- 
Thomas Munro
http://www.enterprisedb.com

Re: kqueue

От

Thomas Munro

Дата:

16 сентября 2016 г., 03:11:48

On Thu, Sep 15, 2016 at 11:04 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Thu, Sep 15, 2016 at 10:48 AM, Keith Fiske <keith@omniti.com> wrote:
>> Thomas Munro brought up in #postgresql on freenode needing someone to test a
>> patch on a larger FreeBSD server. I've got a pretty decent machine (3.1Ghz
>> Quad Core Xeon E3-1220V3, 16GB ECC RAM, ZFS mirror on WD Red HDD) so offered
>> to give it a try.
>>
>> Bench setup was:
>> pgbench -i -s 100 -d postgres
>>
>> I ran this against 96rc1 instead of HEAD like most of the others in this
>> thread seem to have done. Not sure if that makes a difference and can re-run
>> if needed.
>> With higher concurrency, this seems to cause decreased performance. You can
>> tell which of the runs is the kqueue patch by looking at the path to
>> pgbench.
>
> Thanks Keith.  So to summarise, you saw no change with 1 client, but
> with 4 clients you saw a significant drop in performance (~93K TPS ->
> ~80K TPS), and a smaller drop for 64 clients (~72 TPS -> ~68K TPS).
> These results seem to be a nail in the coffin for this patch for now.
>
> Thanks to everyone who tested.  I might be back in a later commitfest
> if I can figure out why and how to fix it.

Ok, here's a version tweaked to use EVFILT_PROC for postmaster death
detection instead of the pipe, as Tom Lane suggested in another
thread[1].

The pipe still exists and is used for PostmasterIsAlive(), and also
for the race case where kevent discovers that the PID doesn't exist
when you try to add it (presumably it died already, but we want to
defer the report of that until you call EventSetWait, so in that case
we stick the traditional pipe into the kqueue set as before so that
it'll fire a readable-because-EOF event then).

Still no change measurable on my laptop.  Keith, would you be able to
test this on your rig and see if it sucks any less than the last one?

[1] https://www.postgresql.org/message-id/13774.1473972000%40sss.pgh.pa.us

--
Thomas Munro
http://www.enterprisedb.com

Вложения

kqueue-v6.patch

Re: kqueue

От

Matteo Beccati

Дата:

20 сентября 2016 г., 12:19:46

Hi,

On 16/09/2016 05:11, Thomas Munro wrote:
> Still no change measurable on my laptop.  Keith, would you be able to
> test this on your rig and see if it sucks any less than the last one?

I've tested kqueue-v6.patch on the Celeron NetBSD machine and numbers 
were constantly lower by about 5-10% vs fairly recent HEAD (same as my 
last pgbench runs).

Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/

Re: kqueue

От

Keith Fiske

Дата:

28 сентября 2016 г., 20:09:53

On Thu, Sep 15, 2016 at 11:11 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

On Thu, Sep 15, 2016 at 11:04 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Thu, Sep 15, 2016 at 10:48 AM, Keith Fiske <keith@omniti.com> wrote:
>> Thomas Munro brought up in #postgresql on freenode needing someone to test a
>> patch on a larger FreeBSD server. I've got a pretty decent machine (3.1Ghz
>> Quad Core Xeon E3-1220V3, 16GB ECC RAM, ZFS mirror on WD Red HDD) so offered
>> to give it a try.
>>
>> Bench setup was:
>> pgbench -i -s 100 -d postgres
>>
>> I ran this against 96rc1 instead of HEAD like most of the others in this
>> thread seem to have done. Not sure if that makes a difference and can re-run
>> if needed.
>> With higher concurrency, this seems to cause decreased performance. You can
>> tell which of the runs is the kqueue patch by looking at the path to
>> pgbench.
>
> Thanks Keith. So to summarise, you saw no change with 1 client, but
> with 4 clients you saw a significant drop in performance (~93K TPS ->
> ~80K TPS), and a smaller drop for 64 clients (~72 TPS -> ~68K TPS).
> These results seem to be a nail in the coffin for this patch for now.
>
> Thanks to everyone who tested. I might be back in a later commitfest
> if I can figure out why and how to fix it.

Ok, here's a version tweaked to use EVFILT_PROC for postmaster death
detection instead of the pipe, as Tom Lane suggested in another
thread[1].

The pipe still exists and is used for PostmasterIsAlive(), and also
for the race case where kevent discovers that the PID doesn't exist
when you try to add it (presumably it died already, but we want to
defer the report of that until you call EventSetWait, so in that case
we stick the traditional pipe into the kqueue set as before so that
it'll fire a readable-because-EOF event then).

Still no change measurable on my laptop. Keith, would you be able to
test this on your rig and see if it sucks any less than the last one?

[1] https://www.postgresql.org/message-id/13774.1473972000%40sss.pgh.pa.us

--
Thomas Munro
http://www.enterprisedb.com

Ran benchmarks on unaltered 96rc1 again just to be safe. Those are first. Decided to throw a 32 process test in there as well to see if there's anything going on between 4 and 64

~/pgsql96rc1/bin/pgbench -i -s 100 -d pgbench -p 5496

[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1543809
latency average: 0.039 ms
tps = 25729.749474 (including connections establishing)
tps = 25731.006414 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1548340
latency average: 0.039 ms
tps = 25796.928387 (including connections establishing)
tps = 25798.275891 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1535072
latency average: 0.039 ms
tps = 25584.182830 (including connections establishing)
tps = 25585.487246 (excluding connections establishing)

[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 5621013
latency average: 0.043 ms
tps = 93668.594248 (including connections establishing)
tps = 93674.730914 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 5659929
latency average: 0.042 ms
tps = 94293.572928 (including connections establishing)
tps = 94300.500395 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 5649572
latency average: 0.042 ms
tps = 94115.854165 (including connections establishing)
tps = 94123.436211 (excluding connections establishing)

[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 32 -c 32 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 60 s
number of transactions actually processed: 5196336
latency average: 0.369 ms
tps = 86570.696138 (including connections establishing)
tps = 86608.648579 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 32 -c 32 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 60 s
number of transactions actually processed: 5202443
latency average: 0.369 ms
tps = 86624.724577 (including connections establishing)
tps = 86664.848857 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 32 -c 32 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 60 s
number of transactions actually processed: 5198412
latency average: 0.369 ms
tps = 86637.730825 (including connections establishing)
tps = 86668.706105 (excluding connections establishing)

[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4790285
latency average: 0.802 ms
tps = 79800.369679 (including connections establishing)
tps = 79941.243428 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4852921
latency average: 0.791 ms
tps = 79924.873678 (including connections establishing)
tps = 80179.182200 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4672965
latency average: 0.822 ms
tps = 77871.911528 (including connections establishing)
tps = 77961.614345 (excluding connections establishing)

~/pgsql96rc1_kqueue_v6/bin/pgbench -i -s 100 -d pgbench -p 5496

Ran more than 3 times on occasion since results were coming out differently by larger than expected values sometimes. Probably just something else running on the server at the time.

Again, no real noticeable difference for single process
For 4 processes, things are mostly the same and only very, very slightly lower, which is better than before.
For thirty-two processes, I saw a slight increase in performance for v6.
But, again, for 64 the results were slightly worse. Although the last run did almost match, most runs were lower. They're better than they were last time, but still not as good as the unchanged 96rc1

I can try running against HEAD if you'd like.

SINGLE
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1508745
latency average: 0.040 ms
tps = 25145.524948 (including connections establishing)
tps = 25146.433564 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1346454
latency average: 0.045 ms
tps = 22440.692798 (including connections establishing)
tps = 22441.527989 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1426906
latency average: 0.042 ms
tps = 23781.710780 (including connections establishing)
tps = 23782.523744 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1546252
latency average: 0.039 ms
tps = 25770.468513 (including connections establishing)
tps = 25771.352027 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 1 -c 1 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1542366
latency average: 0.039 ms
tps = 25705.706274 (including connections establishing)
tps = 25706.577285 (excluding connections establishing)

FOUR
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 5606159
latency average: 0.043 ms
tps = 93435.464767 (including connections establishing)
tps = 93442.716270 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 5602564
latency average: 0.043 ms
tps = 93375.528201 (including connections establishing)
tps = 93381.999147 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 4 -c 4 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 60 s
number of transactions actually processed: 5608675
latency average: 0.043 ms
tps = 93474.081114 (including connections establishing)
tps = 93481.634509 (excluding connections establishing)

THIRTY-TWO
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 32 -c 32 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 60 s
number of transactions actually processed: 5273952
latency average: 0.364 ms
tps = 87855.483112 (including connections establishing)
tps = 87880.762662 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 32 -c 32 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 60 s
number of transactions actually processed: 5294039
latency average: 0.363 ms
tps = 88126.254862 (including connections establishing)
tps = 88151.282371 (excluding connections establishing)
[keith@corpus ~]$
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 32 -c 32 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 60 s
number of transactions actually processed: 5279444
latency average: 0.364 ms
tps = 87867.500628 (including connections establishing)
tps = 87891.856414 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 32 -c 32 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 60 s
number of transactions actually processed: 5286405
latency average: 0.363 ms
tps = 88049.742194 (including connections establishing)
tps = 88077.409809 (excluding connections establishing)

SIXTY-FOUR
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4426565
latency average: 0.867 ms
tps = 72142.306576 (including connections establishing)
tps = 72305.201516 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4070048
latency average: 0.943 ms
tps = 66587.264608 (including connections establishing)
tps = 66711.820878 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4478535
latency average: 0.857 ms
tps = 72768.961061 (including connections establishing)
tps = 72930.488922 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4051086
latency average: 0.948 ms
tps = 66540.741821 (including connections establishing)
tps = 66601.943062 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4374049
latency average: 0.878 ms
tps = 72093.025134 (including connections establishing)
tps = 72271.145559 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v6/bin/pgbench -T 60 -j 64 -c 64 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 64
number of threads: 64
duration: 60 s
number of transactions actually processed: 4762663
latency average: 0.806 ms
tps = 79372.610362 (including connections establishing)
tps = 79535.601194 (excluding connections establishing)

As a sanity check I went back and ran the pgbench from the v5 patch to see if it was still lower. It is. So v6 seems to have a slight improvement in some cases.

[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v5/bin/pgbench -T 60 -j 32 -c 32 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 60 s
number of transactions actually processed: 4618814
latency average: 0.416 ms
tps = 76960.608378 (including connections establishing)
tps = 76981.609781 (excluding connections establishing)
[keith@corpus ~]$ /home/keith/pgsql96rc1_kqueue_v5/bin/pgbench -T 60 -j 32 -c 32 -M prepared -S -p 5496 pgbench
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 60 s
number of transactions actually processed: 4649745
latency average: 0.413 ms
tps = 77491.094077 (including connections establishing)
tps = 77525.443941 (excluding connections establishing)

Re: kqueue

От

Thomas Munro

Дата:

28 сентября 2016 г., 21:39:26

On Thu, Sep 29, 2016 at 9:09 AM, Keith Fiske <keith@omniti.com> wrote:
> On Thu, Sep 15, 2016 at 11:11 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> Ok, here's a version tweaked to use EVFILT_PROC for postmaster death
>> detection instead of the pipe, as Tom Lane suggested in another
>> thread[1].
>>
>> [...]
>
> Ran benchmarks on unaltered 96rc1 again just to be safe. Those are first.
> Decided to throw a 32 process test in there as well to see if there's
> anything going on between 4 and 64

Thanks!  A summary:

┌──────────────────┬─────────┬───────────┬────────────────────┬───────────┐
│       code       │ clients │  average  │ standard_deviation │  median   │
├──────────────────┼─────────┼───────────┼────────────────────┼───────────┤
│ 9.6rc1           │       1 │ 25704.923 │            108.766 │ 25731.006 │
│ 9.6rc1           │       4 │ 94032.889 │            322.562 │ 94123.436 │
│ 9.6rc1           │      32 │ 86647.401 │             33.616 │ 86664.849 │
│ 9.6rc1           │      64 │ 79360.680 │           1217.453 │ 79941.243 │
│ 9.6rc1/kqueue-v6 │       1 │ 24569.683 │           1433.339 │ 25146.434 │
│ 9.6rc1/kqueue-v6 │       4 │ 93435.450 │             50.214 │ 93442.716 │
│ 9.6rc1/kqueue-v6 │      32 │ 88000.328 │            135.143 │ 87891.856 │
│ 9.6rc1/kqueue-v6 │      64 │ 71726.034 │           4784.794 │ 72271.146 │
└──────────────────┴─────────┴───────────┴────────────────────┴───────────┘

┌─────────┬───────────┬───────────┬──────────────────────────┐
│ clients │ unpatched │  patched  │      percent_change      │
├─────────┼───────────┼───────────┼──────────────────────────┤
│       1 │ 25731.006 │ 25146.434 │ -2.271858317548874692000 │
│       4 │ 94123.436 │ 93442.716 │ -0.723220516514080510000 │
│      32 │ 86664.849 │ 87891.856 │  1.415807001521458833000 │
│      64 │ 79941.243 │ 72271.146 │ -9.594668173973727179000 │
└─────────┴───────────┴───────────┴──────────────────────────┘

The variation in the patched 64 client numbers is quite large, ranging
from ~66.5k to ~79.5k.  The highest number matched the unpatched
numbers which ranged 77.9k to 80k.  I wonder if that is noise and we
need to run longer (in which case the best outcome might be 'this
patch is neutral on FreeBSD'), or if something the patch does is doing
is causing that (for example maybe EVFILT_PROC proc filters causes
contention on the process table lock).

Matteo's results with the v6 patch on a low end NetBSD machine were
not good.  But the report at [1] implies that larger NetBSD and
OpenBSD systems have terrible problems with the
poll-postmaster-alive-pipe approach, which this EVFILT_PROC approach
would seem to address pretty well.

It's difficult to draw any conclusions at this point.

[1] https://www.postgresql.org/message-id/flat/20160915135755.GC19008%40genua.de

-- 
Thomas Munro
http://www.enterprisedb.com

Re: kqueue

От

Torsten Zuehlsdorff

Дата:

11 октября 2016 г., 08:23:47

On 28.09.2016 23:39, Thomas Munro wrote:
> On Thu, Sep 29, 2016 at 9:09 AM, Keith Fiske <keith@omniti.com> wrote:
>> On Thu, Sep 15, 2016 at 11:11 PM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>> Ok, here's a version tweaked to use EVFILT_PROC for postmaster death
>>> detection instead of the pipe, as Tom Lane suggested in another
>>> thread[1].
>>>
>>> [...]
>>
>> Ran benchmarks on unaltered 96rc1 again just to be safe. Those are first.
>> Decided to throw a 32 process test in there as well to see if there's
>> anything going on between 4 and 64
>
> Thanks!  A summary:
>
> [summary]
>
> The variation in the patched 64 client numbers is quite large, ranging
> from ~66.5k to ~79.5k.  The highest number matched the unpatched
> numbers which ranged 77.9k to 80k.  I wonder if that is noise and we
> need to run longer (in which case the best outcome might be 'this
> patch is neutral on FreeBSD'), or if something the patch does is doing
> is causing that (for example maybe EVFILT_PROC proc filters causes
> contention on the process table lock).
>
> [..]
>
> It's difficult to draw any conclusions at this point.

I'm currently setting up a new FreeBSD machine. Its a FreeBSD 11 with 
ZFS, 64 GB RAM and Quad Core. If you're interested in i can give you 
access for more tests this week. Maybe this will help to draw any 
conclusion.

Greetings,
Torsten

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

22 июня 2017 г., 10:19:00

On Tue, Oct 11, 2016 at 8:08 PM, Torsten Zuehlsdorff
<mailinglists@toco-domains.de> wrote:
> On 28.09.2016 23:39, Thomas Munro wrote:
>> It's difficult to draw any conclusions at this point.
>
> I'm currently setting up a new FreeBSD machine. Its a FreeBSD 11 with ZFS,
> 64 GB RAM and Quad Core. If you're interested in i can give you access for
> more tests this week. Maybe this will help to draw any conclusion.

I don't plan to resubmit this patch myself, but I was doing some
spring cleaning and rebasing today and I figured it might be worth
quietly leaving a working patch here just in case anyone from the
various BSD communities is interested in taking the idea further.

Some thoughts:  We could decide to make it the default on FooBSD but
not BarBSD according to experimental results... for example several
people reported that macOS developer machines run pgbench a bit
faster.  Also, we didn't ever get to the bottom of the complaint that
NetBSD and OpenBSD systems wake up every waiting backend when anyone
calls PostmasterIsAlive[1], which this patch should in theory fix (by
using EVFILT_PROC instead of waiting on that pipe).  On the other
hand, the fix for that may be to stop calling PostmasterIsAlive in
loops[2]!

[1] https://www.postgresql.org/message-id/CAEepm%3D27K-2AP1th97kiVvKpTuria9ocbjT0cXCJqnt4if5rJQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAEepm%3D3FW33PeRxt0jE4N0truJqOepp72R6W-zyM5mu1bxnZRw%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

kqueue-v7.patch

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

05 декабря 2017 г., 14:53:22

On Thu, Jun 22, 2017 at 7:19 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> I don't plan to resubmit this patch myself, but I was doing some
> spring cleaning and rebasing today and I figured it might be worth
> quietly leaving a working patch here just in case anyone from the
> various BSD communities is interested in taking the idea further.

Since there was a mention of kqueue on -hackers today, here's another
rebase.  I got curious just now and ran a very quick test on an AWS 64
vCPU m4.16xlarge instance running image "FreeBSD
11.1-STABLE-amd64-2017-08-08 - ami-00608178".  I set shared_buffers =
10GB and ran pgbench approximately the same way Heikki and Keith did
upthread:

pgbench -i -s 200 postgres
pgbench -M prepared  -j 6 -c 6 -S postgres -T60 -P1
pgbench -M prepared  -j 12 -c 12 -S postgres -T60 -P1
pgbench -M prepared  -j 24 -c 24 -S postgres -T60 -P1
pgbench -M prepared  -j 36 -c 36 -S postgres -T60 -P1
pgbench -M prepared  -j 48 -c 48 -S postgres -T60 -P1

The TPS numbers I got (including connections establishing) were:

clients    master    patched
      6   146,215    147,535 (+0.9%)
     12   273,056    280,505 (+2.7%)
     24   360,751    369,965 (+2.5%)
     36   413,147    420,769 (+1.8%)
     48   416,189    444,537 (+6.8%)

The patch appears to be doing something positive on this particular
system and that effect was stable over a few runs.

-- 
Thomas Munro
http://www.enterprisedb.com

Вложения

kqueue-v8.patch

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

11 апреля 2018 г., 04:05:17

On Wed, Dec 6, 2017 at 12:53 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Thu, Jun 22, 2017 at 7:19 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> I don't plan to resubmit this patch myself, but I was doing some
>> spring cleaning and rebasing today and I figured it might be worth
>> quietly leaving a working patch here just in case anyone from the
>> various BSD communities is interested in taking the idea further.

I heard through the grapevine of some people currently investigating
performance problems on busy FreeBSD systems, possibly related to the
postmaster pipe.  I suspect this patch might be a part of the solution
(other patches probably needed to get maximum value out of this patch:
reuse WaitEventSet objects in some key places, and get rid of high
frequency PostmasterIsAlive() read() calls).  The autoconf-fu in the
last version bit-rotted so it seemed like a good time to post a
rebased patch.

-- 
Thomas Munro
http://www.enterprisedb.com

Вложения

kqueue-v9.patch

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

21 мая 2018 г., 10:03:57

On Wed, Apr 11, 2018 at 1:05 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> I heard through the grapevine of some people currently investigating
> performance problems on busy FreeBSD systems, possibly related to the
> postmaster pipe.  I suspect this patch might be a part of the solution
> (other patches probably needed to get maximum value out of this patch:
> reuse WaitEventSet objects in some key places, and get rid of high
> frequency PostmasterIsAlive() read() calls).  The autoconf-fu in the
> last version bit-rotted so it seemed like a good time to post a
> rebased patch.

Once I knew how to get a message resent to someone who wasn't
subscribed to our mailing list at the time it was sent[1] so they
could join an existing thread.  I don't know how to do that with the
new mailing list software, so I'm CC'ing Mateusz so he can share his
results on-thread.  Sorry for the noise.

[1] https://www.postgresql.org/message-id/CAEepm=0-KsV4Sj-0Qd4rMCg7UYdOQA=TUjLkEZOX7h_qiQQaCA@mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] kqueue

От

Mateusz Guzik

Дата:

21 мая 2018 г., 10:27:15

On Mon, May 21, 2018 at 9:03 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

On Wed, Apr 11, 2018 at 1:05 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> I heard through the grapevine of some people currently investigating
> performance problems on busy FreeBSD systems, possibly related to the
> postmaster pipe. I suspect this patch might be a part of the solution
> (other patches probably needed to get maximum value out of this patch:
> reuse WaitEventSet objects in some key places, and get rid of high
> frequency PostmasterIsAlive() read() calls). The autoconf-fu in the
> last version bit-rotted so it seemed like a good time to post a
> rebased patch.

Hi everyone,

I have benchmarked the change on a FreeBSD box and found an big
performance win once the number of clients goes beyond the number of
hardware threads on the target machine. For smaller number of clients

the win was very modest.

The test was performed few weeks ago.

For convenience PostgreSQL 10.3 as found in the ports tree was used.

3 variants were tested:
- stock 10.3
- stock 10.3 + pdeathsig
- stock 10.3 + pdeathsig + kqueue

Appropriate patches were provided by Thomas.

In order to keep this message PG-13 I'm not going to show the actual
script, but a mere outline:

for i in $(seq 1 10): do
        for t in vanilla pdeathsig pdeathsig_kqueue; do
                start up the relevant version
                for c in 32 64 96; do
                        pgbench -j 96 -c $c -T 120 -M prepared -S -U bench -h 172.16.0.2 -P1 bench > ${t}-${c}-out-warmup 2>&1
                        pgbench -j 96 -c $c -T 120 -M prepared -S -U bench -h 172.16.0.2 -P1 bench > ${t}-${c}-out 2>&1
                done
                shutdown the relevant version
done

Data from the warmup is not used. All the data was pre-read prior to the
test.

PostgreSQL was configured with 32GB of shared buffers and 200 max
connections, otherwise it was the default.

The server is:
Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz
2 package(s) x 8 core(s) x 2 hardware threads

i.e. 32 threads in total.

running FreeBSD -head with 'options NUMA' in kernel config and
sysctl net.inet.tcp.per_cpu_timers=1 on top of zfs.

The load was generated from a different box over a 100Gbit ethernet link.

x cumulative-tps-vanilla-32
+ cumulative-tps-pdeathsig-32
* cumulative-tps-pdeathsig_kqueue-32
+------------------------------------------------------------------------+
|+   + x+*     x+ * x       *        + * *       * * ** * **        *|
|   |_____|__M_A___M_A_____|____|             |________MA________|       |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 10     442898.77     448476.81     444805.17     445062.08     1679.7169
+ 10      442057.2     447835.46     443840.28     444235.01     1771.2254
No difference proven at 95.0% confidence
* 10     448138.07     452786.41     450274.56     450311.51     1387.2927
Difference at 95.0% confidence
        5249.43 +/- 1447.41
        1.17948% +/- 0.327501%
        (Student's t, pooled s = 1540.46)
x cumulative-tps-vanilla-64
+ cumulative-tps-pdeathsig-64
* cumulative-tps-pdeathsig_kqueue-64
+------------------------------------------------------------------------+
|                                                                     ** |
|                                                                     ** |
| xx x +                                                            ***|
|++**x *+*++                                                          ***|
| ||_A|M_|                                                           |A |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 10     411849.26      422145.5     416043.77      416061.9     3763.2545
+ 10     407123.74     425727.84     419908.73      417480.7     6817.5549
No difference proven at 95.0% confidence
* 10     542032.71     546106.93     543948.05     543874.06     1234.1788
Difference at 95.0% confidence
        127812 +/- 2631.31
        30.7195% +/- 0.809892%
        (Student's t, pooled s = 2800.47)
x cumulative-tps-vanilla-96
+ cumulative-tps-pdeathsig-96
* cumulative-tps-pdeathsig_kqueue-96
+------------------------------------------------------------------------+
|                                                                      * |
|                                                                      * |
|                                                                      * |
|                                                                      * |
| + x                                                                 * |
| *xxx+                                                               **|
|+ *****+                                                            * **|
| |MA||                                                              |A||
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 10      325263.7        336338     332399.16     331321.82     3571.2478
+ 10     321213.33     338669.66     329553.78     330903.58      5652.008
No difference proven at 95.0% confidence
* 10     503877.22     511449.96     508708.41     508808.51     2016.9483
Difference at 95.0% confidence
        177487 +/- 2724.98
        53.5693% +/- 1.17178%
        (Student's t, pooled s = 2900.16)

Mateusz Guzik <mjguzik gmail.com>

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

22 мая 2018 г., 03:07:17

On Mon, May 21, 2018 at 7:27 PM, Mateusz Guzik <mjguzik@gmail.com> wrote:
> I have benchmarked the change on a FreeBSD box and found an big
> performance win once the number of clients goes beyond the number of
> hardware threads on the target machine. For smaller number of clients
> the win was very modest.

Thanks for the report!  This is good news for the patch, if we can
explain a few mysteries.

> 3 variants were tested:
> - stock 10.3
> - stock 10.3 + pdeathsig
> - stock 10.3 + pdeathsig + kqueue

For the record, "pdeathsig" refers to another patch of mine[1] that is
not relevant to this test (it's a small change in the recovery loop,
important for replication but not even reached here).

> [a bunch of neat output from ministat]

So to summarise your results:

32 connections: ~445k -> ~450k = +1.2%
64 connections: ~416k -> ~544k = +30.7%
96 connections: ~331k -> ~508k = +53.6%

As you added more connections above your thread count, stock 10.3's
TPS number went down, but with the patch it went up.  So now we have
to explain why you see a huge performance boost but others reported a
modest gain or in some cases loss.  The main things that jump out:

1.  You used TCP sockets and ran pgbench on another machine, while
others used Unix domain sockets.
2.  You're running a newer/bleeding edge kernel.
3.  You used more CPUs than most reporters.

For the record, Mateusz and others discovered some fixable global lock
contention in the Unix domain socket layer that is now being hacked
on[2], though it's not clear if that'd affect the results reported
earlier or not.

[1] https://www.postgresql.org/message-id/CAEepm%3D0w9AAHAH73-tkZ8VS2Lg6JzY4ii3TG7t-R%2B_MWyUAk9g%40mail.gmail.com
[2] https://reviews.freebsd.org/D15430

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

28 сентября 2018 г., 01:55:13

On Tue, May 22, 2018 at 12:07 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Mon, May 21, 2018 at 7:27 PM, Mateusz Guzik <mjguzik@gmail.com> wrote:
> > I have benchmarked the change on a FreeBSD box and found an big
> > performance win once the number of clients goes beyond the number of
> > hardware threads on the target machine. For smaller number of clients
> > the win was very modest.
>
> So to summarise your results:
>
> 32 connections: ~445k -> ~450k = +1.2%
> 64 connections: ~416k -> ~544k = +30.7%
> 96 connections: ~331k -> ~508k = +53.6%

I would like to commit this patch for PostgreSQL 12, based on this
report.  We know it helps performance on macOS developer machines and
big FreeBSD servers, and it is the right kernel interface for the job
on principle.  Matteo Beccati reported a 5-10% performance drop on a
low-end Celeron NetBSD box which we have no explanation for, and we
have no reports from server-class machines on that OS -- so perhaps we
(or the NetBSD port?) should consider building with WAIT_USE_POLL on
NetBSD until someone can figure out what needs to be fixed there
(possibly on the NetBSD side)?

Here's a rebased patch, which I'm adding to the to November CF to give
people time to retest, object, etc if they want to.

-- 
Thomas Munro
http://www.enterprisedb.com

Вложения

0001-Add-kqueue-support-to-WaitEventSet-v10.patch

Re: [HACKERS] kqueue

От

Andres Freund

Дата:

28 сентября 2018 г., 02:09:11

Hi,

On 2018-09-28 10:55:13 +1200, Thomas Munro wrote:
> On Tue, May 22, 2018 at 12:07 PM Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On Mon, May 21, 2018 at 7:27 PM, Mateusz Guzik <mjguzik@gmail.com> wrote:
> > > I have benchmarked the change on a FreeBSD box and found an big
> > > performance win once the number of clients goes beyond the number of
> > > hardware threads on the target machine. For smaller number of clients
> > > the win was very modest.
> >
> > So to summarise your results:
> >
> > 32 connections: ~445k -> ~450k = +1.2%
> > 64 connections: ~416k -> ~544k = +30.7%
> > 96 connections: ~331k -> ~508k = +53.6%
> 
> I would like to commit this patch for PostgreSQL 12, based on this
> report.  We know it helps performance on macOS developer machines and
> big FreeBSD servers, and it is the right kernel interface for the job
> on principle.

Seems reasonable.


> Matteo Beccati reported a 5-10% performance drop on a
> low-end Celeron NetBSD box which we have no explanation for, and we
> have no reports from server-class machines on that OS -- so perhaps we
> (or the NetBSD port?) should consider building with WAIT_USE_POLL on
> NetBSD until someone can figure out what needs to be fixed there
> (possibly on the NetBSD side)?

Yea, I'm not too worried about that. It'd be great to test that, but
otherwise I'm also ok to just plonk that into the template.

> @@ -576,6 +592,10 @@ CreateWaitEventSet(MemoryContext context, int nevents)
>      if (fcntl(set->epoll_fd, F_SETFD, FD_CLOEXEC) == -1)
>          elog(ERROR, "fcntl(F_SETFD) failed on epoll descriptor: %m");
>  #endif                            /* EPOLL_CLOEXEC */
> +#elif defined(WAIT_USE_KQUEUE)
> +    set->kqueue_fd = kqueue();
> +    if (set->kqueue_fd < 0)
> +        elog(ERROR, "kqueue failed: %m");
>  #elif defined(WAIT_USE_WIN32)

Is this automatically opened with some FD_CLOEXEC equivalent?


> +static inline void
> +WaitEventAdjustKqueueAdd(struct kevent *k_ev, int filter, int action,
> +                         WaitEvent *event)
> +{
> +    k_ev->ident = event->fd;
> +    k_ev->filter = filter;
> +    k_ev->flags = action | EV_CLEAR;
> +    k_ev->fflags = 0;
> +    k_ev->data = 0;
> +
> +    /*
> +     * On most BSD family systems, udata is of type void * so we could simply
> +     * assign event to it without casting, or use the EV_SET macro instead of
> +     * filling in the struct manually.  Unfortunately, NetBSD and possibly
> +     * others have it as intptr_t, so here we wallpaper over that difference
> +     * with an unsightly lvalue cast.
> +     */
> +    *((WaitEvent **)(&k_ev->udata)) = event;

I'm mildly inclined to hide that behind a macro, so the other places
have a reference, via the macro definition, to this too.

> +    if (rc < 0 && event->events == WL_POSTMASTER_DEATH && errno == ESRCH)
> +    {
> +        /*
> +         * The postmaster is already dead.  Defer reporting this to the caller
> +         * until wait time, for compatibility with the other implementations.
> +         * To do that we will now add the regular alive pipe.
> +         */
> +        WaitEventAdjustKqueueAdd(&k_ev[0], EVFILT_READ, EV_ADD, event);
> +        rc = kevent(set->kqueue_fd, &k_ev[0], count, NULL, 0, NULL);
> +    }

That's, ... not particulary pretty. Kinda wonder if we shouldn't instead
just add a 'pending_events' field, that we can check at wait time.

> diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
> index 90dda8ea050..4bcabc3b381 100644
> --- a/src/include/pg_config.h.in
> +++ b/src/include/pg_config.h.in
> @@ -330,6 +330,9 @@
>  /* Define to 1 if you have isinf(). */
>  #undef HAVE_ISINF
>  
> +/* Define to 1 if you have the `kqueue' function. */
> +#undef HAVE_KQUEUE
> +
>  /* Define to 1 if you have the <langinfo.h> header file. */
>  #undef HAVE_LANGINFO_H
>  
> @@ -598,6 +601,9 @@
>  /* Define to 1 if you have the <sys/epoll.h> header file. */
>  #undef HAVE_SYS_EPOLL_H
>  
> +/* Define to 1 if you have the <sys/event.h> header file. */
> +#undef HAVE_SYS_EVENT_H
> +
>  /* Define to 1 if you have the <sys/ipc.h> header file. */
>  #undef HAVE_SYS_IPC_H

Should adjust pg_config.win32.h too.

Greetings,

Andres Freund

Re: [HACKERS] kqueue

От

Matteo Beccati

Дата:

28 сентября 2018 г., 10:01:26

Hi Thomas,

On 28/09/2018 00:55, Thomas Munro wrote:
> I would like to commit this patch for PostgreSQL 12, based on this
> report.  We know it helps performance on macOS developer machines and
> big FreeBSD servers, and it is the right kernel interface for the job
> on principle.  Matteo Beccati reported a 5-10% performance drop on a
> low-end Celeron NetBSD box which we have no explanation for, and we
> have no reports from server-class machines on that OS -- so perhaps we
> (or the NetBSD port?) should consider building with WAIT_USE_POLL on
> NetBSD until someone can figure out what needs to be fixed there
> (possibly on the NetBSD side)?

Thanks for keeping me in the loop.

Out of curiosity (and time permitting) I'll try to spin up a NetBSD 8 VM
and run some benchmarks, but I guess we should leave it up to the pkgsrc
people to eventually change the build flags.


Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

28 сентября 2018 г., 15:19:58

On Fri, Sep 28, 2018 at 11:09 AM Andres Freund <andres@anarazel.de> wrote:
> On 2018-09-28 10:55:13 +1200, Thomas Munro wrote:
> > Matteo Beccati reported a 5-10% performance drop on a
> > low-end Celeron NetBSD box which we have no explanation for, and we
> > have no reports from server-class machines on that OS -- so perhaps we
> > (or the NetBSD port?) should consider building with WAIT_USE_POLL on
> > NetBSD until someone can figure out what needs to be fixed there
> > (possibly on the NetBSD side)?
>
> Yea, I'm not too worried about that. It'd be great to test that, but
> otherwise I'm also ok to just plonk that into the template.

Thanks for the review!  Ok, if we don't get a better idea I'll put
this in src/template/netbsd:

CPPFLAGS="$CPPFLAGS -DWAIT_USE_POLL"

> > @@ -576,6 +592,10 @@ CreateWaitEventSet(MemoryContext context, int nevents)
> >       if (fcntl(set->epoll_fd, F_SETFD, FD_CLOEXEC) == -1)
> >               elog(ERROR, "fcntl(F_SETFD) failed on epoll descriptor: %m");
> >  #endif                                                       /* EPOLL_CLOEXEC */
> > +#elif defined(WAIT_USE_KQUEUE)
> > +     set->kqueue_fd = kqueue();
> > +     if (set->kqueue_fd < 0)
> > +             elog(ERROR, "kqueue failed: %m");
> >  #elif defined(WAIT_USE_WIN32)
>
> Is this automatically opened with some FD_CLOEXEC equivalent?

No.  Hmm, I thought it wasn't necessary because kqueue descriptors are
not inherited and backends don't execve() directly without forking,
but I guess it can't hurt to add a fcntl() call.  Done.

> > +     *((WaitEvent **)(&k_ev->udata)) = event;
>
> I'm mildly inclined to hide that behind a macro, so the other places
> have a reference, via the macro definition, to this too.

Done.

> > +     if (rc < 0 && event->events == WL_POSTMASTER_DEATH && errno == ESRCH)
> > +     {
> > +             /*
> > +              * The postmaster is already dead.  Defer reporting this to the caller
> > +              * until wait time, for compatibility with the other implementations.
> > +              * To do that we will now add the regular alive pipe.
> > +              */
> > +             WaitEventAdjustKqueueAdd(&k_ev[0], EVFILT_READ, EV_ADD, event);
> > +             rc = kevent(set->kqueue_fd, &k_ev[0], count, NULL, 0, NULL);
> > +     }
>
> That's, ... not particulary pretty. Kinda wonder if we shouldn't instead
> just add a 'pending_events' field, that we can check at wait time.

Done.

> > +/* Define to 1 if you have the `kqueue' function. */
> > +#undef HAVE_KQUEUE
> > +

> Should adjust pg_config.win32.h too.

Done.

-- 
Thomas Munro
http://www.enterprisedb.com

Вложения

0001-Add-kqueue-2-support-for-WaitEventSet-v11.patch

Re: [HACKERS] kqueue

От

Matteo Beccati

Дата:

29 сентября 2018 г., 10:51:22

On 28/09/2018 14:19, Thomas Munro wrote:
> On Fri, Sep 28, 2018 at 11:09 AM Andres Freund <andres@anarazel.de> wrote:
>> On 2018-09-28 10:55:13 +1200, Thomas Munro wrote:
>>> Matteo Beccati reported a 5-10% performance drop on a
>>> low-end Celeron NetBSD box which we have no explanation for, and we
>>> have no reports from server-class machines on that OS -- so perhaps we
>>> (or the NetBSD port?) should consider building with WAIT_USE_POLL on
>>> NetBSD until someone can figure out what needs to be fixed there
>>> (possibly on the NetBSD side)?
>>
>> Yea, I'm not too worried about that. It'd be great to test that, but
>> otherwise I'm also ok to just plonk that into the template.
> 
> Thanks for the review!  Ok, if we don't get a better idea I'll put
> this in src/template/netbsd:
> 
> CPPFLAGS="$CPPFLAGS -DWAIT_USE_POLL"

A quick test on a 8 vCPU / 4GB RAM virtual machine running a fresh
install of NetBSD 8.0 again shows that kqueue is consistently slower
running pgbench vs unpatched master on tcp-b like pgbench workloads:

~1200tps vs ~1400tps w/ 96 clients and threads, scale factor 10

while on select only benchmarks the difference is below the noise floor,
with both doing roughly the same ~30k tps.

Out of curiosity, I've installed FreBSD on an identically specced VM,
and the select benchmark was ~75k tps for kqueue vs ~90k tps on
unpatched master, so maybe there's something wrong I'm doing when
benchmarking. Could you please provide proper instructions?


Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

30 сентября 2018 г., 05:36:18

On Sat, Sep 29, 2018 at 7:51 PM Matteo Beccati <php@beccati.com> wrote:
> On 28/09/2018 14:19, Thomas Munro wrote:
> > On Fri, Sep 28, 2018 at 11:09 AM Andres Freund <andres@anarazel.de> wrote:
> >> On 2018-09-28 10:55:13 +1200, Thomas Munro wrote:
> >>> Matteo Beccati reported a 5-10% performance drop on a
> >>> low-end Celeron NetBSD box which we have no explanation for, and we
> >>> have no reports from server-class machines on that OS -- so perhaps we
> >>> (or the NetBSD port?) should consider building with WAIT_USE_POLL on
> >>> NetBSD until someone can figure out what needs to be fixed there
> >>> (possibly on the NetBSD side)?
> >>
> >> Yea, I'm not too worried about that. It'd be great to test that, but
> >> otherwise I'm also ok to just plonk that into the template.
> >
> > Thanks for the review!  Ok, if we don't get a better idea I'll put
> > this in src/template/netbsd:
> >
> > CPPFLAGS="$CPPFLAGS -DWAIT_USE_POLL"
>
> A quick test on a 8 vCPU / 4GB RAM virtual machine running a fresh
> install of NetBSD 8.0 again shows that kqueue is consistently slower
> running pgbench vs unpatched master on tcp-b like pgbench workloads:
>
> ~1200tps vs ~1400tps w/ 96 clients and threads, scale factor 10
>
> while on select only benchmarks the difference is below the noise floor,
> with both doing roughly the same ~30k tps.
>
> Out of curiosity, I've installed FreBSD on an identically specced VM,
> and the select benchmark was ~75k tps for kqueue vs ~90k tps on
> unpatched master, so maybe there's something wrong I'm doing when
> benchmarking. Could you please provide proper instructions?

Ouch.  What kind of virtualisation is this?  Which version of FreeBSD?
 Not sure if it's relevant, but do you happen to see gettimeofday()
showing up as a syscall, if you truss a backend running pgbench?

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] kqueue

От

Matteo Beccati

Дата:

30 сентября 2018 г., 11:49:21

Hi Thomas,

On 30/09/2018 04:36, Thomas Munro wrote:
> On Sat, Sep 29, 2018 at 7:51 PM Matteo Beccati <php@beccati.com> wrote:
>> Out of curiosity, I've installed FreBSD on an identically specced VM,
>> and the select benchmark was ~75k tps for kqueue vs ~90k tps on
>> unpatched master, so maybe there's something wrong I'm doing when
>> benchmarking. Could you please provide proper instructions?
> 
> Ouch.  What kind of virtualisation is this?  Which version of FreeBSD?
>  Not sure if it's relevant, but do you happen to see gettimeofday()
> showing up as a syscall, if you truss a backend running pgbench?

I downloaded 11.2 as VHD file in order to run on MS Hyper-V / Win10 Pro.

Yes, I saw plenty of gettimeofday calls when running truss:

> gettimeofday({ 1538297117.071344 },0x0)          = 0 (0x0)
> gettimeofday({ 1538297117.071743 },0x0)          = 0 (0x0)
> gettimeofday({ 1538297117.072021 },0x0)          = 0 (0x0)
> getpid()                                         = 766 (0x2fe)
> __sysctl(0x7fffffffce90,0x4,0x0,0x0,0x801891000,0x2b) = 0 (0x0)
> gettimeofday({ 1538297117.072944 },0x0)          = 0 (0x0)
> getpid()                                         = 766 (0x2fe)
> __sysctl(0x7fffffffce90,0x4,0x0,0x0,0x801891000,0x29) = 0 (0x0)
> gettimeofday({ 1538297117.073682 },0x0)          = 0 (0x0)
> sendto(9,"2\0\0\0\^DT\0\0\0!\0\^Aabalance"...,71,0,NULL,0) = 71 (0x47)
> recvfrom(9,"B\0\0\0\^\\0P0_1\0\0\0\0\^A\0\0"...,8192,0,NULL,0x0) = 51 (0x33)
> gettimeofday({ 1538297117.074955 },0x0)          = 0 (0x0)
> gettimeofday({ 1538297117.075308 },0x0)          = 0 (0x0)
> getpid()                                         = 766 (0x2fe)
> __sysctl(0x7fffffffce90,0x4,0x0,0x0,0x801891000,0x29) = 0 (0x0)
> gettimeofday({ 1538297117.076252 },0x0)          = 0 (0x0)
> gettimeofday({ 1538297117.076431 },0x0)          = 0 (0x0)
> gettimeofday({ 1538297117.076678 },0x0^C)                = 0 (0x0)



Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

01 октября 2018 г., 02:09:43

On Sun, Sep 30, 2018 at 9:49 PM Matteo Beccati <php@beccati.com> wrote:
> On 30/09/2018 04:36, Thomas Munro wrote:
> > On Sat, Sep 29, 2018 at 7:51 PM Matteo Beccati <php@beccati.com> wrote:
> >> Out of curiosity, I've installed FreBSD on an identically specced VM,
> >> and the select benchmark was ~75k tps for kqueue vs ~90k tps on
> >> unpatched master, so maybe there's something wrong I'm doing when
> >> benchmarking. Could you please provide proper instructions?
> >
> > Ouch.  What kind of virtualisation is this?  Which version of FreeBSD?
> >  Not sure if it's relevant, but do you happen to see gettimeofday()
> > showing up as a syscall, if you truss a backend running pgbench?
>
> I downloaded 11.2 as VHD file in order to run on MS Hyper-V / Win10 Pro.
>
> Yes, I saw plenty of gettimeofday calls when running truss:
>
> > gettimeofday({ 1538297117.071344 },0x0)          = 0 (0x0)
> > gettimeofday({ 1538297117.071743 },0x0)          = 0 (0x0)
> > gettimeofday({ 1538297117.072021 },0x0)          = 0 (0x0)

Ok.  Those syscalls show up depending on your
kern.timecounter.hardware setting and virtualised hardware: just like
on Linux, gettimeofday() can be a cheap userspace operation (vDSO)
that avoids the syscall path, or not.  I'm not seeing any reason to
think that's relevant here.

> > getpid()                                         = 766 (0x2fe)
> > __sysctl(0x7fffffffce90,0x4,0x0,0x0,0x801891000,0x2b) = 0 (0x0)
> > gettimeofday({ 1538297117.072944 },0x0)          = 0 (0x0)
> > getpid()                                         = 766 (0x2fe)
> > __sysctl(0x7fffffffce90,0x4,0x0,0x0,0x801891000,0x29) = 0 (0x0)

That's setproctitle().  Those syscalls go away if you use FreeBSD 12
(which has setproctitle_fast()).  If you fix both of those problems,
you are left with just:

> > sendto(9,"2\0\0\0\^DT\0\0\0!\0\^Aabalance"...,71,0,NULL,0) = 71 (0x47)
> > recvfrom(9,"B\0\0\0\^\\0P0_1\0\0\0\0\^A\0\0"...,8192,0,NULL,0x0) = 51 (0x33)

These are the only syscalls I see for each pgbench -S transaction on
my bare metal machine: just the network round trip.  The funny thing
is ... there are almost no kevent() calls.

I managed to reproduce the regression (~70k -> ~50k) using a prewarmed
scale 10 select-only pgbench with 2GB of shared_buffers (so it all
fits), with -j 96 -c 96 on an 8 vCPU AWS t2.2xlarge running FreeBSD 12
ALPHA8.  Here is what truss -c says, capturing data from one backend
for about 10 seconds:

syscall                     seconds   calls  errors
sendto                  0.396840146    3452       0
recvfrom                0.415802029    3443       6
kevent                  0.000626393       6       0
gettimeofday            2.723923249   24053       0
                      ------------- ------- -------
                        3.537191817   30954       6

(There's no regression with -j 8 -c 8, the problem is when
significantly overloaded, the same circumstances under which Matheusz
reported a great improvement).  So... it's very rarely accessing the
kqueue directly... but its existence somehow slows things down.
Curiously, when using poll() it's actually calling poll() ~90/sec for
me:

syscall                     seconds   calls  errors
sendto                  0.352784808    3226       0
recvfrom                0.614855254    4125     916
poll                    0.319396480     916       0
gettimeofday            2.659035352   22456       0
                      ------------- ------- -------
                        3.946071894   30723     916

I don't know what's going on here.  Based on the reports so far, we
know that kqueue gives a speedup when using bare metal with pgbench
running on a different machine, but a slowdown when using
virtualisation and pgbench running on the same machine (and I just
checked that that's observable with both Unix sockets and TCP
sockets).  That gave me the idea of looking at pgbench itself:

Unpatched:

syscall                     seconds   calls  errors
ppoll                   0.004869268       1       0
sendto                 16.489416911    7033       0
recvfrom               21.137606238    7049       0
                      ------------- ------- -------
                       37.631892417   14083       0

Patched:

syscall                     seconds   calls  errors
ppoll                   0.002773195       1       0
sendto                 16.597880468    7217       0
recvfrom               25.646406008    7238       0
                      ------------- ------- -------
                       42.247059671   14456       0

I don't know why the existence of the kqueue should make recvfrom()
slower on the pgbench side.  That's probably something to look into
off-line with some FreeBSD guru help.  Degraded performance for
clients on the same machine does seem to be a show stopper for this
patch for now.  Thanks for testing!

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] kqueue

От

Matteo Beccati

Дата:

01 октября 2018 г., 20:25:45

Hi Thomas,

On 01/10/2018 01:09, Thomas Munro wrote:
> I don't know why the existence of the kqueue should make recvfrom()
> slower on the pgbench side.  That's probably something to look into
> off-line with some FreeBSD guru help.  Degraded performance for
> clients on the same machine does seem to be a show stopper for this
> patch for now.  Thanks for testing!

Glad to be helpful!

I've tried running pgbench from a separate VM and in fact kqueue 
consistently takes the lead with 5-10% more tps on select/prepared 
pgbench on NetBSD too.

What I have observed is that sys cpu usage is ~65% (35% idle) with 
kqueue, while unpatched master averages at 55% (45% idle): relatively 
speaking that's almost 25% less idle cpu available for a local pgbench 
to do its own stuff.

Running pgbench locally shows an average 47% usr / 53% sys cpu 
distribution w/ kqueue vs more like 50-50 w/ vanilla, so I'm inclined to 
think that's the reason why we see a performance drop instead. Thoguhts?

Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/

Re: [HACKERS] kqueue

От

Andres Freund

Дата:

01 октября 2018 г., 20:28:03

On 2018-10-01 19:25:45 +0200, Matteo Beccati wrote:
> On 01/10/2018 01:09, Thomas Munro wrote:
> > I don't know why the existence of the kqueue should make recvfrom()
> > slower on the pgbench side.  That's probably something to look into
> > off-line with some FreeBSD guru help.  Degraded performance for
> > clients on the same machine does seem to be a show stopper for this
> > patch for now.  Thanks for testing!
> 
> Glad to be helpful!
> 
> I've tried running pgbench from a separate VM and in fact kqueue
> consistently takes the lead with 5-10% more tps on select/prepared pgbench
> on NetBSD too.
> 
> What I have observed is that sys cpu usage is ~65% (35% idle) with kqueue,
> while unpatched master averages at 55% (45% idle): relatively speaking
> that's almost 25% less idle cpu available for a local pgbench to do its own
> stuff.

This suggest that either the the wakeup logic between kqueue and poll,
or the internal locking could be at issue.  Is it possible that poll
triggers a directed wakeup path, but kqueue doesn't?

Greetings,

Andres Freund

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

09 октября 2018 г., 21:10:01

On Tue, Oct 2, 2018 at 6:28 AM Andres Freund <andres@anarazel.de> wrote:
> On 2018-10-01 19:25:45 +0200, Matteo Beccati wrote:
> > On 01/10/2018 01:09, Thomas Munro wrote:
> > > I don't know why the existence of the kqueue should make recvfrom()
> > > slower on the pgbench side.  That's probably something to look into
> > > off-line with some FreeBSD guru help.  Degraded performance for
> > > clients on the same machine does seem to be a show stopper for this
> > > patch for now.  Thanks for testing!
> >
> > Glad to be helpful!
> >
> > I've tried running pgbench from a separate VM and in fact kqueue
> > consistently takes the lead with 5-10% more tps on select/prepared pgbench
> > on NetBSD too.
> >
> > What I have observed is that sys cpu usage is ~65% (35% idle) with kqueue,
> > while unpatched master averages at 55% (45% idle): relatively speaking
> > that's almost 25% less idle cpu available for a local pgbench to do its own
> > stuff.
>
> This suggest that either the the wakeup logic between kqueue and poll,
> or the internal locking could be at issue.  Is it possible that poll
> triggers a directed wakeup path, but kqueue doesn't?

I am following up with some kernel hackers.  In the meantime, here is
a rebase for the new split-line configure.in, to turn cfbot green.

-- 
Thomas Munro
http://www.enterprisedb.com

Вложения

0001-Add-kqueue-2-support-for-WaitEventSet-v12.patch

Re: [HACKERS] kqueue

От

Rui DeSousa

Дата:

19 декабря 2019 г., 23:41:39

> On Apr 10, 2018, at 9:05 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>
> On Wed, Dec 6, 2017 at 12:53 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Thu, Jun 22, 2017 at 7:19 PM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>> I don't plan to resubmit this patch myself, but I was doing some
>>> spring cleaning and rebasing today and I figured it might be worth
>>> quietly leaving a working patch here just in case anyone from the
>>> various BSD communities is interested in taking the idea further.
>
> I heard through the grapevine of some people currently investigating
> performance problems on busy FreeBSD systems, possibly related to the
> postmaster pipe.  I suspect this patch might be a part of the solution
> (other patches probably needed to get maximum value out of this patch:
> reuse WaitEventSet objects in some key places, and get rid of high
> frequency PostmasterIsAlive() read() calls).  The autoconf-fu in the
> last version bit-rotted so it seemed like a good time to post a
> rebased patch.
>
> --
> Thomas Munro
> http://www.enterprisedb.com
> <kqueue-v9.patch>

Hi,

I’m instrested in the kqueue patch and would like to know its current state and possible timeline for inclusion in the
basecode.  I have several large FreeBSD systems running PostgreSQL 11 that I believe currently displays this issue.
Thesystem has 88 vCPUs, 512GB Ram, and very active application with over 1000 connections to the database.  The system
exhibitshigh kernel CPU usage servicing poll() for connections that are idle.    

I’ve being testing pg_bouncer to reduce the number of connections and thus system CPU usage; however, not all
connectionscan go through pg_bouncer.  

Thanks,
Rui.

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

20 декабря 2019 г., 00:26:22

0001-Add-kqueue-2-support-for-WaitEventSet-v14.patch

Re: [HACKERS] kqueue

От

Matteo Beccati

Дата:

21 января 2020 г., 09:55:46

Hi,

On 21/01/2020 02:06, Thomas Munro wrote:
> [1] https://www.postgresql.org/message-id/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com

I had a NetBSD 8.0 VM lying around and I gave the patch a spin on latest
master.

With the kqueue patch, a pgbench -c basically hangs the whole postgres
instance. Not sure if it's a kernel issue, HyperVM issue o what, but
when it hangs, I can't even kill -9 the postgres processes or get the VM
to properly shutdown. The same doesn't happen, of course, with vanilla
postgres.

If the patch gets merged, I'd say it's safer not to enable it on NetBSD
and eventually leave it up to the pkgsrc team.


Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/

Re: [HACKERS] kqueue

От

Tom Lane

Дата:

22 января 2020 г., 16:06:59

Matteo Beccati <php@beccati.com> writes:
> On 21/01/2020 02:06, Thomas Munro wrote:
>> [1] https://www.postgresql.org/message-id/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com

> I had a NetBSD 8.0 VM lying around and I gave the patch a spin on latest
> master.
> With the kqueue patch, a pgbench -c basically hangs the whole postgres
> instance. Not sure if it's a kernel issue, HyperVM issue o what, but
> when it hangs, I can't even kill -9 the postgres processes or get the VM
> to properly shutdown. The same doesn't happen, of course, with vanilla
> postgres.

I'm a bit confused about what you are testing --- the kqueue patch
as per this thread, or that plus the WaitLatch refactorizations in
the other thread you point to above?

I've gotten through check-world successfully with the v14 kqueue patch
atop yesterday's HEAD on:

* macOS Catalina 10.15.2 (current release)
* FreeBSD/amd64 12.0-RELEASE-p12
* NetBSD/amd64 8.1
* NetBSD/arm 8.99.41
* OpenBSD/amd64 6.5

(These OSes are all on bare metal, no VMs involved)

This just says it doesn't lock up, of course.  I've not attempted
any performance-oriented tests.

            regards, tom lane

Re: [HACKERS] kqueue

От

Matteo Beccati

Дата:

22 января 2020 г., 16:11:44

On 22/01/2020 17:06, Tom Lane wrote:
> Matteo Beccati <php@beccati.com> writes:
>> On 21/01/2020 02:06, Thomas Munro wrote:
>>> [1]
https://www.postgresql.org/message-id/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com
> 
>> I had a NetBSD 8.0 VM lying around and I gave the patch a spin on latest
>> master.
>> With the kqueue patch, a pgbench -c basically hangs the whole postgres
>> instance. Not sure if it's a kernel issue, HyperVM issue o what, but
>> when it hangs, I can't even kill -9 the postgres processes or get the VM
>> to properly shutdown. The same doesn't happen, of course, with vanilla
>> postgres.
> 
> I'm a bit confused about what you are testing --- the kqueue patch
> as per this thread, or that plus the WaitLatch refactorizations in
> the other thread you point to above?

my bad, I tested the v14 patch attached to the email.

The quoted url was just above the patch name in the email client and
somehow my brain thought I was quoting the v14 patch name.


Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/

Re: [HACKERS] kqueue

От

Tom Lane

Дата:

22 января 2020 г., 17:19:04

Matteo Beccati <php@beccati.com> writes:
> On 22/01/2020 17:06, Tom Lane wrote:
>> Matteo Beccati <php@beccati.com> writes:
>>> I had a NetBSD 8.0 VM lying around and I gave the patch a spin on latest
>>> master.
>>> With the kqueue patch, a pgbench -c basically hangs the whole postgres
>>> instance. Not sure if it's a kernel issue, HyperVM issue o what, but
>>> when it hangs, I can't even kill -9 the postgres processes or get the VM
>>> to properly shutdown. The same doesn't happen, of course, with vanilla
>>> postgres.

>> I'm a bit confused about what you are testing --- the kqueue patch
>> as per this thread, or that plus the WaitLatch refactorizations in
>> the other thread you point to above?

> my bad, I tested the v14 patch attached to the email.

Thanks for clarifying.

FWIW, I can't replicate the problem here using NetBSD 8.1 amd64
on bare metal.  I tried various pgbench parameters up to "-c 20 -j 20"
(on a 4-cores-plus-hyperthreading CPU), and it seems fine.

One theory is that NetBSD fixed something since 8.0, but I trawled
their 8.1 release notes [1], and the only items mentioning kqueue
or kevent are for fixes in the pty and tun drivers, neither of which
seem relevant.  (But wait ... could your VM setup be dependent on
a tunnel network interface for outside-the-VM connectivity?  Still
hard to see the connection though.)

My guess is that what you're seeing is a VM bug.

            regards, tom lane

[1] https://cdn.netbsd.org/pub/NetBSD/NetBSD-8.1/CHANGES-8.1

Re: [HACKERS] kqueue

От

Tom Lane

Дата:

22 января 2020 г., 19:19:11

I wrote:
> This just says it doesn't lock up, of course.  I've not attempted
> any performance-oriented tests.

I've now done some light performance testing -- just stuff like
pgbench -S -M prepared -c 20 -j 20 -T 60 bench

I cannot see any improvement on either FreeBSD 12 or NetBSD 8.1,
either as to net TPS or as to CPU load.  If anything, the TPS
rate is a bit lower with the patch, though I'm not sure that
that effect is above the noise level.

It's certainly possible that to see any benefit you need stress
levels above what I can manage on the small box I've got these
OSes on.  Still, it'd be nice if a performance patch could show
some improved performance, before we take any portability risks
for it.

            regards, tom lane

Re: [HACKERS] kqueue

От

Rui DeSousa

Дата:

22 января 2020 г., 20:38:07

On Jan 22, 2020, at 2:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I cannot see any improvement on either FreeBSD 12 or NetBSD 8.1,
either as to net TPS or as to CPU load. If anything, the TPS
rate is a bit lower with the patch, though I'm not sure that
that effect is above the noise level.

It's certainly possible that to see any benefit you need stress
levels above what I can manage on the small box I've got these
OSes on. Still, it'd be nice if a performance patch could show
some improved performance, before we take any portability risks
for it.

Tom,

Here is two charts comparing a patched and unpatched system. These systems are very large and have just shy of thousand connections each with averages of 20 to 30 active queries concurrently running at times including hundreds if not thousand of queries hitting the database in rapid succession. The effect is the unpatched system generates a lot of system load just handling idle connections where as the patched version is not impacted by idle sessions or sessions that have already received data.

On Sat, Jan 25, 2020 at 11:29:11AM +1300, Thomas Munro wrote:
> On Thu, Jan 23, 2020 at 9:38 AM Rui DeSousa <rui@crazybean.net> wrote:
> > On Jan 22, 2020, at 2:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> It's certainly possible that to see any benefit you need stress
> >> levels above what I can manage on the small box I've got these
> >> OSes on.  Still, it'd be nice if a performance patch could show
> >> some improved performance, before we take any portability risks
> >> for it.
> 
> You might need more than one CPU socket, or at least lots more cores
> so that you can create enough contention.  That was needed to see the
> regression caused by commit ac1d794 on Linux[1].
> 
> > Here is two charts comparing a patched and unpatched system.
> > These systems are very large and have just shy of thousand
> > connections each with averages of 20 to 30 active queries concurrently
> > running at times including hundreds if not thousand of queries hitting
> > the database in rapid succession.  The effect is the unpatched system
> > generates a lot of system load just handling idle connections where as
> > the patched version is not impacted by idle sessions or sessions that
> > have already received data.
> 
> Thanks.  I can reproduce something like this on an Azure 72-vCPU
> system, using pgbench -S -c800 -j32.  The point of those settings is
> to have many backends, but they're all alternating between work and
> sleep.  That creates a stream of poll() syscalls, and system time goes
> through the roof (all CPUs pegged, but it's ~half system).  Profiling
> the kernel with dtrace, I see the most common stack (by a long way) is
> in a poll-related lock, similar to a profile Rui sent me off-list from
> his production system.  Patched, there is very little system time and
> the TPS number goes from 539k to 781k.
> 
> [1]
https://www.postgresql.org/message-id/flat/CAB-SwXZh44_2ybvS5Z67p_CDz%3DXFn4hNAD%3DCnMEF%2BQqkXwFrGg%40mail.gmail.com

Just to add some data...

I tried the kqueue v14 patch on a AWS EC2 m5a.24xlarge (96 vCPU) with
FreeBSD 12.1, driving from a m5.8xlarge (32 vCPU) CentOS 7 system.

I also use pgbench with a scale factor of 1000, with -S -c800 -j32.

Comparing pg 12.1 vs 13-devel (30012a04):

* TPS increased from ~93,000 to ~140,000, ~ 32% increase
* system time dropped from ~ 78% to ~ 70%, ~ 8% decrease
* user time increased from ~16% to ~ 23%, ~7% increase

I don't have any profile data, but I've attached a couple chart showing
the processor utilization over a 15 minute interval from the database
system.

Regards,
Mark
-- 
Mark Wong
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

Вложения

Re: [HACKERS] kqueue

От

Thomas Munro

Дата:

05 февраля 2020 г., 04:59:50

On Wed, Jan 29, 2020 at 11:54 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> If there are no further objections, I'm planning to commit this sooner
> rather than later, so that it gets plenty of air time on developer and
> build farm machines.  If problems are discovered on a particular
> platform, there's a pretty good escape hatch: you can define
> WAIT_USE_POLL, and if it turns out to be necessary, we could always do
> something in src/template similar to what we do for semaphores.

I updated the error messages to match the new "unified" style, adjust
a couple of comments, and pushed.  Thanks to all the people who
tested.  I'll keep an eye on the build farm.

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: kqueue

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения