Обсуждение: Re: [PERFORM] 9.4 -> 9.5 regression with queries through pgbouncer on RHEL 6
Hi Vladimir, Thanks for these reports. On 2016-06-13 00:42:19 +0300, Vladimir Borodin wrote: > perf report -g -i pg9?_all.data >/tmp/pg9?_perf_report.txt Any chance you could redo the reports with --no-children --call-graph=fractal added? The mode that includes child overheads unfortunately makes the output hard to interpet/compare. > The results from pg9?_perf_report.txt are attached. Note that in all cases some events were lost, i.e.: > > root@pgload05g ~ # perf report -g -i pg94_all.data >/tmp/pg94_perf_report.txt > Failed to open [vsyscall], continuing without symbols > Warning: > Processed 537137 events and lost 7846 chunks! You can reduce the overhead by reducing the sampling frequency, e.g. by specifying -F 300. Greetings, Andres Freund
Re: [PERFORM] 9.4 -> 9.5 regression with queries through pgbouncer on RHEL 6
От
Vladimir Borodin
Дата:
13 июня 2016 г., в 0:51, Andres Freund <andres@anarazel.de> написал(а):Hi Vladimir,
Thanks for these reports.
On 2016-06-13 00:42:19 +0300, Vladimir Borodin wrote:perf report -g -i pg9?_all.data >/tmp/pg9?_perf_report.txt
Any chance you could redo the reports with --no-children --call-graph=fractal
added? The mode that includes child overheads unfortunately makes the
output hard to interpet/compare.
Of course. Not sure if that is important but I upgraded perf for that (because --no-children option was introduced in ~3.16), so perf record and perf report were done with different perf versions.
Also I’ve done the same test on same host (RHEL 6) but with 4.6 kernel/perf and writing perf data to /dev/shm for not loosing events. Perf report output is also attached but important thing is that the regression is not so significant:
root@pgload05g ~ # uname -r
4.6.0-1.el6.elrepo.x86_64
root@pgload05g ~ # cat /proc/sys/kernel/sched_autogroup_enabled
1
root@pgload05g ~ # /tmp/run.sh
RHEL 6 9.4 71634 0.893
RHEL 6 9.5 54005 1.185
RHEL 6 9.6 65550 0.976
root@pgload05g ~ # echo 0 >/proc/sys/kernel/sched_autogroup_enabled
root@pgload05g ~ # /tmp/run.sh
RHEL 6 9.4 73041 0.876
RHEL 6 9.5 60105 1.065
RHEL 6 9.6 67984 0.941
root@pgload05g ~ #
The results from pg9?_perf_report.txt are attached. Note that in all cases some events were lost, i.e.:
root@pgload05g ~ # perf report -g -i pg94_all.data >/tmp/pg94_perf_report.txt
Failed to open [vsyscall], continuing without symbols
Warning:
Processed 537137 events and lost 7846 chunks!
You can reduce the overhead by reducing the sampling frequency, e.g. by
specifying -F 300.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
Re: [PERFORM] 9.4 -> 9.5 regression with queries through pgbouncer on RHEL 6
От
Vladimir Borodin
Дата:
13 июня 2016 г., в 21:58, Vladimir Borodin <root@simply.name> написал(а):<pg94_perf_report.txt.gz>
13 июня 2016 г., в 0:51, Andres Freund <andres@anarazel.de> написал(а):Hi Vladimir,
Thanks for these reports.
On 2016-06-13 00:42:19 +0300, Vladimir Borodin wrote:perf report -g -i pg9?_all.data >/tmp/pg9?_perf_report.txt
Any chance you could redo the reports with --no-children --call-graph=fractal
added? The mode that includes child overheads unfortunately makes the
output hard to interpet/compare.Of course. Not sure if that is important but I upgraded perf for that (because --no-children option was introduced in ~3.16), so perf record and perf report were done with different perf versions.<pg95_perf_report.txt.gz><pg96_perf_report.txt.gz><pg96_perf_report_4.6.txt.gz>Also I’ve done the same test on same host (RHEL 6) but with 4.6 kernel/perf and writing perf data to /dev/shm for not loosing events. Perf report output is also attached but important thing is that the regression is not so significant:root@pgload05g ~ # uname -r4.6.0-1.el6.elrepo.x86_64root@pgload05g ~ # cat /proc/sys/kernel/sched_autogroup_enabled1root@pgload05g ~ # /tmp/run.shRHEL 6 9.4 71634 0.893RHEL 6 9.5 54005 1.185RHEL 6 9.6 65550 0.976root@pgload05g ~ # echo 0 >/proc/sys/kernel/sched_autogroup_enabledroot@pgload05g ~ # /tmp/run.shRHEL 6 9.4 73041 0.876RHEL 6 9.5 60105 1.065RHEL 6 9.6 67984 0.941root@pgload05g ~ #<pg95_perf_report_4.6.txt.gz><pg94_perf_report_4.6.txt.gz>
Andres, is there any chance that you would find time to look at those results? Are they actually useful?
The results from pg9?_perf_report.txt are attached. Note that in all cases some events were lost, i.e.:
root@pgload05g ~ # perf report -g -i pg94_all.data >/tmp/pg94_perf_report.txt
Failed to open [vsyscall], continuing without symbols
Warning:
Processed 537137 events and lost 7846 chunks!
You can reduce the overhead by reducing the sampling frequency, e.g. by
specifying -F 300.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2016-06-13 21:58:30 +0300, Vladimir Borodin wrote: > > > 13 июня 2016 г., в 0:51, Andres Freund <andres@anarazel.de> написал(а): > > > > Hi Vladimir, > > > > Thanks for these reports. > > > > On 2016-06-13 00:42:19 +0300, Vladimir Borodin wrote: > >> perf report -g -i pg9?_all.data >/tmp/pg9?_perf_report.txt > > > > Any chance you could redo the reports with --no-children --call-graph=fractal > > added? The mode that includes child overheads unfortunately makes the > > output hard to interpet/compare. > > Of course. Not sure if that is important but I upgraded perf for that (because --no-children option was introduced in ~3.16),so perf record and perf report were done with different perf versions. > > > > Also I’ve done the same test on same host (RHEL 6) but with 4.6 kernel/perf and writing perf data to /dev/shm for not loosingevents. Perf report output is also attached but important thing is that the regression is not so significant: > > root@pgload05g ~ # uname -r > 4.6.0-1.el6.elrepo.x86_64 > root@pgload05g ~ # cat /proc/sys/kernel/sched_autogroup_enabled > 1 > root@pgload05g ~ # /tmp/run.sh > RHEL 6 9.4 71634 0.893 > RHEL 6 9.5 54005 1.185 > RHEL 6 9.6 65550 0.976 > root@pgload05g ~ # echo 0 >/proc/sys/kernel/sched_autogroup_enabled > root@pgload05g ~ # /tmp/run.sh > RHEL 6 9.4 73041 0.876 > RHEL 6 9.5 60105 1.065 > RHEL 6 9.6 67984 0.941 > root@pgload05g ~ # Hm. Have you measured how large the slowdown is if you connect via tcp to pgbouncer, but have pgbouncer connect to postgres via unix sockets? Andres
On 2016-07-04 16:30:51 +0300, Vladimir Borodin wrote: > > > 13 июня 2016 г., в 21:58, Vladimir Borodin <root@simply.name> написал(а): > > > >> > >> 13 июня 2016 г., в 0:51, Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> написал(а): > >> > >> Hi Vladimir, > >> > >> Thanks for these reports. > >> > >> On 2016-06-13 00:42:19 +0300, Vladimir Borodin wrote: > >>> perf report -g -i pg9?_all.data >/tmp/pg9?_perf_report.txt > >> > >> Any chance you could redo the reports with --no-children --call-graph=fractal > >> added? The mode that includes child overheads unfortunately makes the > >> output hard to interpet/compare. > > > > Of course. Not sure if that is important but I upgraded perf for that (because --no-children option was introduced in~3.16), so perf record and perf report were done with different perf versions. > > > > <pg94_perf_report.txt.gz> > > <pg95_perf_report.txt.gz> > > <pg96_perf_report.txt.gz> > > > > Also I’ve done the same test on same host (RHEL 6) but with 4.6 > > kernel/perf and writing perf data to /dev/shm for not loosing > > events. Perf report output is also attached but important thing is > > that the regression is not so significant: FWIW, you can instead use -F 300 or something to reduce the sampling frequency. > > root@pgload05g ~ # uname -r > > 4.6.0-1.el6.elrepo.x86_64 > > root@pgload05g ~ # cat /proc/sys/kernel/sched_autogroup_enabled > > 1 > > root@pgload05g ~ # /tmp/run.sh > > RHEL 6 9.4 71634 0.893 > > RHEL 6 9.5 54005 1.185 > > RHEL 6 9.6 65550 0.976 > > root@pgload05g ~ # echo 0 >/proc/sys/kernel/sched_autogroup_enabled > > root@pgload05g ~ # /tmp/run.sh > > RHEL 6 9.4 73041 0.876 > > RHEL 6 9.5 60105 1.065 > > RHEL 6 9.6 67984 0.941 > > root@pgload05g ~ # > > > > <pg96_perf_report_4.6.txt.gz> > > <pg95_perf_report_4.6.txt.gz> > > <pg94_perf_report_4.6.txt.gz> > > Andres, is there any chance that you would find time to look at those results? Are they actually useful? I don't really see anything suspicious in the profile. This looks more like a kernel scheduler issue than a postgres bottleneck one. It seems that somehow using nonblocking IO (started in 9.5) causes scheduling issues when pgbouncer is also local. Could you do perf stat -ddd -a sleep 10 or something during both runs? I suspect that the context switch ratios will be quite different. Andres