I've also run the same test on the same host (RHEL 6), but with a 4.6 kernel and matching perf, writing the perf data to /dev/shm to avoid losing events. The perf report output is attached as well, but the important thing is that the regression is less significant here:
root@pgload05g ~ # uname -r
4.6.0-1.el6.elrepo.x86_64
root@pgload05g ~ # cat /proc/sys/kernel/sched_autogroup_enabled
1
root@pgload05g ~ # /tmp/run.sh
RHEL 6 9.4 71634 0.893
RHEL 6 9.5 54005 1.185
RHEL 6 9.6 65550 0.976
root@pgload05g ~ # echo 0 >/proc/sys/kernel/sched_autogroup_enabled
root@pgload05g ~ # /tmp/run.sh
RHEL 6 9.4 73041 0.876
RHEL 6 9.5 60105 1.065
RHEL 6 9.6 67984 0.941
root@pgload05g ~ #
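
To put a number on "not so significant", here is a minimal sketch that computes the change of each version relative to 9.4 from the figures above. It assumes the third column of the run.sh output is throughput (higher is better); the script itself is not shown, so that reading is an assumption, and the names `results` and `relative_change` are made up for this illustration.

```python
# Third-column values copied from the run.sh output above.
# Assumption: this column is throughput (e.g. tps), so higher is better.
results = {
    "autogroup=1": {"9.4": 71634, "9.5": 54005, "9.6": 65550},
    "autogroup=0": {"9.4": 73041, "9.5": 60105, "9.6": 67984},
}


def relative_change(baseline, value):
    """Return the change of value versus baseline, in percent."""
    return (value - baseline) / baseline * 100.0


for setting, by_version in results.items():
    baseline = by_version["9.4"]
    for version in ("9.5", "9.6"):
        change = relative_change(baseline, by_version[version])
        print(f"{setting}  {version} vs 9.4: {change:+.1f}%")
```

Under that assumption, 9.5 comes out roughly 25% below 9.4 with sched_autogroup_enabled=1 and roughly 18% below with it set to 0, while 9.6 is about 7 to 8% below 9.4 in both cases.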