On Tue, Jan 10, 2017 at 6:42 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Attached are the results of a performance test with scale factor = 500
> and the test script I used. I ran each test four times and plotted the
> average of the last three execution times in the sf_500.png file. When
> the table has indexes, vacuum execution time is smallest when the
> number of indexes and the parallel degree are the same.
It does seem from those results that parallel heap scans aren't paying
off, and in fact are hurting.
It could be that your I/O is at odds with the parallel degree settings,
rather than a problem with the approach itself (i.e. your I/O system
can't handle that many parallel scans), but in any case it does warrant
a few more tests.
I'd suggest you try to:

1. Disable parallel lazy vacuum, leaving parallel index scans enabled
2. Limit the parallel degree to the number of indexes, leaving parallel
   lazy vacuum enabled
3. Cap the lazy vacuum parallel degree by effective_io_concurrency, and
   the index scan parallel degree by the number of indexes

Then compare against your earlier test results.
I suspect 1 could be the winner, but 3 has a chance too (if
effective_io_concurrency is properly set up for your I/O system).
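
To spell out what the three variants would mean for worker counts, here
is a rough Python sketch. It is not code from the patch: plan_workers,
VacuumPlan and their parameters are made-up names for illustration, and
the floor of one process (so a cap of 0 just means "not parallel") is my
assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VacuumPlan:
    heap_workers: int   # processes scanning the heap (1 = no parallel heap scan)
    index_workers: int  # processes vacuuming indexes


def plan_workers(variant, requested_degree, n_indexes, effective_io_concurrency):
    """Hypothetical helper: worker counts for test variants 1 to 3."""
    # Assumption: at least one process always does the work.
    capped_by_indexes = max(1, min(requested_degree, n_indexes))
    if variant == 1:
        # Heap scan is serial; index scans keep the requested degree.
        return VacuumPlan(1, max(1, requested_degree))
    if variant == 2:
        # One overall degree, limited to the number of indexes.
        return VacuumPlan(capped_by_indexes, capped_by_indexes)
    if variant == 3:
        # Heap scan capped by what the I/O system is configured to handle,
        # index scans capped by the number of indexes.
        heap = max(1, min(requested_degree, effective_io_concurrency))
        return VacuumPlan(heap, capped_by_indexes)
    raise ValueError(f"unknown variant: {variant}")


if __name__ == "__main__":
    # Example inputs only: degree 8 requested, 2 indexes, e_i_c = 4.
    for variant in (1, 2, 3):
        print(variant, plan_workers(variant, 8, 2, 4))
```

The point of keeping the two counts separate is that variant 3 lets the
heap scan and the index scans be limited by different resources.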