Re: Parallel Seq Scan vs kernel read ahead

From	David Rowley
Subject	Re: Parallel Seq Scan vs kernel read ahead
Date	11 June 2020, 03:05:25
Msg-id	CAApHDvqFf5P5qXx61-FhLyPCD5yFq1GB+-MSCxPWAg_ceiuCmg@mail.gmail.com
In reply to	Re: Parallel Seq Scan vs kernel read ahead (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: Parallel Seq Scan vs kernel read ahead
List	pgsql-hackers

On Thu, 11 Jun 2020 at 14:09, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jun 11, 2020 at 7:18 AM David Rowley <dgrowleyml@gmail.com> wrote:
> >
> > On Thu, 11 Jun 2020 at 01:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > Can we try the same test with 4, 8, 16 workers as well?  I don't
> > > foresee any problem with a higher number of workers but it might be
> > > better to once check that if it is not too much additional work.
> >
> > I ran the tests again with up to 7 workers. The CPU here only has 8
> > cores (a laptop), so I'm not sure if there's much sense in going
> > higher than that?
> >
>
> I think it proves your point that there is a value in going for step
> size greater than 64.  However, I think the difference at higher sizes
> is not significant.  For example, the difference between 8192 and
> 16384 doesn't seem much if we leave higher worker count where the data
> could be a bit misleading due to variation.  I could see that there is
> a clear and significant difference till 1024 but after that difference
> is not much.

I guess the danger with going too big is that some Seqscan filter
leaves some workers with very little to do, because they discard
nearly all of their rows, while other workers are left with rows that
pass the filter and require some expensive processing.  Keeping the
number of blocks on the smaller side would reduce the chances of
someone being hit by that.  The algorithm I proposed above can still
be capped by doing something like nblocks = Min(1024,
pg_nextpower2_32(pbscan->phs_nblocks / 1024));  That way we'll end up
with:


 rel_size | stepsize
----------+----------
 16 kB    |        1
 32 kB    |        1
 64 kB    |        1
 128 kB   |        1
 256 kB   |        1
 512 kB   |        1
 1024 kB  |        1
 2048 kB  |        1
 4096 kB  |        1
 8192 kB  |        1
 16 MB    |        2
 32 MB    |        4
 64 MB    |        8
 128 MB   |       16
 256 MB   |       32
 512 MB   |       64
 1024 MB  |      128
 2048 MB  |      256
 4096 MB  |      512
 8192 MB  |     1024
 16 GB    |     1024
 32 GB    |     1024
 64 GB    |     1024
 128 GB   |     1024
 256 GB   |     1024
 512 GB   |     1024
 1024 GB  |     1024
 2048 GB  |     1024
 4096 GB  |     1024
 8192 GB  |     1024
 16 TB    |     1024
 32 TB    |     1024
(32 rows)
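
As a rough standalone sketch of how the capped expression produces
the step sizes in the table (this is not the patch itself):
next_power_of_2_32() is a hypothetical stand-in for
pg_nextpower2_32(), and parallel_scan_step_size() and the two
constants are names made up for illustration.  8 kB blocks are
assumed.  The sketch also clamps the quotient to at least 1, which is
an assumption added here so that relations under 1024 blocks get a
step size of 1 as in the table, since pg_nextpower2_32() does not
accept 0.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, not taken from the PostgreSQL source. */
#define TARGET_NCHUNKS 1024     /* aim for roughly this many chunks per scan */
#define MAX_STEP_SIZE  1024     /* cap on blocks handed out per request */

/*
 * Hypothetical stand-in for pg_nextpower2_32(): returns num if it is
 * already a power of 2, otherwise the next higher power of 2.
 * num must be greater than 0 and no larger than 2^31.
 */
uint32_t
next_power_of_2_32(uint32_t num)
{
    uint32_t result = 1;

    assert(num > 0 && num <= UINT32_MAX / 2 + 1);

    while (result < num)
        result <<= 1;

    return result;
}

/*
 * Step size in blocks for a relation of rel_nblocks blocks, following
 * Min(1024, pg_nextpower2_32(nblocks / 1024)).
 */
uint32_t
parallel_scan_step_size(uint32_t rel_nblocks)
{
    uint32_t per_chunk = rel_nblocks / TARGET_NCHUNKS;
    uint32_t step;

    /* Assumption: small relations fall back to one block per request. */
    if (per_chunk == 0)
        per_chunk = 1;

    step = next_power_of_2_32(per_chunk);

    return (step < MAX_STEP_SIZE) ? step : MAX_STEP_SIZE;
}
```

With 8 kB blocks, parallel_scan_step_size(2048) (a 16 MB relation)
gives 2, and anything from 8 GB (1048576 blocks) upwards gives 1024,
so under this cap a worker is never handed more than 8 MB of the
relation in one request.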

David
