Re: Cache relation sizes?

From: Thomas Munro
Subject: Re: Cache relation sizes?
Msg-id: CAEepm=3f9Ho1jKohAUF=ueDqN5LUfdLv5k8FK9DNYaCP=si1Cg@mail.gmail.com
In response to: RE: Cache relation sizes?  ("Jamison, Kirk" <k.jamison@jp.fujitsu.com>)
Responses: RE: Cache relation sizes?  ("Jamison, Kirk" <k.jamison@jp.fujitsu.com>)
List: pgsql-hackers
On Thu, Dec 27, 2018 at 8:00 PM Jamison, Kirk <k.jamison@jp.fujitsu.com> wrote:
> I also find this proposed feature to be beneficial for performance, especially when we want to extend or truncate large tables.
> As mentioned by David, currently there is a query latency spike when we make generic plan for partitioned table with many partitions.
> I tried to apply Thomas' patch for that use case. Aside from measuring the planning and execution time,
> I also monitored the lseek calls using simple strace, with and without the patch.

Thanks for looking into this and testing!

> Setup 8192 table partitions.

> (1) set plan_cache_mode = 'force_generic_plan';
>     Planning Time: 1678.680 ms
>     Planning Time: 1596.566 ms

> (2) plan_cache_mode = 'auto'
>     Planning Time: 768.669 ms
>     Planning Time: 181.690 ms

> (3) set plan_cache_mode = 'force_generic_plan';
>     Planning Time: 14.294 ms
>     Planning Time: 13.976 ms

> If I did the test correctly, I am not sure though as to why the patch did not affect the generic planning performance of table with many partitions.
> However, the number of lseek calls was greatly reduced with Thomas’ patch.
> I also did not get considerable speed up in terms of latency average using pgbench -S (read-only, unprepared).
> I am assuming this might be applicable to other use cases as well.
> (I just tested the patch, but haven’t dug up the patch details yet).

The result for (2) is nice, even though you had to use 8192
partitions to see it.

> Would you like to submit this to the commitfest to get more reviews for possible idea/patch improvement?

For now I think this is still in the experiment/hack phase and I have a
ton of other stuff percolating in this commitfest already (and a week
of family holiday in the middle of January).  But if you have ideas
about the validity of the assumptions, the reason it breaks initdb, or
any other aspect of this approach (or alternatives), please don't let
me stop you, and of course please feel free to submit this, an
improved version or an alternative proposal yourself!  Unfortunately I
wouldn't have time to nurture it this time around, beyond some
drive-by comments.

Assorted armchair speculation:  I wonder how much this is affected by
the OS and KPTI, virtualisation technology, PCID support, etc.  Back
in the good old days, Linux's lseek(SEEK_END) stopped acquiring the
inode mutex when reading the size, at least in the generic
implementation used by most filesystems (I wonder if our workloads
were indirectly responsible for that optimisation?) so maybe it became
about as fast as a syscall could possibly be, but now the baseline for
how fast syscalls can be has moved and it also depends on your
hardware, and it also has external costs that depend on what memory
you touch in between syscalls.  Also, other operating systems might
still acquire a per-underlying-file/vnode/whatever lock (<checks
source code>... yes) and the contention for that might depend on what
else is happening, so that a single standalone test wouldn't capture
that but a super busy DB with a rapidly expanding and contracting
table that many other sessions are trying to observe with
lseek(SEEK_END) could slow down more.
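
To make the trade-off concrete, here is a minimal sketch of the general
idea of caching a file's size instead of asking the kernel every time.
It is not the patch from this thread: CachedSize, size_in_blocks() and
invalidate() are hypothetical names, BLOCK_SIZE is assumed to match the
default BLCKSZ of 8192, and a real implementation would need shared
invalidation across backends, which this single-process sketch ignores.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192			/* assumption: default PostgreSQL BLCKSZ */

/* Hypothetical per-file cache entry; not a PostgreSQL structure. */
typedef struct CachedSize
{
	int			fd;			/* open file descriptor */
	bool		valid;		/* is nblocks trustworthy? */
	off_t		nblocks;	/* cached size in blocks */
	long		nsyscalls;	/* successful lseek() calls so far */
} CachedSize;

/*
 * Return the file size in blocks, calling lseek(SEEK_END) only when the
 * cached value has been invalidated.  Returns -1 if lseek() fails.
 */
static off_t
size_in_blocks(CachedSize *c)
{
	if (!c->valid)
	{
		off_t		end = lseek(c->fd, 0, SEEK_END);

		if (end < 0)
			return -1;
		c->nsyscalls++;
		c->nblocks = end / BLOCK_SIZE;
		c->valid = true;
	}
	return c->nblocks;
}

/* Must be called by whoever extends or truncates the file. */
static void
invalidate(CachedSize *c)
{
	c->valid = false;
}
```

The hard part is visible even here: if the file is resized without a
call to invalidate(), size_in_blocks() keeps returning the stale value,
which is why the validity of the assumptions matters more than the
syscall savings themselves.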

--
Thomas Munro
http://www.enterprisedb.com

