Re: perf tuning for 28 cores and 252GB RAM

From Fabio Ugo Venchiarutti
Subject Re: perf tuning for 28 cores and 252GB RAM
Date
Msg-id aec848d3-db99-9f53-bbfc-0d029d03207f@ocado.com
Whole thread Raw
In response to Re: perf tuning for 28 cores and 252GB RAM  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: perf tuning for 28 cores and 252GB RAM  (Andres Freund <andres@anarazel.de>)
List pgsql-general
On 18/06/2019 00:45, Jeff Janes wrote:
> On Mon, Jun 17, 2019 at 4:51 PM Michael Curry <curry@cs.umd.edu
> <mailto:curry@cs.umd.edu>> wrote:
>
>     I am using a Postgres instance in an HPC cluster, where they have
>     generously given me an entire node. This means I have 28 cores and
>     252GB RAM. I have to assume that the very conservative default
>     settings for things like buffers and max working memory are too
>     small here.
>
>     We have about 20 billion rows in a single large table.
>
>
> What is that in bytes?  Do you only have that one table?
>
>     The database is not intended to run an application but rather to
>     allow a few individuals to do data analysis, so we can guarantee the
>     number of concurrent queries will be small, and that nothing else
>     will need to use the server. Creating multiple different indices on
>     a few subsets of the columns will be needed to support the kinds of
>     queries we want.
>
>     What settings should be changed to maximize performance?
>
>
> With 28 cores for only a few users, parallelization will probably be
> important.  That feature is fairly new to PostgreSQL and rapidly
> improving from version to version, so you will want to use the latest
> version you can (v11).  And then increase the values for
> max_worker_processes, max_parallel_maintenance_workers,
> max_parallel_workers_per_gather, and max_parallel_workers.  With the
> potential for so many parallel workers running at once, you wouldn't
> want to go overboard on work_mem, maybe 2GB.  If you don't think all
> allowed users will be running large queries at the same time (because
> they are mostly thinking what query to run, or thinking about the
> results of the last one they ran, rather than actually running queries),
> then maybe higher than that.
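As a rough starting point, the parallelism settings named above could look like this in postgresql.conf; the specific values are illustrative for a 28-core box, not a tested recommendation:

```ini
# postgresql.conf -- illustrative values for 28 cores / 252GB RAM
max_worker_processes = 28              # upper bound on all background workers
max_parallel_workers = 24              # pool available to parallel queries
max_parallel_workers_per_gather = 8    # workers one Gather node may launch
max_parallel_maintenance_workers = 8   # e.g. parallel CREATE INDEX (v11+)
work_mem = '2GB'                       # per sort/hash node, per worker -- leave head-room
```

Note that work_mem applies per plan node per worker, so peak memory use can be a multiple of this value under concurrent parallel queries.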
>
> If your entire database can comfortably fit in RAM, I would make
> shared_buffers large enough to hold the entire database.  If not, I
> would set the value small (say, 8GB) and let the OS do the heavy lifting
> of deciding what to keep in cache.
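To decide between the two sizing strategies, it helps to measure the on-disk footprint first. A quick sketch (the table name is a placeholder):

```sql
-- Total size of the current database, human-readable
SELECT pg_size_pretty(pg_database_size(current_database()));

-- Size of the large table including its indexes and TOAST data
-- (replace big_table with the actual relation name)
SELECT pg_size_pretty(pg_total_relation_size('big_table'));
```

If the first number fits comfortably under 252GB, the large-shared_buffers route is on the table; otherwise the small-shared_buffers/OS-cache route applies.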


Does the backend mmap() data files when that's possible?


I've heard the "use the page cache" suggestion before, from users and
hackers alike, but I've never heard a solid argument dismissing the
potential overhead of the lseek() & read() syscalls that approach
requires, especially across many random page fetches.


Given that shmem-based shared_buffers are bound to be mapped into the
backend's address space anyway, why isn't that considered always
preferable/cheaper?



I'm aware that there are other benefits in counting on the page cache
(eg: staying hot in the face of a backend restart), however I'm
considering performance in steady state here.



TIA



> If you go with the first option, you
> probably want to use pg_prewarm after each restart to get the data into
> cache as fast as you can, rather than let it get loaded in naturally as
> you run queries.  Also, you would probably want to set random_page_cost
> and seq_page_cost quite low, like maybe 0.1 and 0.05.
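A minimal sketch of the pg_prewarm step mentioned above (relation names are placeholders):

```sql
-- One-time setup, as a superuser
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- After each restart: pull the table and its hot indexes into cache;
-- each call returns the number of blocks read
SELECT pg_prewarm('big_table');
SELECT pg_prewarm('big_table_some_idx');
```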
>
> You haven't described what kind of IO capacity and setup you have,
> knowing that could suggest other changes to make.  Also, seeing the
> results of `explain (analyze, buffers)`, especially with track_io_timing
> turned on, for some actual queries could provide good insight for what
> else might need changing.
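The diagnostic run suggested above might look like the following; the query and column names are invented for illustration:

```sql
-- Enable I/O timing for this session (no restart needed),
-- then inspect buffer hits/reads and I/O time for a real analysis query
SET track_io_timing = on;

EXPLAIN (ANALYZE, BUFFERS)
SELECT col_a, count(*)
FROM big_table
WHERE col_b BETWEEN 100 AND 200
GROUP BY col_a;
```

The `buffers` output distinguishes shared-buffer hits from reads, which speaks directly to the shared_buffers-vs-page-cache sizing question.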
>
> Cheers,
>
> Jeff





--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology


