Discussion: Big Memory Boxes and pgtune


Big Memory Boxes and pgtune

From:
"Warner, Gary, Jr"
Date:
I've recently been blessed to move one of my databases onto a huge IBM P8 computer.  It's a PowerPC architecture with
20 8-way cores (so postgres SHOULD believe there are 160 cores available) and 1 TB of RAM.

I've always done my postgres tuning with a copy of "pgtune" which says in the output:

# WARNING
# this tool not being optimal
# for very high memory systems

So . . . what would I want to do differently based on the fact that I have a "very high memory system"?

Suggestions welcome!

(There are several different databases, mostly related to our work in social media and malware analytics.  The
databases are smaller than available RAM.  Around 80 million social media profiles with 700M or so links, growing by
10% a week or so.  The malware database has extracted statistics and data about 17 million malware samples, growing by
about 5% per week.  The Social Media side has just shy of 100 'fetchers' that insert/update (but don't delete).  A few
human analysts interact with the data, hitting some pretty heavy queries as they do link analysis and natural language
processing, but mostly the database use is the "fetchers".)

------------------------------------------------------------------------------
Gary Warner, Director CETIFS      gar@uab.edu
Center for Emerging Technology Investigations Forensics & Security
The University of Alabama at Birmingham (UAB)
205.422.2113
------------------------------------------------------------------------------



Re: Big Memory Boxes and pgtune

From:
Kevin Grittner
Date:
On Fri, Oct 28, 2016 at 10:44 AM, Warner, Gary, Jr <gar@uab.edu> wrote:

> I've recently been blessed to move one of my databases onto a
> huge IBM P8 computer.  It's a PowerPC architecture with 20 8-way
> cores (so postgres SHOULD believe there are 160 cores available)
> and 1 TB of RAM.

> So . . . what would I want to do differently based on the fact
> that I have a "very high memory system"?

What OS are you looking at?

The first advice I would give is to use a very recent version of
both the OS and PostgreSQL.  Such large machines are a recent
enough phenomenon that older software is not likely to be optimized
to perform well on it.  For similar reasons, be sure to stay up to
date with minor releases of both.

If the OS has support for them, you probably want to become
familiar with these commands:

numactl --hardware
lscpu

You may want to benchmark different options, but I suspect that you
will see better performance by putting each database on a separate
cluster and using cpusets (or the equivalent) so that each cluster
uses a subset of the 160 cores and the RAM directly attached to the
subset.
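
As a purely illustrative sketch (the node numbers, data directories and ports below are invented; check what
numactl --hardware and lscpu actually report on your box before copying anything), starting one cluster per
NUMA node could look something like:

numactl --cpunodebind=0 --membind=0 pg_ctl -D /srv/pg/socialmedia -o "-p 5432" start
numactl --cpunodebind=1 --membind=1 pg_ctl -D /srv/pg/malware -o "-p 5433" start

Cpusets (or cgroup cpusets) give you the same effect with more control; numactl is just the quickest way to
sketch the idea.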

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Big Memory Boxes and pgtune

From:
"Joshua D. Drake"
Date:
On 10/28/2016 08:44 AM, Warner, Gary, Jr wrote:
> I've recently been blessed to move one of my databases onto a huge IBM P8 computer.  It's a PowerPC architecture
> with 20 8-way cores (so postgres SHOULD believe there are 160 cores available) and 1 TB of RAM.
>
> I've always done my postgres tuning with a copy of "pgtune" which says in the output:
>
> # WARNING
> # this tool not being optimal
> # for very high memory systems
>
> So . . . what would I want to do differently based on the fact that I have a "very high memory system"?

The most obvious is that you are going to want to have (depending on
PostgreSQL version):

* A very high shared_buffers (in newer releases it is not uncommon to
have many, many GB of it)
* Use that work_mem, baby. You have 1TB available? Take your average
data set return and make work_mem at least that.
* IIRC (and this may be old advice), maintenance_work_mem up to 4GB.
As I recall it won't effectively use more than that, but I could be wrong.

Lastly, but most importantly: test, test, test.
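
As a concrete thing to test against (a hedged sketch only -- the numbers below are placeholders, not
recommendations, and remember that work_mem is allocated per sort/hash step per backend, so it multiplies
across those ~100 fetcher connections):

# postgresql.conf -- placeholder values to benchmark, not advice
shared_buffers = 64GB            # "many, many GB" on a 1TB box
work_mem = 256MB                 # per sort/hash node per backend; watch the multiplier
maintenance_work_mem = 4GB       # reportedly little benefit beyond this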

JD

--
Command Prompt, Inc.                  http://the.postgres.company/
                         +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


Re: Big Memory Boxes and pgtune

From:
Jim Nasby
Date:
On 10/28/16 2:33 PM, Joshua D. Drake wrote:
> * A very high shared_buffers (in newer releases, it is not uncommon to
> have many, many GB of)

Keep in mind that you might get very poor results if shared_buffers is
large, but not large enough to fit the entire database. In that case
buffer replacement will be *extremely* expensive. Some operations will
use a different buffer replacement strategy, so you might be OK if some
of the database doesn't fit in shared buffers; that will depend a lot on
your access patterns.
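
One quick, hedged sanity check along those lines is to compare the total size of all databases in the
instance against shared_buffers (the pg_buffercache extension can give a finer-grained view of what is
actually resident, if you want to dig further):

SELECT pg_size_pretty(sum(pg_database_size(oid))::bigint) AS total_db_size,
       current_setting('shared_buffers') AS shared_buffers
FROM pg_database;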
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


Re: Big Memory Boxes and pgtune

From:
Scott Marlowe
Date:
On Wed, Nov 2, 2016 at 5:46 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 10/28/16 2:33 PM, Joshua D. Drake wrote:
>>
>> * A very high shared_buffers (in newer releases, it is not uncommon to
>> have many, many GB of)
>
>
> Keep in mind that you might get very poor results if shared_buffers is
> large, but not large enough to fit the entire database. In that case buffer
> replacement will be *extremely* expensive. Some operations will use a
> different buffer replacement strategy, so you might be OK if some of the
> database doesn't fit in shared buffers; that will depend a lot on your
> access patterns.


This. Especially on machines with fast CPUs / memory and SSDs
underneath, lots of buffers can sometimes just get in the way. The
Linux kernel (and most other kernels) has hundreds, even thousands of
man hours put into the file caching code and it's often faster to let
the kernel do that job with the extra memory.

Only a benchmark of a production-type load can tell you what to
expect, and only production itself will reveal the absolute truth.
Where I used to work we had 5TB databases on machines with anywhere
from 128GB to 512GB of RAM, and honestly the extra memory didn't make
a huge difference. They had 8-disk RAID-5 arrays with controller
caching turned off, because it got in the way, and cranking up
shared_buffers didn't make them faster. I think we settled on
something under 10GB on most of them.
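
If nothing else, a quick smoke test with pgbench is easy to run before replaying anything closer to
production (its built-in TPC-B-ish workload looks nothing like fetchers plus analysts, so treat the numbers
only as a rough comparison between different settings, not as truth; the database name and scale here are
arbitrary):

createdb bench
pgbench -i -s 1000 bench
pgbench -c 100 -j 16 -T 600 bench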