Re: Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment

Поиск

Список

Период

Сортировка

От	domenico febbo
Тема	Re: Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment
Дата	14 ноября 2016 г. 20:36:57
Msg-id	CALqvQ1PJuYdDrduEb6PkC1kXBRUiHVrKH3zAnF5DxPAXP_ncmQ@mail.gmail.com обсуждение исходный текст
Ответ на	Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment (Pietro Pugni <pietro.pugni@gmail.com>)
Ответы	Re: Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment Re: Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment
Список	pgsql-performance

Дерево обсуждения

dear Pietro,
are you sure about

effective_io_concurrency = 30

could you please explain the type of disk storage?

Il 14/Nov/2016 12:46, "Pietro Pugni" <pietro.pugni@gmail.com> ha scritto:

Dear list,
I’m looking for some guidelines on how to optimize the configuration of a production database dedicated to a DWH application.
I run the application on different machines and have solved several issues since now but am struggling on a production environment running Red Hat 6.7 and PostgreSQL 9.5.3.
My application does a lot of reads and many writes (plain “SELECT … INTO” and “INSERT”, no “UPDATE”), but on a order of magnitude lower than the reads.
The work flow consists of two big blocks: an ETL phase and the workloads on the data imported during the ETL phase.

The biggest schema has about 1.2 billions of rows distributed over a ten of tables; many of those tables are partitioned and have indexes. At the moment the database stores two schemas but I plan to add other three schemas of similar size.

The machine is virtualized and has 8 CPUs at about 3GHz, 64GB of RAM and 5TB of storage. It runs on Red Hat 6.7, kernel 2.6.x

The configuration changes I made so far are:
max_connections = 30
shared_buffers = 32GB
work_mem = 256MB
maintenance_work_mem = 4GB
effective_io_concurrency = 30
checkpoint_completion_target = 0.9
random_page_cost = 2.0
effective_cache_size = 48GB
default_statistics_target = 1000

autovacuum is on and the collation is ‘C’.

The first issue I faced was about maintenance_work_mem because I set it to 16GB and the server silently crashed during a VACUUM because I didn’t consider that it could take up to autovacuum_max_workers * maintenance_work_mem (roughly 48GB). So I lowered maintenance_work_mem to 4GB and it did work. Should I set maintenance_work_mem to a smaller value (1GB) after the ETL terminates or can I leave it at 4GB without degrading the overall performance?

The second issue emerged during a intensive parallel query. I implemented a splitter that parallelize certain kind of queries. There were 8 similar queries running that was working on 8 overall disjoined subsets of the same table; this table has roughly 4.5 millions of rows. These queries uses SELECT DISTINCT, ORDER BY, OVER (PARTITION BY … ORDER BY) and COALESCE(). At a certain point the server crashed and I found the following error in the logs:

postgres server process was terminated by signal 9 killed

After some research, I found that probably it was the OOM killer. Running “dmesg” tells that effectively it was. Reading the documentation and this answer on SO ( http://stackoverflow.com/questions/16418173/psql-seems-to-timeout-with-long-queries ), I realized that probably the issue is due to a misconfiguration. The value I set for this pg instance don’t seem to be so wrong, except maybe from maintenance_work_mem. I will certainly disable OOM as suggested by the official docs ( https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT ) but was wondering if I could tune the configuration a little better. Can someone give me some more advices?

I run the same application with different data (and workload) on other machines, but they have different configurations (Ubuntu 16.0.4). On one of them I previously disabled the virtual memory overcommit and never experienced that issue, but the machine has 128GB of RAM.

I hope to have been clear enough.
Thank you everyone
Pietro

В списке pgsql-performance по дате отправления:

Предыдущее

От: Jeff Janes
Дата: 14 ноября 2016 г., 19:50:19
Сообщение: Re: Query planner chooses index scan backward instead of better index option

Следующее

От: Stephen Cresswell
Дата: 14 ноября 2016 г., 20:54:12
Сообщение: Why is the optimiser choosing a sub-optimal plan?

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment

Предыдущее

Следующее