Discussion: postgres crashes on insert in 40 different threads


postgres crashes on insert in 40 different threads

From: Dzmitry
Hey folks,
 I have a PostgreSQL server running on Ubuntu 12: Intel Xeon, 8 CPUs, 29 GB RAM,
with the following settings:
max_connections = 550
shared_buffers = 12GB
temp_buffers = 8MB
max_prepared_transactions = 0
work_mem = 50MB
maintenance_work_mem = 1GB
fsync = on
wal_buffers = 16MB
commit_delay = 50  
commit_siblings = 7
checkpoint_segments = 32
checkpoint_completion_target = 0.9
effective_cache_size = 22GB
autovacuum = on
autovacuum_vacuum_threshold = 1800
autovacuum_analyze_threshold = 900

I am doing a lot of writes to the DB in 40 different threads – every thread checks if a record exists; if not, it inserts the record, otherwise it updates it.
During this load my disk IO is almost always at 100%, and sometimes it crashes my DB with the following message:

2013-08-19 03:18:00 UTC LOG:  checkpointer process (PID 28354) was terminated by signal 9: Killed
2013-08-19 03:18:00 UTC LOG:  terminating any other active server processes
2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of another server process
2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2013-08-19 03:18:00 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of another server process
2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2013-08-19 03:18:00 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of another server process
2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

My DB size is not very big – 169GB. 

Does anyone know how I can get rid of these DB crashes?


Thanks,
  Dzmitry

Re: postgres crashes on insert in 40 different threads

From: Stéphane Schildknecht
On 19/08/2013 10:07, Dzmitry wrote:
> Hey folks,
>  I have a PostgreSQL server running on Ubuntu 12: Intel Xeon, 8 CPUs, 29 GB RAM,
> with the following settings:
> max_connections = 550
> shared_buffers = 12GB
> temp_buffers = 8MB
> max_prepared_transactions = 0
> work_mem = 50MB
> maintenance_work_mem = 1GB
> fsync = on
> wal_buffers = 16MB
> commit_delay = 50
> commit_siblings = 7
> checkpoint_segments = 32
> checkpoint_completion_target = 0.9
> effective_cache_size = 22GB
> autovacuum = on
> autovacuum_vacuum_threshold = 1800
> autovacuum_analyze_threshold = 900
>
> I am doing a lot of writes to the DB in 40 different threads – every thread
> checks if a record exists; if not, it inserts the record, otherwise it
> updates it. During this load my disk IO is almost always at 100%, and
> sometimes it crashes my DB with the following message:
>
> 2013-08-19 03:18:00 UTC LOG:  checkpointer process (PID 28354) was terminated
> by signal 9: Killed
> 2013-08-19 03:18:00 UTC LOG:  terminating any other active server processes
> 2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of
> another server process
> 2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server
> process to roll back the current transaction and exit, because another server
> process exited abnormally and possibly corrupted shared memory.
> 2013-08-19 03:18:00 UTC HINT:  In a moment you should be able to reconnect to
> the database and repeat your command.
> [...]
>
> My DB size is not very big – 169GB.
>
> Does anyone know how I can get rid of these DB crashes?
>
>
> Thanks,
>   Dzmitry
>

The fact that the checkpointer was killed -9 makes me think the OOM killer
decided you were out of memory.

Could that be the case?

12GB of shared_buffers on a 29GB box is too high. You should try lowering that
value to 6GB, for instance.
550 * 50MB of work_mem is 27GB of RAM that PostgreSQL could try to address.

I can imagine your system is swapping a lot, and exhausting swap before the
crash.
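
For reference, a rough worst-case sketch of those numbers (work_mem is per
sort or hash operation, so a single backend can even allocate it more than
once):

  shared_buffers                        12 GB
  550 connections x 50MB work_mem      ~27 GB
  maintenance_work_mem (per worker)      1 GB
                                       -----
  theoretical peak                     >40 GB on a 29 GB box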

Regards,

--
Stéphane Schildknecht
Loxodata - Conseil, expertise et formations



Re: postgres crashes on insert in 40 different threads

From: Dzmitry
I don't think that's the case. I am using New Relic to monitor my DB
servers (I have one master and 2 slaves - all with the same configuration) -
memory never goes above 12.5GB, so I have a good reserve, and I don't
see any swapping there :(

Thanks,
  Dzmitry





On 8/19/13 11:36 AM, "Stéphane Schildknecht"
<stephane.schildknecht@postgresql.fr> wrote:

>On 19/08/2013 10:07, Dzmitry wrote:
>> [...]
>
>The fact that the checkpointer was killed -9 makes me think the OOM killer
>decided you were out of memory.
>
>Could that be the case?
>
>12GB of shared_buffers on a 29GB box is too high. You should try lowering
>that value to 6GB, for instance.
>550 * 50MB of work_mem is 27GB of RAM that PostgreSQL could try to address.
>
>I can imagine your system is swapping a lot, and exhausting swap before the
>crash.
>
>Regards,
>
>--
>Stéphane Schildknecht
>Loxodata - Conseil, expertise et formations




Re: postgres crashes on insert in 40 different threads

From: Albe Laurenz
Dzmitry wrote:
> On 8/19/13 11:36 AM, "Stéphane Schildknecht" <stephane.schildknecht@postgresql.fr> wrote:
>> [...]
>> The fact that the checkpointer was killed -9 makes me think the OOM killer
>> decided you were out of memory.
>> [...]
>> I can imagine your system is swapping a lot, and exhausting swap before the crash.

> I don't think that's the case. I am using New Relic to monitor my DB
> servers (I have one master and 2 slaves - all with the same configuration) -
> memory never goes above 12.5GB, so I have a good reserve, and I don't
> see any swapping there :(

You can check by examining /var/log/messages to see whether the OOM
killer is at fault, which is highly likely.

The OOM killer uses heuristics, so it occasionally does the wrong thing.

The documentation is helpful:
http://www.postgresql.org/docs/9.2/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
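
For example (log locations vary by distribution, so check both - and see the
note about Ubuntu further down this thread):

grep -i 'out of memory' /var/log/messages /var/log/syslog
dmesg | grep -i 'killed process'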

Yours,
Laurenz Albe

Re: postgres crashes on insert in 40 different threads

From: Dzmitry
Do you mean the postgres log file (configured in postgresql.conf)?

log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_min_messages = warning

Or /var/log/messages? Because I don't have that file :(


Thanks,
  Dzmitry





On 8/19/13 12:26 PM, "Albe Laurenz" <laurenz.albe@wien.gv.at> wrote:

>Dzmitry wrote:
>> [...]
>> I don't think that's the case. I am using New Relic to monitor my DB
>> servers (I have one master and 2 slaves - all with the same
>> configuration) - memory never goes above 12.5GB, so I have a good
>> reserve, and I don't see any swapping there :(
>
>You can check by examining /var/log/messages to see whether the OOM
>killer is at fault, which is highly likely.
>
>The OOM killer uses heuristics, so it occasionally does the wrong thing.
>
>The documentation is helpful:
>http://www.postgresql.org/docs/9.2/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
>
>Yours,
>Laurenz Albe



Re: postgres crashes on insert in 40 different threads

From: Mael Rimbault
Hi!

Since Maverick, Ubuntu developers have disabled logging to /var/log/messages
by default. You should check /var/log/syslog instead.

--
Mael

2013/8/19 Dzmitry <dzmitry.nikitsin@gmail.com>:
> Do you mean the postgres log file (configured in postgresql.conf)?
>
> log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
> log_min_messages = warning
>
> Or /var/log/messages? Because I don't have that file :(
> [...]


Re: postgres crashes on insert in 40 different threads

From: Albe Laurenz
>> Dzmitry wrote:
>>> [...]
>>
>> You can check by examining /var/log/messages to see whether the OOM
>> killer is at fault, which is highly likely.
>> [...]

> Do you mean the postgres log file (configured in postgresql.conf)?
>
> log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
> log_min_messages = warning
>
> Or /var/log/messages? Because I don't have that file :(

I meant the latter.
/var/log/messages is just where syslog output is directed on some
Linux distributions.  I don't know Ubuntu, so sorry if I got
it wrong.  Maybe it is /var/log/syslog on Ubuntu.
In case of doubt, check your syslog configuration.

Yours,
Laurenz Albe

Re: postgres crashes on insert in 40 different threads

From: Dzmitry
Thank you guys! Found it in the logs (db-slave1 is my replica, which uses
streaming replication):

Aug 18 15:49:38 db-slave1 kernel: [25094456.525703] postgres invoked
oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Aug 18 15:49:38 db-slave1 kernel: [25094456.525708] postgres cpuset=/
mems_allowed=0
Aug 18 15:49:38 db-slave1 kernel: [25094456.525712] Pid: 26418, comm:
postgres Not tainted 3.2.0-40-virtual #64-Ubuntu


Aug 18 15:49:48 db-slave1 kernel: [25094456.621937] Out of memory: Kill
process 2414 (postgres) score 417 or sacrifice child
Aug 18 15:49:48 db-slave1 kernel: [25094456.621949] Killed process 2414
(postgres) total-vm:13057464kB, anon-rss:28560kB, file-rss:12773376kB



Aug 19 03:18:00 db-slave1 kernel: [25135758.540539] postgres invoked
oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
Aug 19 03:18:01 db-slave1 kernel: [25135758.540544] postgres cpuset=/
mems_allowed=0
Aug 19 03:18:07 db-slave1 kernel: [25135758.540548] Pid: 23994, comm:
postgres Not tainted 3.2.0-40-virtual #64-Ubuntu

Aug 19 03:18:07 db-slave1 kernel: [25135758.626405] Out of memory: Kill
process 28354 (postgres) score 348 or sacrifice child

Aug 19 03:18:07 db-slave1 kernel: [25135758.626418] Killed process 28354
(postgres) total-vm:13021248kB, anon-rss:8704kB, file-rss:10686512kB
Aug 19 03:18:07 db-slave1 kernel: [25135763.068736] postgres invoked
oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Aug 19 03:18:07 db-slave1 kernel: [25135763.068740] postgres cpuset=/
mems_allowed=0
Aug 19 03:18:07 db-slave1 kernel: [25135763.068743] Pid: 6780, comm:
postgres Not tainted 3.2.0-40-virtual #64-Ubuntu

Aug 19 03:18:07 db-slave1 kernel: [25135763.150285] Out of memory: Kill
process 20322 (postgres) score 301 or sacrifice child
Aug 19 03:18:07 db-slave1 kernel: [25135763.150297] Killed process 20322
(postgres) total-vm:13058892kB, anon-rss:47172kB, file-rss:9186604kB

So I will do as Stéphane advised and make shared_buffers 6GB. Do you know if
I need to do anything else - e.g. change the shared memory (SHMMAX/SHMALL)
parameters?


Right now I have:
shmmax - 13223870464
shmall - 4194304



Thanks,
  Dzmitry





On 8/19/13 1:05 PM, "Albe Laurenz" <laurenz.albe@wien.gv.at> wrote:

>[...]
>I meant the latter.
>/var/log/messages is just where syslog output is directed on some
>Linux distributions.  I don't know Ubuntu, so sorry if I got
>it wrong.  Maybe it is /var/log/syslog on Ubuntu.
>In case of doubt, check your syslog configuration.
>
>Yours,
>Laurenz Albe




Re: postgres crashes on insert in 40 different threads

From: Stéphane Schildknecht
On 19/08/2013 15:06, Dzmitry wrote:
> Thank you guys! Found it in the logs (db-slave1 is my replica, which uses
> streaming replication):
>
> Aug 19 03:18:07 db-slave1 kernel: [25135758.626405] Out of memory: Kill
> process 28354 (postgres) score 348 or sacrifice child
> [...]
>
> So I will do as Stéphane advised and make shared_buffers 6GB. Do you know
> if I need to do anything else - e.g. change the shared memory
> (SHMMAX/SHMALL) parameters?
> [...]

As Laurenz said, you should have a look at the documentation.

It explains how you can lower the risk of the OOM killer killing your
PostgreSQL processes (a sketch of the first two points follows this list):
1. You can set vm.overcommit_memory to 2 in sysctl.conf.
2. You can adjust the value of oom_score_adj in the startup script to prevent
the OOM killer from killing the postmaster.
3. You can lower shared_buffers, work_mem and max_connections.
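
A minimal sketch of points 1 and 2, following the documentation linked above
(the oom_score_adj path assumes a reasonably recent kernel; older kernels
only have oom_adj):

# /etc/sysctl.conf -- point 1: disable memory overcommit
vm.overcommit_memory = 2

# In the PostgreSQL init script, just before the postmaster is started --
# point 2: exempt the postmaster from the OOM killer
echo -1000 > /proc/self/oom_score_adj

As for your SHMMAX/SHMALL question: lowering shared_buffers only shrinks the
shared memory segment, so your current values should already be large enough.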

Regards,

--
Stéphane Schildknecht
Loxodata - Conseil, expertise et formations



Re: postgres crashes on insert in 40 different threads

From: Scott Ribe
On Aug 19, 2013, at 7:23 AM, Stéphane Schildknecht <stephane.schildknecht@postgresql.fr> wrote:

> As Laurenz said, you should have a look at the documentation.
>
> It explains how you can lower the risk of the OOM killer killing your PostgreSQL processes.
> 1. You can set vm.overcommit_memory to 2 in sysctl.conf.
> 2. You can adjust the value of oom_score_adj in the startup script to prevent the OOM killer from killing the postmaster.
> 3. You can lower shared_buffers, work_mem and max_connections.

4. Use pgbouncer, and radically lower the number of pg connections used.

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice






Re: postgres crashes on insert in 40 different threads

From: Frank Lanitz
On 19.08.2013 10:07, Dzmitry wrote:
> I am doing a lot of writes to the DB in 40 different threads – every
> thread checks if a record exists; if not, it inserts the record, otherwise
> it updates it.
> During this load my disk IO is almost always at 100%, and sometimes it
> crashes my DB with the following message:

Is it really necessary to have so many threads, and to take the chances for
race conditions? (A race-safe upsert is sketched below.)
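
For what it's worth, the check-then-insert pattern described above is racy
under concurrency; since PostgreSQL has no built-in upsert, the usual
race-safe shape (from the PL/pgSQL docs' merge_db example) is a retry loop
around the unique-violation error. A sketch, with illustrative table and
column names (it assumes a unique constraint on jobs.id):

CREATE OR REPLACE FUNCTION upsert_job(p_id bigint, p_payload text)
RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    LOOP
        -- try the UPDATE first
        UPDATE jobs SET payload = p_payload WHERE id = p_id;
        IF found THEN
            RETURN;
        END IF;
        -- no row yet: try to INSERT it
        BEGIN
            INSERT INTO jobs (id, payload) VALUES (p_id, p_payload);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- another thread inserted concurrently; loop and UPDATE again
        END;
    END LOOP;
END;
$$;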

Besides the RAM issue already mentioned, your box might have too little IO
capacity for this kind of transaction load. What are the load and, in detail,
the iowait saying? (See the commands below.)
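
For example (iostat ships in Ubuntu's sysstat package):

iostat -x 5    # per-device utilization (%util) and await times
vmstat 5       # the "wa" column is CPU time spent waiting on IO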

Cheers,
Frank


Re: postgres crashes on insert in 40 different threads

From: Dzmitry
I am already using pgpool - is it bad to have 550 connections, is that too many?
Thank you guys for all your help.

Thanks,
  Dzmitry





On 8/19/13 4:43 PM, "Scott Ribe" <scott_ribe@elevated-dev.com> wrote:

>On Aug 19, 2013, at 7:23 AM, Stéphane Schildknecht
><stephane.schildknecht@postgresql.fr> wrote:
>
>> [...]
>> 3. You can lower shared_buffers, work_mem and max_connections.
>
>4. Use pgbouncer, and radically lower the number of pg connections used.
>[...]




Re: postgres crashes on insert in 40 different threads

From: Dzmitry
Yes - I need that many threads; every night I load jobs from XML into the DB.
I need to do it as fast as I can; currently it takes 4h to load all of
them (around 1,000,000 jobs).
The CPU iowait percentage is about 25%. Do you know how I can check other
parameters related to IO wait?

Thanks,
  Dzmitry





On 8/19/13 5:59 PM, "Frank Lanitz" <frank@frank.uvena.de> wrote:

>On 19.08.2013 10:07, Dzmitry wrote:
>> [...]
>
>Is it really necessary to have so many threads, and to take the chances
>for race conditions?
>
>Besides the RAM issue already mentioned, your box might have too little IO
>capacity for this kind of transaction load. What are the load and, in
>detail, the iowait saying?
>
>Cheers,
>Frank




Re: postgres crashes on insert in 40 different threads

From: Scott Ribe
On Aug 19, 2013, at 9:07 AM, Dzmitry <dzmitry.nikitsin@gmail.com> wrote:

> Yes- I need so many threads...

This is not at all clear from your answer. At most, 1 thread per logical core
can execute at any one time. And while many can be waiting on IO instead of
executing, there is still a rather low limit to how many threads it takes to
saturate IO capacity. If you actually tried to *run* 550 processes at the
same time, you'd just be wasting time on contention.

So, back to the question I just asked a minute ago in the prior email: do you
have any idea how many pg connections are actually being used?
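
One quick way to check, for instance (pg_stat_activity has a state column as
of 9.2):

SELECT state, count(*) FROM pg_stat_activity GROUP BY state;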

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice






Re: postgres crashes on insert in 40 different threads

From: Dzmitry
Sorry for the confusion.
I have 2 machines; each machine runs 1 process with 20 threads - 40 threads
in total. Every thread connects to postgres and inserts data. I am using
technology that keeps connections open: every thread opens 1 connection when
it starts, and closes it only when killed. But the threads perform background
jobs - they run all the time, so the connections stay open permanently.

Thanks,
  Dzmitry





On 8/19/13 6:17 PM, "Scott Ribe" <scott_ribe@elevated-dev.com> wrote:

>On Aug 19, 2013, at 9:07 AM, Dzmitry <dzmitry.nikitsin@gmail.com> wrote:
>
>> Yes - I need that many threads...
>
>This is not at all clear from your answer. At most, 1 thread per logical
>core can execute at any one time. And while many can be waiting on IO
>instead of executing, there is still a rather low limit to how many
>threads it takes to saturate IO capacity. If you actually tried to *run*
>550 processes at the same time, you'd just be wasting time on contention.
>
>So, back to the question I just asked a minute ago in the prior email: do
>you have any idea how many pg connections are actually being used?
>[...]




Re: postgres crashes on insert in 40 different threads

From: Scott Ribe
On Aug 19, 2013, at 9:21 AM, Dzmitry <dzmitry.nikitsin@gmail.com> wrote:

> I have 2 machines; each machine runs 1 process with 20 threads - 40 threads
> in total. Every thread connects to postgres and inserts data. I am using
> technology that keeps connections open: every thread opens 1 connection when
> it starts, and closes it only when killed. But the threads perform background
> jobs - they run all the time, so the connections stay open permanently.

So, 40 threads, not 440?

Was 440 connections a typo, or are there 400 connections you aren't telling us about?

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice






Re: postgres crashes on insert in 40 different threads

From: Scott Ribe
On Aug 19, 2013, at 9:22 AM, Dzmitry <dzmitry.nikitsin@gmail.com> wrote:

> I am using pgpool to balance load to slave servers.

So, to be clear, you're not using pgbouncer after all?

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice






Re: postgres crashes on insert in 40 different threads

From: Dzmitry
No, I am not using pgbouncer, I am using pgpool.

In total I have 440 connections to postgres (I have a rails application
running on some servers - each application sets up 60 connections to the DB
and keeps them forever, until it is killed; I also have some machines that
do background processing, which keep connections too).

The part that does a lot of writes (updating jobs from the XML feed every
night) has 40 threads and keeps 40 connections.

Thanks,
  Dzmitry





On 8/19/13 6:26 PM, "Scott Ribe" <scott_ribe@elevated-dev.com> wrote:

>On Aug 19, 2013, at 9:22 AM, Dzmitry <dzmitry.nikitsin@gmail.com> wrote:
>
>> I am using pgpool to balance load to slave servers.
>
>So, to be clear, you're not using pgbouncer after all?
>[...]




Re: postgres crashes on insert in 40 different threads

From: Scott Ribe
On Aug 19, 2013, at 9:55 AM, Dzmitry <dzmitry.nikitsin@gmail.com> wrote:

> No, I am not using pgbouncer, I am using pgpool.
>
> In total I have 440 connections to postgres (I have a rails application
> running on some servers - each application sets up 60 connections to the DB
> and keeps them forever, until it is killed; I also have some machines that
> do background processing, which keep connections too).
>
> The part that does a lot of writes (updating jobs from the XML feed every
> night) has 40 threads and keeps 40 connections.

That's extreme, and probably counter-productive.

How many cores do you have on those rails servers? Probably not 64, right?
Not 32? 16? 12? 8, even? Assuming <64, what advantage do you expect from 60
connections? The same comment applies to the 40 connections doing the update
jobs - more connections than cores is unlikely to be helping anything, and
more connections than 2x cores is almost guaranteed to be worse than fewer.

Postgres connections are of the heavy-weight variety: process per connection,
not thread per connection, not thread-per-core event-driven. In particular,
I'd worry about work_mem in your configuration. You've either got to set it
really low and live with queries going to disk too quickly for sorts and so
on, or have it a decent size and run the risk that too many queries at once
will trigger OOM.

Given your configuration, I wouldn't even start with pgbouncer for connection
pooling. I'd first just slash the number of connections everywhere by 1/2, or
even 1/4, and see what effect that had. Then as a second step I'd look at
where connection pooling might be used effectively.

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice






Re: postgres crashes on insert in 40 different threads

From: Dzmitry
Ok, thank you, that's a good point. I need to review and fix what I have.

Thanks,
  Dzmitry





On 8/24/13 6:14 PM, "Scott Ribe" <scott_ribe@elevated-dev.com> wrote:

>On Aug 19, 2013, at 9:55 AM, Dzmitry <dzmitry.nikitsin@gmail.com> wrote:
>
>> [...]
>
>That's extreme, and probably counter-productive.
>[...]
>Given your configuration, I wouldn't even start with pgbouncer for
>connection pooling. I'd first just slash the number of connections
>everywhere by 1/2, or even 1/4, and see what effect that had. Then as a
>second step I'd look at where connection pooling might be used effectively.
>[...]




Re: postgres crashes on insert in 40 different threads

From: Scott Ribe
On Aug 25, 2013, at 2:04 AM, Dzmitry <dzmitry.nikitsin@gmail.com> wrote:

> Ok, thank you, that's a good point. I need to review and fix what I have.

Yes, as suggested I would try just reducing first.

Where connection pooling could help is if your load across all those rails
apps is not spread out evenly, but varies over time such that there is *no*
single small number of workers that is right for all conditions. Because the
same comments about no benefit from tons of threads apply to the pg server as
well: 110 connections simultaneously doing work is not likely to be better
than some smaller number...

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice