Discussion: CPU spikes and transactions

CPU spikes and transactions

From:
Tony Kay
Date:
Hi,

I'm running 9.1.6 w/22GB shared buffers, and 32GB overall RAM on a 16-core Opteron 6276 box. We limit connections to roughly 120, but our webapp is configured to allocate a thread-local connection, so those connections are idle more than half the time.

We have been running smoothly for over a year on this configuration, and recently started having huge CPU spikes that bring the system to its knees. Given that it is a multiuser system, it has been quite hard to pinpoint the exact cause, but I think we've narrowed it down to two data import jobs that were running in semi-long transactions (clusters of row inserts).

The tables affected by these inserts are used in common queries.

The imports bring in perhaps 10k rows on average, spread across 4 tables.

The insert transactions are at isolation level read committed (the default for the JDBC driver).

When the import would run (again, theory...we have not been able to reproduce), we would end up maxed out on CPU, with a load average of 50 for 16 CPUs (our normal busy usage is a load average of 5 out of 16 CPUs).

When looking at the active queries, most of them are against the tables that are affected by these imports.
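
(For reference, the "active queries" check is essentially a query against 9.1's pg_stat_activity along these lines; this is a sketch, not the exact query we run:)

  select now() - query_start as time_active, datname, procpid, current_query
    from pg_stat_activity
   where current_query <> '<IDLE>'
   order by query_start;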

Our workaround (that is holding at present) was to drop the transactions on those imports (which is not optimal, but fortunately is acceptable for this particular data). This workaround has prevented any further incidents, but is of course inconclusive.

Does this sound familiar to anyone, and if so, please advise.

Thanks in advance,

Tony Kay

Re: CPU spikes and transactions

From:
Tony Kay
Date:
Hi Calvin,

Yes, I have sar data on all systems going back for years. 

Since others are probably going to want to be assured I am really "reading the data" right:

- This is 92% user CPU time, 5% sys, and 1% soft
- On some of the problems, I _do_ see a short spike of pgswpout's (memory pressure), but again, not enough to end up using much system time
- The database disks are idle (all data being used is in RAM)..and are SSDs....average service times are barely measurable in ms.
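
(Those figures come out of the standard sysstat reports; roughly the following, assuming the usual /var/log/sa daily files on CentOS 6:)

  sar -u ALL -f /var/log/sa/sa14    # %usr / %sys / %soft CPU breakdown
  sar -W -f /var/log/sa/sa14        # pswpin/s and pswpout/s (swap activity)
  sar -d -f /var/log/sa/sa14        # per-device await/svctm for the SSDs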

If I had to guess, I'd say it was spinlock misbehavior....I cannot understand why else a transaction blocking other things would drive the CPUs so hard into the ground with user time.

Tony



On Mon, Oct 14, 2013 at 4:05 PM, Calvin Dodge <caldodge@gmail.com> wrote:
Have you tried running "vmstat 1" during these times? If so, what is
the percentage of WAIT time?  Given that IIRC shared buffers should be
no more than 25% of installed memory, I wonder if too little is
available for system caching of disk reads.  A high WAIT percentage
would indicate excessive I/O (especially random seeks).
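
(For example, something like the following; the "wa" column in the cpu group is the I/O wait percentage:)

  vmstat 1 5
  # procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
  #  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st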

Calvin Dodge

On Mon, Oct 14, 2013 at 6:00 PM, Tony Kay <tony@teamunify.com> wrote:
> Hi,
>
> I'm running 9.1.6 w/22GB shared buffers, and 32GB overall RAM on a 16
> Opteron 6276 CPU box. We limit connections to roughly 120, but our webapp is
> configured to allocate a thread-local connection, so those connections are
> rarely doing anything more than half the time.
>
> We have been running smoothly for over a year on this configuration, and
> recently started having huge CPU spikes that bring the system to its knees.
> Given that it is a multiuser system, it has been quite hard to pinpoint the
> exact cause, but I think we've narrowed it down to two data import jobs that
> were running in semi-long transactions (clusters of row inserts).
>
> The tables affected by these inserts are used in common queries.
>
> The imports will bring in a row count of perhaps 10k on average covering 4
> tables.
>
> The insert transactions are at isolation level read committed (the default
> for the JDBC driver).
>
> When the import would run (again, theory...we have not been able to
> reproduce), we would end up maxed out on CPU, with a load average of 50 for
> 16 CPUs (our normal busy usage is a load average of 5 out of 16 CPUs).
>
> When looking at the active queries, most of them are against the tables that
> are affected by these imports.
>
> Our workaround (that is holding at present) was to drop the transactions on
> those imports (which is not optimal, but fortunately is acceptable for this
> particular data). This workaround has prevented any further incidents, but
> is of course inconclusive.
>
> Does this sound familiar to anyone, and if so, please advise.
>
> Thanks in advance,
>
> Tony Kay
>

Re: CPU spikes and transactions

From:
Tomas Vondra
Date:
On 15.10.2013 01:00, Tony Kay wrote:
> Hi,
>
> I'm running 9.1.6 w/22GB shared buffers, and 32GB overall RAM on a
> 16 Opteron 6276 CPU box. We limit connections to roughly 120, but
> our webapp is configured to allocate a thread-local connection, so
> those connections are rarely doing anything more than half the time.

Lower your shared buffers to about 20% of your RAM, unless you've tested
it's actually helping in your particular case. It's unlikely you'll get
better performance by using more than that, especially on older
versions, so it's wiser to leave the rest for page cache.

It might even be one of the causes of the performance issue you're
seeing, as shared buffers are not exactly overhead-free.

See this for more details on tuning:

   http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
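
(On a 32GB box that advice would translate into something like the following postgresql.conf settings; illustrative values only, not a prescription:)

   shared_buffers = 6GB           # roughly 20% of RAM, down from 22GB
   effective_cache_size = 24GB    # rough estimate of what the OS page cache can hold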

You're on a rather old 9.1.x version, BTW. The last version in this
branch is 9.1.10 and there are some important security fixes (e.g. in
9.1.9). Not sure if there are any fixes relevant to the performance
issue, though.

A few initial questions:

* What OS are we dealing with?

* So how many active connections are there on average (see
  pg_stat_activity for connections running queries)?

* How much data are we talking about? In total and in the imports?

> We have been running smoothly for over a year on this configuration,
> and recently started having huge CPU spikes that bring the system to
> its knees. Given that it is a multiuser system, it has been quite
> hard to pinpoint the exact cause, but I think we've narrowed it down
> to two data import jobs that were running in semi-long transactions
> (clusters of row inserts).
>
> The tables affected by these inserts are used in common queries.
>
> The imports will bring in a row count of perhaps 10k on average
> covering 4 tables.
>
> The insert transactions are at isolation level read committed (the
> default for the JDBC driver).
>
> When the import would run (again, theory...we have not been able to
> reproduce), we would end up maxed out on CPU, with a load average of
> 50 for 16 CPUs (our normal busy usage is a load average of 5 out of
> 16 CPUs).
>
> When looking at the active queries, most of them are against the
> tables that are affected by these imports.

Which processes consume most CPU time? Are those backends executing the
queries, or some background processes (checkpointer, autovacuum, ...)?

Can you post a "top -c" output collected at the time of the CPU peak?

Also, try to collect a few snapshots of pg_stat_bgwriter catalog before
and during the loads. Don't forget to include the timestamp:

   select now(), * from pg_stat_bgwriter;

and while you're at it, pg_stat_database snapshots might be handy too
(not sure if you're running a single database or multiple ones), so use
either

  select now(), * from pg_stat_database;

or

  select now(), * from pg_stat_database where datname = '..dbname..';

That should give us at least some insight into what's happening.
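
(A trivial way to collect those snapshots, say once a minute around the import, is a loop like this; the database name is a placeholder:)

   DB=dbname        # placeholder, substitute your database name
   while true; do
     psql -d "$DB" -c "select now(), * from pg_stat_bgwriter;" >> bgwriter.log
     psql -d "$DB" -c "select now(), * from pg_stat_database;" >> pg_stat_database.log
     sleep 60
   done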

> Our workaround (that is holding at present) was to drop the
> transactions on those imports (which is not optimal, but fortunately
> is acceptable for this particular data). This workaround has
> prevented any further incidents, but is of course inconclusive.
>
> Does this sound familiar to anyone, and if so, please advise.

I'm wondering how this could be related to the transactions, and IIRC
the stats (e.g. # of inserted rows) are sent at commit time. That might
trigger the autovacuum. But without the transactions the autovacuum
would be triggered sooner ...

regards
Tomas


Re: CPU spikes and transactions

From:
Tomas Vondra
Date:
On 15.10.2013 01:26, Tony Kay wrote:
> Hi Calvin,
>
> Yes, I have sar data on all systems going back for years.
>
> Since others are going to probably want to be assured I am really
> "reading the data" right:
>
> - This is 92% user CPU time, 5% sys, and 1% soft
> - On some of the problems, I _do_ see a short spike of pgswpout's
> (memory pressure), but again, not enough to end up using much system time
> - The database disks are idle (all data being used is in RAM)..and are
> SSDs....average service times are barely measurable in ms.

OK. Can you share the data? Maybe we'll notice something suspicious.

> If I had to guess, I'd say it was spinlock misbehavior....I cannot
> understand why else a transaction blocking other things would drive
> the CPUs so hard into the ground with user time.

Have you tried running perf, to verify the time is actually spent on
spinlocks?

Tomas


Re: CPU spikes and transactions

From:
Merlin Moncure
Date:
On Mon, Oct 14, 2013 at 6:45 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 15.10.2013 01:26, Tony Kay wrote:
>> Hi Calvin,
>>
>> Yes, I have sar data on all systems going back for years.
>>
>> Since others are going to probably want to be assured I am really
>> "reading the data" right:
>>
>> - This is 92% user CPU time, 5% sys, and 1% soft
>> - On some of the problems, I _do_ see a short spike of pgswpout's
>> (memory pressure), but again, not enough to end up using much system time
>> - The database disks are idle (all data being used is in RAM)..and are
>> SSDs....average service times are barely measurable in ms.
>
> OK. Can you share the data? Maybe we'll notice something suspicious.
>
>> If I had to guess, I'd say it was spinlock misbehavior....I cannot
>> understand why else a transaction blocking other things would drive
>> the CPUs so hard into the ground with user time.
>
> Have you tried running perf, to verify the time is actually spent on
> spinlocks?

+1 this.  It is almost certainly spinlocks, but we need to know which
one and why.  plz install debug symbols and run a perf during normal
and high load conditions.
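
(a minimal perf session for that, assuming perf and the postgres debug symbols are installed, looks something like:)

   perf record -a -g -- sleep 60    # sample all CPUs for a minute during the spike
   perf report                      # lots of time in s_lock would support the spinlock theory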

merlin


Re: CPU spikes and transactions

From:
Tony Kay
Date:

On Mon, Oct 14, 2013 at 4:42 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
On 15.10.2013 01:00, Tony Kay wrote:
> Hi,
>
> I'm running 9.1.6 w/22GB shared buffers, and 32GB overall RAM on a
> 16 Opteron 6276 CPU box. We limit connections to roughly 120, but
> our webapp is configured to allocate a thread-local connection, so
> those connections are rarely doing anything more than half the time.

Lower your shared buffers to about 20% of your RAM, unless you've tested
it's actually helping in your particular case. It's unlikely you'll get
better performance by using more than that, especially on older
versions, so it's wiser to leave the rest for page cache.

It might even be one of the causes of the performance issue you're
seeing, as shared buffers are not exactly overhead-free.

See this for more details on tuning:

   http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

I had followed the general directions from several sources years ago, which indicate up to 40% of RAM. We've been running very large shared buffers for 4 years now, but it is difficult to generate a good real load without testing against users, so we have not felt the need to move it around. In general, I don't tend to tinker with a setting that has been fine for this long without good reason. I've been wanting to upgrade to the newer mmap-based versions of pgsql, but was waiting to re-tune this when I did so.

Why do you suspect that shared_buffers would cause the behavior I'm seeing?
 


You're on a rather old 9.1.x version, BTW. The last version in this
branch is 9.1.10 and there are some important security fixes (e.g. in
9.1.9). Not sure if there are any fixes relevant to the performance
issue, though.

An upgrade to 9.1.10 is planned.
 
A few initial questions:

* What OS are we dealing with?

CentOS el6
 

* So how many active connections are there on average (see
  pg_stat_activity for connections running queries)?

about 40-60
 

* How much data are we talking about? In total and in the imports?

80GB database. The imports are maybe 1-3 MB...often much smaller. 10k rows would probably be average.
 

> We have been running smoothly for over a year on this configuration,
> and recently started having huge CPU spikes that bring the system to
> its knees. Given that it is a multiuser system, it has been quite
> hard to pinpoint the exact cause, but I think we've narrowed it down
> to two data import jobs that were running in semi-long transactions
> (clusters of row inserts).
>
> The tables affected by these inserts are used in common queries.
>
> The imports will bring in a row count of perhaps 10k on average
> covering 4 tables.
>
> The insert transactions are at isolation level read committed (the
> default for the JDBC driver).
>
> When the import would run (again, theory...we have not been able to
> reproduce), we would end up maxed out on CPU, with a load average of
> 50 for 16 CPUs (our normal busy usage is a load average of 5 out of
> 16 CPUs).
>
> When looking at the active queries, most of them are against the
> tables that are affected by these imports.

Which processes consume most CPU time? Are those backends executing the
queries, or some background processes (checkpointer, autovacuum, ...)?


The backends executing the queries...most of the queries that seem hung usually run in a few ms.
 
Can you post a "top -c" output collected at the time of the CPU peak?


I don't have process accounting, so I cannot regenerate that; however, I can tell you what queries were active during one of the incidents.

There were 36 queries against the table ind_event (which is one of the tables affected by the import). Those queries usually take 5-10ms, and we never see more than 2 active during normal operation. These had been active for _minutes_....a sample of the running queries:

time_active   |       datname       | procpid |   query                                      
-----------------+---------------------+---------+-------------------------------------------
 00:08:10.891105 | tudb                |    9058 | select * from mr_uss_ind_event_x where (tu
 00:08:10.981845 | tudb                |    8977 | select * from mr_uss_ind_event_x where (tu
 00:07:08.883347 | tudb                |    8930 | select * from mr_uss_ind_event_x where org
 00:07:15.266393 | tudb                |    8927 | select * from mr_uss_ind_event_x where org
 00:07:27.587133 | tudb                |   11867 | update msg_result set dt_result=$1,msg_id=
 00:08:06.458885 | tudb                |    8912 | select * from mr_uss_ind_event_x where org
 00:06:43.036487 | tudb                |    8887 | select * from mr_uss_ind_event_x where (tu
 00:07:01.992367 | tudb                |    8831 | select * from mr_uss_ind_event_x where (tu
 00:06:59.217721 | tudb                |    8816 | select * from mr_uss_ind_event_x where org
 00:07:07.558848 | tudb                |    8811 | update md_invoice set unbilled_amt=unbille
 00:07:30.636192 | tudb                |    8055 | select * from mr_uss_ind_event_x where (tu
 00:07:26.761053 | tudb                |    8053 | update msg_result set dt_result=$1,msg_id=
 00:06:46.021084 | tudb                |    8793 | select * from mr_uss_ind_event_x where (tu
 00:07:26.412781 | tudb                |    8041 | select * from mr_uss_ind_event_x where org
 00:07:43.315019 | tudb                |    8031 | select * from mr_uss_ind_event_x where org
 00:07:42.651143 | tudb                |    7990 | select * from mr_uss_ind_event_x where org
 00:06:45.258232 | tudb                |    7973 | select * from mr_uss_ind_event_x where (tu
 00:07:46.135027 | tudb                |    7961 | select * from mr_uss_ind_event_x where (tu
 00:07:31.814513 | tudb                |    7959 | select * from mr_uss_ind_event_x where (tu
 00:07:27.739959 | tudb                |    8221 | select * from mr_uss_ind_event_x where org
 00:07:21.554369 | tudb                |    8191 | select * from mr_uss_ind_event_x where org
 00:07:30.665133 | tudb                |    7953 | select * from mr_uss_ind_event_x where org
 00:07:17.727028 | tudb                |    7950 | select * from mr_uss_ind_event_x where org
 00:07:25.657611 | tudb                |    7948 | select * from mr_uss_ind_event_x where org
 00:07:28.118856 | tudb                |    7939 | select * from mr_uss_ind_event_x where org
 00:07:32.436705 | tudb                |    7874 | insert into mr_uss_ind_event (prelimtime_c
 00:08:12.090187 | tudb                |    7873 | select * from mr_uss_ind_event_x where (tu
 00:07:19.181981 | tudb                |    7914 | select * from mr_uss_ind_event_x where (tu
 00:07:04.234119 | tudb                |    7909 | select * from mr_uss_ind_event_x where (tu
 00:06:52.614609 | tudb                |    7856 | select * from mr_uss_ind_event_x where org
 00:07:18.667903 | tudb                |    7908 | select * from mr_uss_ind_event_x where (tu

The insert listed there is coming from that import...the others are querying a view that includes that table in a join.

Also, try to collect a few snapshots of pg_stat_bgwriter catalog before
and during the loads. Don't forget to include the timestamp:

   select now(), * from pg_stat_bgwriter;


This is a live production system, and it will take me some doing to generate a load on a test server that triggers the condition. I'll be certain to gather this and the other stats if I can trigger it.
 
and when you're at it, pg_stat_database snapshots might be handy too
(not sure if you're running a single database or multiple ones), so use
either

  select now(), * from pg_stat_database;

or

  select now(), * from pg_stat_database where datname = '..dbname..';

That should give us at least some insight into what's happening.

> Our workaround (that is holding at present) was to drop the
> transactions on those imports (which is not optimal, but fortunately
> is acceptable for this particular data). This workaround has
> prevented any further incidents, but is of course inconclusive.
>
> Does this sound familiar to anyone, and if so, please advise.

I'm wondering how this could be related to the transactions, and IIRC
the stats (e.g. # of inserted rows) are sent at commit time. That might
trigger the autovacuum. But without the transactions the autovacuum
would be triggered sooner ...

regards
Tomas



Re: CPU spikes and transactions

From:
Tony Kay
Date:
Thanks for the tip. I forgot there were kernel stats on spinlocks.

I'm not sure we'll be able to get it to tip in a test environment, and we're unwilling to revert the code in production in order to have our users trigger it. We'll try triggering it on our test server, and if we manage, I'll get you the stats.

Thanks!

Tony




On Tue, Oct 15, 2013 at 6:00 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Mon, Oct 14, 2013 at 6:45 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 15.10.2013 01:26, Tony Kay wrote:
>> Hi Calvin,
>>
>> Yes, I have sar data on all systems going back for years.
>>
>> Since others are going to probably want to be assured I am really
>> "reading the data" right:
>>
>> - This is 92% user CPU time, 5% sys, and 1% soft
>> - On some of the problems, I _do_ see a short spike of pgswpout's
>> (memory pressure), but again, not enough to end up using much system time
>> - The database disks are idle (all data being used is in RAM)..and are
>> SSDs....average service times are barely measurable in ms.
>
> OK. Can you share the data? Maybe we'll notice something suspicious.
>
>> If I had to guess, I'd say it was spinlock misbehavior....I cannot
>> understand why else a transaction blocking other things would drive
>> the CPUs so hard into the ground with user time.
>
> Have you tried running perf, to verify the time is actually spent on
> spinlocks?

+1 this.  It is almost certainly spinlocks, but we need to know which
one and why.  plz install debug symbols and run a perf during normal
and high load conditions.

merlin



Re: CPU spikes and transactions

From:
Julien Cigar
Date:
On Tue, Oct 15, 2013 at 08:59:08AM -0700, Tony Kay wrote:
> On Mon, Oct 14, 2013 at 4:42 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>
> > On 15.10.2013 01:00, Tony Kay wrote:
> > > Hi,
> > >
> > > I'm running 9.1.6 w/22GB shared buffers, and 32GB overall RAM on a
> > > 16 Opteron 6276 CPU box. We limit connections to roughly 120, but
> > > our webapp is configured to allocate a thread-local connection, so
> > > those connections are rarely doing anything more than half the time.
> >
> > Lower your shared buffers to about 20% of your RAM, unless you've tested
> > it's actually helping in your particular case. It's unlikely you'll get
> > better performance by using more than that, especially on older
> > versions, so it's wiser to leave the rest for page cache.
> >
> > It might even be one of the causes of the performance issue you're
> > seeing, as shared buffers are not exactly overhead-free.
> >
> > See this for more details on tuning:
> >
> >    http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
>
>
> I had followed the general directions from several sources years ago, which
> indicate up to 40% of RAM. We've been running very large shared buffers for

in general it's best to start with 10-15% of the RAM and no more than
2-4 GB

> 4 years now, but it is difficult to generate a good real load without
> testing against users, so we have not felt the need to move it around. In
> general, I don't tend to tinker with a setting that has been fine for this
> long without good reason. I've been wanting to upgrade to the newer
> mmap-based versions of pgsql, but was waiting to re-tune this when I did so.
>
> Why do you suspect that shared_buffers would cause the behavior I'm seeing?
>

for two reasons:

- some of the overhead of bgwriter and checkpoints is more or less linear
in the size of shared_buffers (for example, a large quantity of data
could be dirty when a checkpoint occurs).

- the OS cache is also being used for reads and writes; the larger
  shared_buffers is, the more you risk double buffering (the same blocks
  in the OS cache and in the database buffer cache).
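
(a cheap way to see how much of shared_buffers is dirty at checkpoint time is to enable checkpoint logging and watch the "wrote N buffers" lines in the log:)

   log_checkpoints = on    # each checkpoint then logs "checkpoint complete: wrote N buffers (X.X%) ..."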

>
> >
> >
> > You're on a rather old 9.1.x version, BTW. The last version in this
> > branch is 9.1.10 and there are some important security fixes (e.g. in
> > 9.1.9). Not sure if there are any fixes relevant to the performance
> > issue, though.
> >
> > An upgrade to 9.1.10 is planned.
>
>
> > A few initial questions:
> >
> > * What OS are we dealing with?
> >
>
> CentOS el6
>
>
> >
> > * So how many active connections are there on average (see
> >   pg_stat_activity for connections running queries)?
> >
>
> about 40-60
>
>
> >
> > * How much data are we talking about? In total and in the imports?
> >
>
> 80GB database. The imports are maybe 1-3 MB...often much smaller. 10k rows
> would be a probably average.
>
>
> >
> > > We have been running smoothly for over a year on this configuration,
> > > and recently started having huge CPU spikes that bring the system to
> > > its knees. Given that it is a multiuser system, it has been quite
> > > hard to pinpoint the exact cause, but I think we've narrowed it down
> > > to two data import jobs that were running in semi-long transactions
> > > (clusters of row inserts).
> > >
> > > The tables affected by these inserts are used in common queries.
> > >
> > > The imports will bring in a row count of perhaps 10k on average
> > > covering 4 tables.
> > >
> > > The insert transactions are at isolation level read committed (the
> > > default for the JDBC driver).
> > >
> > > When the import would run (again, theory...we have not been able to
> > > reproduce), we would end up maxed out on CPU, with a load average of
> > > 50 for 16 CPUs (our normal busy usage is a load average of 5 out of
> > > 16 CPUs).
> > >
> > > When looking at the active queries, most of them are against the
> > > tables that are affected by these imports.
> >
> > Which processes consume most CPU time? Are those backends executing the
> > queries, or some background processes (checkpointer, autovacuum, ...)?
> >
> >
> The backends executing the queries...most of the queries that seem hung
> usually run in a few ms.
>
>
> > Can you post a "top -c" output collected at the time of the CPU peak?
> >
> >
> Don't have process accounting, so I cannot regenerate that; however, I can
> tell you what queries were active at one of them.
>
> There were 36 of the queries agains table ind_event (which is one affected
> by the import). Those queries usually take 5-10ms, and we never see more
> than 2 active during normal operation. These had been active for
> _minutes_....a sample of the running queries:
>
> time_active   |       datname       | procpid |   query
> -----------------+---------------------+---------+-------------------------------------------
>  00:08:10.891105 | tudb                |    9058 | select * from mr_uss_ind_event_x where (tu
>  00:08:10.981845 | tudb                |    8977 | select * from mr_uss_ind_event_x where (tu
>  00:07:08.883347 | tudb                |    8930 | select * from mr_uss_ind_event_x where org
>  00:07:15.266393 | tudb                |    8927 | select * from mr_uss_ind_event_x where org
>  00:07:27.587133 | tudb                |   11867 | update msg_result set dt_result=$1,msg_id=
>  00:08:06.458885 | tudb                |    8912 | select * from mr_uss_ind_event_x where org
>  00:06:43.036487 | tudb                |    8887 | select * from mr_uss_ind_event_x where (tu
>  00:07:01.992367 | tudb                |    8831 | select * from mr_uss_ind_event_x where (tu
>  00:06:59.217721 | tudb                |    8816 | select * from mr_uss_ind_event_x where org
>  00:07:07.558848 | tudb                |    8811 | update md_invoice set unbilled_amt=unbille
>  00:07:30.636192 | tudb                |    8055 | select * from mr_uss_ind_event_x where (tu
>  00:07:26.761053 | tudb                |    8053 | update msg_result set dt_result=$1,msg_id=
>  00:06:46.021084 | tudb                |    8793 | select * from mr_uss_ind_event_x where (tu
>  00:07:26.412781 | tudb                |    8041 | select * from mr_uss_ind_event_x where org
>  00:07:43.315019 | tudb                |    8031 | select * from mr_uss_ind_event_x where org
>  00:07:42.651143 | tudb                |    7990 | select * from mr_uss_ind_event_x where org
>  00:06:45.258232 | tudb                |    7973 | select * from mr_uss_ind_event_x where (tu
>  00:07:46.135027 | tudb                |    7961 | select * from mr_uss_ind_event_x where (tu
>  00:07:31.814513 | tudb                |    7959 | select * from mr_uss_ind_event_x where (tu
>  00:07:27.739959 | tudb                |    8221 | select * from mr_uss_ind_event_x where org
>  00:07:21.554369 | tudb                |    8191 | select * from mr_uss_ind_event_x where org
>  00:07:30.665133 | tudb                |    7953 | select * from mr_uss_ind_event_x where org
>  00:07:17.727028 | tudb                |    7950 | select * from mr_uss_ind_event_x where org
>  00:07:25.657611 | tudb                |    7948 | select * from mr_uss_ind_event_x where org
>  00:07:28.118856 | tudb                |    7939 | select * from mr_uss_ind_event_x where org
>  00:07:32.436705 | tudb                |    7874 | insert into mr_uss_ind_event (prelimtime_c
>  00:08:12.090187 | tudb                |    7873 | select * from mr_uss_ind_event_x where (tu
>  00:07:19.181981 | tudb                |    7914 | select * from mr_uss_ind_event_x where (tu
>  00:07:04.234119 | tudb                |    7909 | select * from mr_uss_ind_event_x where (tu
>  00:06:52.614609 | tudb                |    7856 | select * from mr_uss_ind_event_x where org
>  00:07:18.667903 | tudb                |    7908 | select * from mr_uss_ind_event_x where (tu
>
> The insert listed there is coming from that import...the others are
> querying a view that includes that table in a join.
>
> Also, try to collect a few snapshots of pg_stat_bgwriter catalog before
> > and during the loads. Don't forget to include the timestamp:
> >
> >    select now(), * from pg_stat_bgwriter;
> >
> >
> This is a live production system, and it will take me some doing to
> generate a load on a test server that triggers the condition. I'll be
> certain to gather this and the other stats if I can trigger it.
>
>
> > and when you're at it, pg_stat_database snapshots might be handy too
> > (not sure if you're running a single database or multiple ones), so use
> > either
> >
> >   select now(), * from pg_stat_database;
> >
> > or
> >
> >   select now(), * from pg_stat_database where datname = '..dbname..';
> >
> > That should give us at least some insight into what's happening.
> >
> > > Our workaround (that is holding at present) was to drop the
> > > transactions on those imports (which is not optimal, but fortunately
> > > is acceptable for this particular data). This workaround has
> > > prevented any further incidents, but is of course inconclusive.
> > >
> > > Does this sound familiar to anyone, and if so, please advise.
> >
> > I'm wondering how this could be related to the transactions, and IIRC
> > the stats (e.g. # of inserted rows) are sent at commit time. That might
> > trigger the autovacuum. But without the transactions the autovacuum
> > would be triggered sooner ...
> >
> > regards
> > Tomas
> >
> >
> >

--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.


Re: CPU spikes and transactions

From:
Tony Kay
Date:

On Tue, Oct 15, 2013 at 10:26 AM, Julien Cigar <jcigar@ulb.ac.be> wrote:

for two reasons:

- some of the overhead of bgwriter and checkpoints is more or less linear
in the size of shared_buffers, for example it could be possible that a
large quantity of data could be dirty when a checkpoint occurs).

- the OS cache is also being used for reads and writes, the larger
  shared_buffers is, the more you risk double buffering (same blocks
  in the OS cache and in the database buffer cache).

Excellent. Thank you for the information. My suspicion has always been that shared_buffers is a "level 1 cache", so it seems to me that you'd want it to be large enough to hold your entire database if you could. However, I'm now realizing that I was _also_ assuming the IPC shared memory was being locked via mlock to prevent swapping it out, and I'm now getting the feeling that this isn't true, which means the double buffering could lead to swap space use under buffer cache pressure...which I do occasionally see in ways I had not expected.

We do streaming replication and also archive the WAL files for snapshot PITR backups, so I am intimately familiar with our write load, and I can say it is pretty low (we ship a 16MB WAL file about every 10-15 minutes during our busiest times).

That said, I can see how an import that is doing a bunch of writes could possibly spread those over a large area that would then consume a lot of CPU on the writer and checkpoint; however, I do not see how either of those would cause 40-60 different postgres backend processes (all running a normally "light" query) to spin off into oblivion unless the write workload is somehow threaded into the background workers (which I'm sure it isn't). So, I think we're still dealing with a spinlock issue.

We're going to upgrade to 9.1.10 (with debug symbols) Thursday night and add another 64GB of RAM. I'll tune shared_buffers down to 2GB at that time and bump effective_cache_size up at the same time. My large history of sar data will make it apparent pretty quickly if that is a win/lose/tie.
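
(Concretely, the planned postgresql.conf change is along these lines; the effective_cache_size figure is only a rough guess for the 96GB of RAM we'll have after the upgrade:)

   shared_buffers = 2GB
   effective_cache_size = 80GB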

If we have another spike in production, we'll be ready to measure it more accurately.

Thanks,

Tony

Re: CPU spikes and transactions

From:
Julien Cigar
Date:
On Tue, Oct 15, 2013 at 12:07:38PM -0700, Tony Kay wrote:
> On Tue, Oct 15, 2013 at 10:26 AM, Julien Cigar <jcigar@ulb.ac.be> wrote:
>
> >
> > for two reasons:
> >
> > - some of the overhead of bgwriter and checkpoints is more or less linear
> > in the size of shared_buffers, for example it could be possible that a
> > large quantity of data could be dirty when a checkpoint occurs).
> >
> > - the OS cache is also being used for reads and writes, the larger
> >   shared_buffers is, the more you risk double buffering (same blocks
> >   in the OS cache and in the database buffer cache).
> >
>
> Excellent. Thank you for the information. My suspicion has always been that
> the shared_buffers are "level 1 cache", so it seems to me that you'd want
> that to be large enough to hold your entire database if you could. However,
> I'm now realizing that I was _also_ assuming the IPC shared memory was also
> being locked via mlock to prevent swapping it out, and I'm now getting the

on FreeBSD you can use kern.ipc.shm_use_phys=1 to lock shared memory
pages in core (note that it's useless on 9.3 as mmap is now used to
allocate shared memory)

(and BTW I'm curious if someone has done benchmarks on FreeBSD + 9.3 +
mmap, because enabling kern.ipc.shm_use_phys leads to a 2-4x perf
improvement in some benchmarks)

> feeling that this isn't true, which means the double buffering could lead
> to swap space use on buffer cache pressure...which I do occasionally see in
> ways I had not expected.

you can't avoid double buffering sometimes; for example, if a block is
read from disk and has not been requested previously, it will first go to
the OS cache and then to the buffer cache. In an ideal world the blocks
that are most frequently used should be in the database buffer cache, and
the others in the OS cache.

>
> We do streaming replication and also store them for snapshot PITR backups,
> so I am intimately familiar with our write load, and I can say it is pretty
> low (we ship a 16MB WAL file about every 10-15 minutes during our busiest
> times).
>
> That said, I can see how an import that is doing a bunch of writes could
> possibly spread those over a large area that would then consume a lot of
> CPU on the writer and checkpoint; however, I do not see how either of those
> would cause 40-60 different postgres backgroud processes (all running a
> normally "light query") to spin off into oblivion unless the write work
> load is somehow threaded into the background workers (which I'm sure it
> isn't). So, I think we're still dealing with a spinlock issue.
>
> We're going to upgrade to 9.1.10 (with debug symbols) Thursday night and
> add another 64GB of RAM. I'll tune shared_buffers down to 2GB at that time
> and bump effective_cache_size up at the same time. My large history of sar
> data will make it apparent pretty quickly if that is a win/lose/tie.
>
> If we have another spike in production, we'll be ready to measure it more
> accurately.
>
> Thanks,
>
> Tony

--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.


Re: CPU spikes and transactions

From:
Tomas Vondra
Date:
On 15.10.2013 21:07, Tony Kay wrote:
>
> On Tue, Oct 15, 2013 at 10:26 AM, Julien Cigar <jcigar@ulb.ac.be
> <mailto:jcigar@ulb.ac.be>> wrote:
>
>
> for two reasons:
>
> - some of the overhead of bgwriter and checkpoints is more or less
> linear in the size of shared_buffers, for example it could be
> possible that a large quantity of data could be dirty when a
> checkpoint occurs).
>
> - the OS cache is also being used for reads and writes, the larger
> shared_buffers is, the more you risk double buffering (same blocks
> in the OS cache and in the database buffer cache).
>
>
> Excellent. Thank you for the information. My suspicion has always
> been that the shared_buffers are "level 1 cache", so it seems to me
> that you'd want that to be large enough to hold your entire database
> if you could. However, I'm now realizing that I was _also_ assuming
> the IPC shared memory was also being locked via mlock to prevent
> swapping it out, and I'm now getting the feeling that this isn't
> true, which means the double buffering could lead to swap space use
> on buffer cache pressure...which I do occasionally see in ways I had
> not expected.
>
> We do streaming replication and also store them for snapshot PITR
> backups, so I am intimately familiar with our write load, and I can
> say it is pretty low (we ship a 16MB WAL file about every 10-15
> minutes during our busiest times).
>
> That said, I can see how an import that is doing a bunch of writes
> could possibly spread those over a large area that would then
> consume a lot of CPU on the writer and checkpoint; however, I do not
> see how either of those would cause 40-60 different postgres
> backgroud processes (all running a normally "light query") to spin
> off into oblivion unless the write work load is somehow threaded into
> the background workers (which I'm sure it isn't). So, I think we're
> still dealing with a spinlock issue.
>
> We're going to upgrade to 9.1.10 (with debug symbols) Thursday night
> and add another 64GB of RAM. I'll tune shared_buffers down to 2GB at
> that time and bump effective_cache_size up at the same time. My
> large history of sar data will make it apparent pretty quickly if
> that is a win/lose/tie.

Don't be too aggressive, though. You haven't identified the bottleneck
yet and 2GB might be too low. For example, we're running 9.1.x too, and
we're generally quite happy with 10GB shared buffers (on machines with
96GB of RAM). So while 22GB is definitely too much, 2GB might be too
low, especially if you add more RAM to the machine.

What you may do is inspect the buffer cache with this contrib module:

  http://www.postgresql.org/docs/9.1/interactive/pgbuffercache.html
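
(On 9.1 the module is installed per database with:)

   CREATE EXTENSION pg_buffercache;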

Doing something as simple as this:

  select (reldatabase is not null), count(*)
    from pg_buffercache group by 1;

  select usagecount, count(*)
    from pg_buffercache where reldatabase is not null group by 1;

  select isdirty, count(*)
    from pg_buffercache where reldatabase is not null group by 1;

should tell you some very basic metrics, i.e. what portion of the buffers
you actually use, what the LRU usage count histogram looks like, and what
portion of shared buffers is dirty.

Again, you'll have to run this repeatedly during the import job, to get
an idea of what's going on and size the shared buffers reasonably for
your workload.

Be careful - this needs to acquire some locks to get a consistent
result, so don't run that too frequently.

Tomas


Re: CPU spikes and transactions

From:
Merlin Moncure
Date:
On Tue, Oct 15, 2013 at 12:26 PM, Julien Cigar <jcigar@ulb.ac.be> wrote:
> On Tue, Oct 15, 2013 at 08:59:08AM -0700, Tony Kay wrote:
>> On Mon, Oct 14, 2013 at 4:42 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>
>> > On 15.10.2013 01:00, Tony Kay wrote:
>> > > Hi,
>> > >
>> > > I'm running 9.1.6 w/22GB shared buffers, and 32GB overall RAM on a
>> > > 16 Opteron 6276 CPU box. We limit connections to roughly 120, but
>> > > our webapp is configured to allocate a thread-local connection, so
>> > > those connections are rarely doing anything more than half the time.
>> >
>> > Lower your shared buffers to about 20% of your RAM, unless you've tested
>> > it's actually helping in your particular case. It's unlikely you'll get
>> > better performance by using more than that, especially on older
>> > versions, so it's wiser to leave the rest for page cache.
>> >
>> > It might even be one of the causes of the performance issue you're
>> > seeing, as shared buffers are not exactly overhead-free.
>> >
>> > See this for more details on tuning:
>> >
>> >    http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
>>
>>
>> I had followed the general directions from several sources years ago, which
>> indicate up to 40% of RAM. We've been running very large shared buffers for
>
> in general it's best to start with 10-15% of the RAM and no more then
> 2-4 GB
>
>> 4 years now, but it is difficult to generate a good real load without
>> testing against users, so we have not felt the need to move it around. In
>> general, I don't tend to tinker with a setting that has been fine for this
>> long without good reason. I've been wanting to upgrade to the newer
>> mmap-based versions of pgsql, but was waiting to re-tune this when I did so.
>>
>> Why do you suspect that shared_buffers would cause the behavior I'm seeing?
>>
>
> for two reasons:
>
> - some of the overhead of bgwriter and checkpoints is more or less linear
> in the size of shared_buffers, for example it could be possible that a
> large quantity of data could be dirty when a checkpoint occurs).
>
> - the OS cache is also being used for reads and writes, the larger
>   shared_buffers is, the more you risk double buffering (same blocks
>   in the OS cache and in the database buffer cache).

That's good reasoning but is not related to the problem faced by the
OP.  The real reason why I recommend keeping shared buffers at 2GB max,
always, is that we have major contention issues which we presume are in
the buffer area (either in the mapping or in the clock sweep) but could
be something else entirely.  These issues tend to show up on fast
machines in all- or mostly-read workloads.

We are desperate for profiles demonstrating the problem in production
workloads.  If OP is willing to install and run perf in production
(which is not a bad idea anyways), then my advice is to change nothing
until we have a chance to grab a profile.  These types of problems are
notoriously difficult to reproduce in test environments.

merlin


Re: CPU spikes and transactions

From:
Dave Owens
Date:
Hi,

Apologies for resurrecting this old thread, but it seems like this is better than starting a new conversation.

We are now running 9.1.13 and have doubled the CPU and memory.  So 2x 16-core Opteron 6276 (32 cores total), and 64GB memory.  shared_buffers is set to 20GB, effective_cache_size set to 40GB.

We were able to record perf data during the latest incident of high CPU utilization. perf report is below:

Samples: 31M of event 'cycles', Event count (approx.): 16289978380877 
 44.74%       postmaster  [kernel.kallsyms]             [k] _spin_lock_irqsave                                     
 15.03%       postmaster  postgres                      [.] 0x00000000002ea937                                     
  3.14%       postmaster  postgres                      [.] s_lock                                                 
  2.30%       postmaster  [kernel.kallsyms]             [k] compaction_alloc                                       
  2.21%       postmaster  postgres                      [.] HeapTupleSatisfiesMVCC                                 
  1.75%       postmaster  postgres                      [.] hash_search_with_hash_value                            
  1.25%       postmaster  postgres                      [.] ExecScanHashBucket                                     
  1.20%       postmaster  postgres                      [.] SHMQueueNext                                           
  1.05%       postmaster  postgres                      [.] slot_getattr                                           
  1.04%             init  [kernel.kallsyms]             [k] native_safe_halt                                       
  0.73%       postmaster  postgres                      [.] LWLockAcquire                                          
  0.59%       postmaster  [kernel.kallsyms]             [k] page_fault                                             
  0.52%       postmaster  postgres                      [.] ExecQual                                               
  0.40%       postmaster  postgres                      [.] ExecStoreTuple                                         
  0.38%       postmaster  postgres                      [.] ExecScan                                               
  0.37%       postmaster  postgres                      [.] check_stack_depth                                      
  0.35%       postmaster  postgres                      [.] SearchCatCache                                         
  0.35%       postmaster  postgres                      [.] CheckForSerializableConflictOut                        
  0.34%       postmaster  postgres                      [.] LWLockRelease                                          
  0.30%       postmaster  postgres                      [.] _bt_checkkeys                                          
  0.28%       postmaster  libc-2.12.so                  [.] memcpy                                                 
  0.27%       postmaster  [kernel.kallsyms]             [k] get_pageblock_flags_group                              
  0.27%       postmaster  postgres                      [.] int4eq                                                 
  0.27%       postmaster  postgres                      [.] heap_page_prune_opt                                    
  0.27%       postmaster  postgres                      [.] pgstat_init_function_usage                             
  0.26%       postmaster  [kernel.kallsyms]             [k] _spin_lock                                             
  0.25%       postmaster  postgres                      [.] _bt_compare                                            
  0.24%       postmaster  postgres                      [.] pgstat_end_function_usage

...please let me know if we need to produce the report differently to be useful.

We will begin reducing shared_buffers incrementally over the coming days.


Dave Owens


On Tue, Oct 15, 2013 at 8:14 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Tue, Oct 15, 2013 at 12:26 PM, Julien Cigar <jcigar@ulb.ac.be> wrote:
> On Tue, Oct 15, 2013 at 08:59:08AM -0700, Tony Kay wrote:
>> On Mon, Oct 14, 2013 at 4:42 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>
>> > On 15.10.2013 01:00, Tony Kay wrote:
>> > > Hi,
>> > >
>> > > I'm running 9.1.6 w/22GB shared buffers, and 32GB overall RAM on a
>> > > 16 Opteron 6276 CPU box. We limit connections to roughly 120, but
>> > > our webapp is configured to allocate a thread-local connection, so
>> > > those connections are rarely doing anything more than half the time.
>> >
>> > Lower your shared buffers to about 20% of your RAM, unless you've tested
>> > it's actually helping in your particular case. It's unlikely you'll get
>> > better performance by using more than that, especially on older
>> > versions, so it's wiser to leave the rest for page cache.
>> >
>> > It might even be one of the causes of the performance issue you're
>> > seeing, as shared buffers are not exactly overhead-free.
>> >
>> > See this for more details on tuning:
>> >
>> >    http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
>>
>>
>> I had followed the general directions from several sources years ago, which
>> indicate up to 40% of RAM. We've been running very large shared buffers for
>
> in general it's best to start with 10-15% of the RAM and no more then
> 2-4 GB
>
>> 4 years now, but it is difficult to generate a good real load without
>> testing against users, so we have not felt the need to move it around. In
>> general, I don't tend to tinker with a setting that has been fine for this
>> long without good reason. I've been wanting to upgrade to the newer
>> mmap-based versions of pgsql, but was waiting to re-tune this when I did so.
>>
>> Why do you suspect that shared_buffers would cause the behavior I'm seeing?
>>
>
> for two reasons:
>
> - some of the overhead of bgwriter and checkpoints is more or less linear
> in the size of shared_buffers, for example it could be possible that a
> large quantity of data could be dirty when a checkpoint occurs).
>
> - the OS cache is also being used for reads and writes, the larger
>   shared_buffers is, the more you risk double buffering (same blocks
>   in the OS cache and in the database buffer cache).

That's good reasoning but is not related to the problem faced by the
OP.  The real reason why I recommend to keep shared buffers at max
2GB, always, is because we have major contention issues which we
presume are in the buffer area (either in the mapping or in the clock
sweep) but could be something else entirely.  These issues tend to
show up on fast machines in all- or mostly- read workloads.

We are desperate for profiles demonstrating the problem in production
workloads.  If OP is willing to install and run perf in production
(which is not a bad idea anyways), then my advice is to change nothing
until we have a chance to grab a profile.  These types of problems are
notoriously difficult to reproduce in test environments.

merlin



Re: CPU spikes and transactions

From:
Merlin Moncure
Date:
On Tue, May 13, 2014 at 6:04 PM, Dave Owens <dave@teamunify.com> wrote:
Hi,

Apologies for resurrecting this old thread, but it seems like this is better than starting a new conversation.

We are now running 9.1.13 and have doubled the CPU and memory.  So 2x 16 Opteron 6276 (32 cores total), and 64GB memory.  shared_buffers set to 20G, effective_cache_size set to 40GB.

We were able to record perf data during the latest incident of high CPU utilization. perf report is below:

Samples: 31M of event 'cycles', Event count (approx.): 16289978380877 
 44.74%       postmaster  [kernel.kallsyms]             [k] _spin_lock_irqsave                                     
 15.03%       postmaster  postgres                      [.] 0x00000000002ea937                                     
  3.14%       postmaster  postgres                      [.] s_lock                                                 
  2.30%       postmaster  [kernel.kallsyms]             [k] compaction_alloc                                       
  2.21%       postmaster  postgres                      [.] HeapTupleSatisfiesMVCC                                 
  1.75%       postmaster  postgres                      [.] hash_search_with_hash_value                            
  1.25%       postmaster  postgres                      [.] ExecScanHashBucket                                     
  1.20%       postmaster  postgres                      [.] SHMQueueNext                                           
  1.05%       postmaster  postgres                      [.] slot_getattr                                           
  1.04%             init  [kernel.kallsyms]             [k] native_safe_halt                                       
  0.73%       postmaster  postgres                      [.] LWLockAcquire                                          
  0.59%       postmaster  [kernel.kallsyms]             [k] page_fault                                             
  0.52%       postmaster  postgres                      [.] ExecQual                                               
  0.40%       postmaster  postgres                      [.] ExecStoreTuple                                         
  0.38%       postmaster  postgres                      [.] ExecScan                                               
  0.37%       postmaster  postgres                      [.] check_stack_depth                                      
  0.35%       postmaster  postgres                      [.] SearchCatCache                                         
  0.35%       postmaster  postgres                      [.] CheckForSerializableConflictOut                        
  0.34%       postmaster  postgres                      [.] LWLockRelease                                          
  0.30%       postmaster  postgres                      [.] _bt_checkkeys                                          
  0.28%       postmaster  libc-2.12.so                  [.] memcpy                                                 
  0.27%       postmaster  [kernel.kallsyms]             [k] get_pageblock_flags_group                              
  0.27%       postmaster  postgres                      [.] int4eq                                                 
  0.27%       postmaster  postgres                      [.] heap_page_prune_opt                                    
  0.27%       postmaster  postgres                      [.] pgstat_init_function_usage                             
  0.26%       postmaster  [kernel.kallsyms]             [k] _spin_lock                                             
  0.25%       postmaster  postgres                      [.] _bt_compare                                            
  0.24%       postmaster  postgres                      [.] pgstat_end_function_usage

...please let me know if we need to produce the report differently to be useful.

We will begin reducing shared_buffers incrementally over the coming days.


This is definitely pointing at THP compaction, which is increasingly emerging as a possible culprit for suddenly occurring (and just as suddenly resolving) CPU spikes.  The evidence I see is:

*) Lots of time in kernel
*) "compaction_alloc"
*) otherwise normal postgres profile (not lots of time in s_lock, LWLock, or other weird things)


Please check the value of THP (see here: http://structureddata.org/2012/06/18/linux-6-transparent-huge-pages-and-hadaoop-workloads/, which covers Hadoop and various other workloads).   If it is enabled, consider disabling it...this will revert to pre-linux-6 behavior.  If you are going to attack this from the point of view of lowering shared buffers, do not bother with incremental steps...head straight for 2GB or it's unlikely the problem will be fixed.   THP compaction is not a postgres problem...mysql is affected, as are other server platforms.  If THP is indeed causing the problem, it couldn't hurt to get on the horn with the linux guys.  Last I heard they claimed this kind of thing was fixed, but I don't know where things stand now.
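
(to check and disable it, something like the following; note that the sysfs path varies by kernel, and RHEL/CentOS 6 kernels use "redhat_transparent_hugepage" instead of "transparent_hugepage":)

   cat /sys/kernel/mm/transparent_hugepage/enabled     # the bracketed value is the current setting
   echo never > /sys/kernel/mm/transparent_hugepage/enabled
   echo never > /sys/kernel/mm/transparent_hugepage/defrag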

merlin

Re: CPU spikes and transactions

From:
Jeff Janes
Date:
On Tue, May 13, 2014 at 4:04 PM, Dave Owens <dave@teamunify.com> wrote:
Hi,

Apologies for resurrecting this old thread, but it seems like this is better than starting a new conversation.

We are now running 9.1.13 and have doubled the CPU and memory.  So 2x 16 Opteron 6276 (32 cores total), and 64GB memory.  shared_buffers set to 20G, effective_cache_size set to 40GB.

We were able to record perf data during the latest incident of high CPU utilization. perf report is below:

Samples: 31M of event 'cycles', Event count (approx.): 16289978380877 
 44.74%       postmaster  [kernel.kallsyms]             [k] _spin_lock_irqsave                                     
 15.03%       postmaster  postgres                      [.] 0x00000000002ea937                                     
  3.14%       postmaster  postgres                      [.] s_lock                                                 
  2.30%       postmaster  [kernel.kallsyms]             [k] compaction_alloc                                       
  2.21%       postmaster  postgres                      [.] HeapTupleSatisfiesMVCC                                 


compaction_alloc points to the "transparent huge pages" kernel problem, while HeapTupleSatisfiesMVCC points to the problem of each backend taking the ProcArrayLock for every not-yet-committed tuple it encounters.  I don't know which of those leads to the _spin_lock_irqsave.  It seems more likely to be transparent huge pages that does that, but perhaps both of them do.

If it is the former, you can find other messages on this list about disabling it.  If it is the latter, your best bet is to commit your bulk inserts as soon as possible (this might be improved for 9.5, if we can figure out how to test the alternatives). Please let us know what works.

If lowering shared_buffers works, I wonder if disabling the transparent huge page compaction issue might let you bring shared_buffers back up again.  


Cheers,

Jeff