Discussion: Unusually high IO for autovacuum worker
Hi everyone,

We are experiencing a strange situation after upgrading our Postgres from 8.4 to 9.1 and our OS from Debian 6 to Ubuntu 12.04 LTS.

We have an application that collects timestamp-based data. Since a month of data runs over 30M rows (and grows each month), we decided to partition the data on a year-and-month basis. We basically have one master table and partition tables like master_y2013m01 and so on.

Because of our application stack (Hibernate ORM), we use a before-insert trigger to insert rows into the right partition and into the master table, and then an after-insert trigger to delete them from the master table (we need the original insert to return the inserted row info). This means the master table quickly fills with dead tuples and triggers autovacuum frequently (we are using the default autovacuum settings).

After upgrading our DB from 8.4 to 9.1 (and moving from Debian 6 to Ubuntu 12.04) we noticed a large increase in IO and traced it back to autovacuum: iotop and collectd report bursts of 4-6 MB/s lasting 90 seconds, followed by a 30-second pause. That's a lot of data to chew on, especially since the master table is only a couple of megabytes (indexes included).

Can anyone please explain why this is happening and how we can get around it?

Thank you,
Vlad
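For reference, a minimal sketch of such a trigger pair, assuming a hypothetical master table with an id primary key and a ts timestamp column (all names here are illustrative, not from the actual schema):

CREATE OR REPLACE FUNCTION master_before_insert() RETURNS trigger AS $$
BEGIN
    -- Route the row into the partition matching its timestamp,
    -- e.g. master_y2013m01 for January 2013.
    EXECUTE format('INSERT INTO master_y%sm%s SELECT ($1).*',
                   to_char(NEW.ts, 'YYYY'), to_char(NEW.ts, 'MM'))
        USING NEW;
    -- Returning NEW lets the original INSERT proceed on the master
    -- table, so the client (Hibernate) still gets the inserted row back.
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION master_after_insert() RETURNS trigger AS $$
BEGIN
    -- Remove the duplicate row from the master table only (not the
    -- children); this is what generates the stream of dead tuples.
    DELETE FROM ONLY master WHERE id = NEW.id;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER master_bi BEFORE INSERT ON master
    FOR EACH ROW EXECUTE PROCEDURE master_before_insert();

CREATE TRIGGER master_ai AFTER INSERT ON master
    FOR EACH ROW EXECUTE PROCEDURE master_after_insert();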
2013/2/1 Vlad Bailescu <vlad@mojitosoftware.com>:
> After upgrading our DB from 8.4 to 9.1 (and moving from Debian 6 to Ubuntu
> 12.04) we noticed a large increase in IO and traced it back to autovacuum:
> iotop and collectd report bursts of 4-6 MB/s lasting 90 seconds, followed
> by a 30-second pause.

We had a similar problem - you can try moving the statistics file to a ramdisk by setting stats_temp_directory to somewhere on the ramdisk:

http://www.postgresql.org/message-id/CAFj8pRCRPv3i-9WwsSCa=my9M+J5B7K1i1c7TWi0pFLRyjkpWQ@mail.gmail.com

Regards

Pavel Stehule
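A hedged sketch of what that could look like; the mount point and tmpfs size are assumptions, and stats_temp_directory takes effect on reload, no restart needed:

# /etc/fstab -- back the stats directory with RAM (illustrative path/size)
tmpfs  /var/run/postgresql/pg_stat_tmp  tmpfs  size=64M,uid=postgres,gid=postgres,mode=0700  0  0

# postgresql.conf
stats_temp_directory = '/var/run/postgresql/pg_stat_tmp'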
On Fri, Feb 1, 2013 at 1:44 PM, Vlad Bailescu <vlad@mojitosoftware.com> wrote:
> Can anyone please explain why this is happening and how we can get around
> it?

Do you know for sure that it's the master table that's generating all the vacuum traffic? What about the partition tables - do they get any updates/deletes? It might be useful to turn autovacuum logging ON, see which tables are getting vacuumed, and correlate that with the increased IO activity you're seeing.

Did you change any other parameters, like checkpoint timeout/segments etc.? It might be worthwhile to log checkpoint activity as well, to be doubly sure.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
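Both kinds of logging can be enabled without a restart; a sketch of the relevant postgresql.conf settings (the threshold of 0 is just a starting point that logs everything):

# postgresql.conf
log_autovacuum_min_duration = 0   # log every autovacuum/autoanalyze action
log_checkpoints = on              # log checkpoint starts, ends and timings

followed by a reload, e.g. SELECT pg_reload_conf();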
On Fri, Feb 1, 2013 at 11:01 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Do you know for sure that it's the master table that's generating all the
> vacuum traffic? What about the partition tables - do they get any
> updates/deletes?

I'm pretty sure the IO is from the autovacuum on the master table, because its last_autovacuum stat updates almost every minute and iotop shows something like:

Total DISK READ: 5.80 M/s | Total DISK WRITE: 115.85 K/s
  TID  PRIO  USER      DISK READ  DISK WRITE  SWAPIN    IO>    COMMAND
 7681  be/4  postgres   5.93 M/s    0.00 B/s  0.00 %  63.62 %  postgres: autovacuum worker process  fleet
15837  be/4  postgres   0.00 B/s   23.17 K/s  0.00 %   4.98 %  postgres: fleet fleet 127.0.0.1(53537) idle
16683  be/4  postgres   0.00 B/s   23.17 K/s  0.00 %   2.68 %  postgres: fleet fleet 127.0.0.1(53978) idle
18599  be/4  postgres   0.00 B/s    7.72 K/s  0.00 %   1.25 %  postgres: fleet fleet 127.0.0.1(59529) idle
15608  be/4  postgres   0.00 B/s   23.17 K/s  0.00 %   0.04 %  postgres: fleet fleet 127.0.0.1(53349) idle
10253  be/4  postgres   0.00 B/s    7.72 K/s  0.00 %   0.01 %  postgres: fleet fleet 127.0.0.1(50639) idle
16524  be/4  postgres   0.00 B/s    7.72 K/s  0.00 %   0.01 %  postgres: fleet fleet 127.0.0.1(53939) idle
10248  be/4  postgres   0.00 B/s    7.72 K/s  0.00 %   0.01 %  postgres: fleet fleet 127.0.0.1(50634) idle
18606  be/4  postgres   0.00 B/s    7.72 K/s  0.00 %   0.01 %  postgres: fleet fleet 127.0.0.1(59535) idle
14849  be/4  postgres   0.00 B/s    7.72 K/s  0.00 %   0.01 %  postgres: fleet fleet 127.0.0.1(52832) idle
30308  be/4  postgres   0.00 B/s    7.72 K/s  0.00 %   0.01 %  postgres: wal writer process
21528  be/4  www-data   0.00 B/s    3.86 K/s  0.00 %   0.00 %  nginx: worker process
30307  be/4  postgres   0.00 B/s   38.62 K/s  0.00 %   0.00 %  postgres: writer process

I just checked and our config only sets:

max_connections = 210
superuser_reserved_connections = 10
wal_buffers = 8MB
checkpoint_segments = 16
checkpoint_completion_target = 0.9
random_page_cost = 3.0
shared_buffers = 9GB
effective_cache_size = 28GB
work_mem = 88MB
maintenance_work_mem = 1GB

We're running on a server with 40GB RAM and a RAID 10 over six 10k HDDs.

Stats show only three tables get frequently autovacuumed:

Table: vehicle_position
  Tuples inserted / updated / deleted:  4689127 / 0 / 4689127 (HOT updated: 0)
  Live / dead tuples:                   0 / 1985
  Last autovacuum:                      2013-02-01 11:46:46.441227+02
  Last analyze:                         2013-01-31 14:48:00.775864+02
  Last autoanalyze:                     2013-02-01 11:46:46.494234+02
  Vacuum / autovacuum / analyze / autoanalyze counters: 0 / 3761 / 1 / 3754
  Size:                                 4016 kB

Table: vehicle
  Tuples inserted / updated / deleted:  2124 / 934640 / 0 (HOT updated: 924003)
  Live / dead tuples:                   2124 / 390
  Last autovacuum:                      2013-02-01 11:47:46.067695+02
  Last autoanalyze:                     2013-02-01 11:47:46.325444+02
  Vacuum / autovacuum / analyze / autoanalyze counters: 0 / 1293 / 0 / 2038
  Size:                                 2448 kB

Table: input_current
  Tuples inserted / updated / deleted:  10032 / 1649206 / 0 (HOT updated: 1635902)
  Live / dead tuples:                   10032 / 1728
  Last autovacuum:                      2013-02-01 11:45:46.0681+02
  Last autoanalyze:                     2013-02-01 11:45:46.215048+02
  Vacuum / autovacuum / analyze / autoanalyze counters: 0 / 442 / 0 / 1294
  Size:                                 2336 kB
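For reference, the same numbers can be pulled with a query along these lines (columns as in 9.1's pg_stat_user_tables; the table names are the ones from this thread):

SELECT relname,
       n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd,
       n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, last_analyze, last_autoanalyze,
       vacuum_count, autovacuum_count, analyze_count, autoanalyze_count,
       pg_size_pretty(pg_relation_size(relid)) AS size
FROM pg_stat_user_tables
WHERE relname IN ('vehicle_position', 'vehicle', 'input_current');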
On Fri, Feb 1, 2013 at 3:24 PM, Vlad Bailescu <vlad@mojitosoftware.com> wrote:
> Stats show only three tables get frequently autovacuumed:
>
> vehicle_position [...] 0 3761 1 3754 4016 kB
> vehicle [...] 0 1293 0 2038 2448 kB
> input_current [...] 0 442 0 1294 2336 kB

So there is surely more than just one table involved here. Also, the last column in the output above shows that these are fairly large tables. In fact, if the last column is to be trusted, they are as large as 37GB, 20GB and 12GB respectively. Note the kB at the end of those values. Also, given the large number of inserts/deletes, I would doubt the table can really remain that small. Maybe auto-vacuum is truncating most of it, and that's why you see only a few MBs before it swells again by the time the next vacuum cycle runs.

What's your rate of insertion/deletion in the first table? You also seem to be updating the last two tables quite heavily, but most of those are HOT updates, so your tables should not bloat too much.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
On Fri, Feb 1, 2013 at 1:08 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> In fact, if the last column is to be trusted, they are as large as 37GB,
> 20GB and 12GB respectively. Note the kB at the end of those values.

Pavan, it seems there's a small issue with copy-paste and column text alignment. The table sizes are:

4136 kB
2448 kB
2336 kB

The actual partition tables are very large (e.g. vehicle_position_y2013_m01 is 6448 MB), but they don't seem to receive so much attention. The vehicle table gets updated very often because it has a foreign key to vehicle_position, and that changes with every batch of data.

We were able to reproduce this on a test VM with a minimal workload to trigger autovacuum.
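An unambiguous way to check the sizes, sidestepping the column-alignment problem (pg_total_relation_size counts indexes and TOAST on top of the heap):

SELECT relname,
       pg_size_pretty(pg_relation_size(oid))       AS heap_size,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relname IN ('vehicle_position', 'vehicle', 'input_current',
                  'vehicle_position_y2013_m01');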
Here's the relevant log portion:

2012-12-05 00:32:57 EET LOG:  database system is ready to accept connections
2012-12-05 00:32:57 EET LOG:  autovacuum launcher started
2012-12-05 00:36:38 EET LOG:  automatic analyze of table "fleet.fleet.vehicle_position" system usage: CPU 0.00s/0.00u sec elapsed 0.03 sec
2012-12-05 00:37:39 EET LOG:  automatic analyze of table "fleet.fleet.vehicle_position_default" system usage: CPU 0.00s/0.15u sec elapsed 0.99 sec
2012-12-05 00:44:23 EET LOG:  automatic analyze of table "fleet.fleet.vehicle_position" system usage: CPU 4.46s/0.61u sec elapsed 465.09 sec
2012-12-05 00:44:38 EET LOG:  automatic vacuum of table "fleet.fleet.vehicle_position": index scans: 1
        pages: 8 removed, 0 remain
        tuples: 64 removed, 0 remain
        system usage: CPU 0.00s/0.00u sec elapsed 0.02 sec
2012-12-05 00:44:38 EET LOG:  automatic analyze of table "fleet.fleet.vehicle_position" system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec
2012-12-05 00:51:40 EET LOG:  automatic analyze of table "fleet.fleet.vehicle_position" system usage: CPU 4.34s/0.53u sec elapsed 422.20 sec
2012-12-05 00:52:38 EET LOG:  automatic vacuum of table "fleet.fleet.vehicle_position": index scans: 1
        pages: 10 removed, 0 remain
        tuples: 30 removed, 0 remain
        system usage: CPU 0.00s/0.00u sec elapsed 0.02 sec
2012-12-05 00:52:38 EET LOG:  automatic analyze of table "fleet.fleet.vehicle_position" system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec

During the analyze the IO was ~800-1000 K/s.

Vlad
On Fri, Feb 1, 2013 at 5:19 PM, Vlad Bailescu <vlad@mojitosoftware.com> wrote:
> Pavan, it seems there's a small issue with copy-paste and column
> text alignment. The table sizes are:
>
> 4136 kB, 2448 kB, 2336 kB

Ah OK. I see.

> 2012-12-05 00:44:23 EET LOG:  automatic analyze of table
> "fleet.fleet.vehicle_position" system usage: CPU 4.46s/0.61u sec elapsed
> 465.09 sec

This is the interesting piece of information. So it's the auto-analyze that's causing all the IO activity. That explains why it was read-only IO that we noticed earlier. What's happening here, and something that changed from 8.4 to 9.1, is that whenever the parent table is analyzed, the child tables are also automatically analyzed. I don't remember the rationale for this change, but in your case the analyze on the parent table itself is quite useless, because even though you are inserting a large number of new tuples, you are also immediately deleting them. I don't want to comment on the design aspect of that, but you should be able to fix this problem by disabling auto-analyze on the parent table.

Having said that, I don't see an easy way to disable just auto-analyze on a table. You can run ALTER TABLE foo SET (autovacuum_enabled = false), but that would also disable auto-vacuum, which you certainly don't want to do, because the parent table would just keep growing. You can set autovacuum_analyze_threshold to an artificially high value to mitigate the problem and reduce the frequency of auto-analyze on the table, or see if you can completely avoid insert/delete on the parent table.

ALTER TABLE vehicle_position SET (autovacuum_analyze_threshold = 2000000000);

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
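Applied to this case, with a check that the per-table option actually stuck (the threshold value is arbitrary, just "large enough to never fire"):

ALTER TABLE vehicle_position
    SET (autovacuum_analyze_threshold = 2000000000);

-- Verify the stored reloption:
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'vehicle_position';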
Vlad Bailescu <vlad@mojitosoftware.com> wrote:
> This means the master table becomes quickly populated with dead
> tuples and triggers the autovacuum frequently (we are using
> default settings for autovacuum).
>
> Can anyone please explain why this is happening and how we can
> get around it?

It sounds like it could be an issue which will be fixed in the next minor release:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5454344b968d6a189219cfd49af609a3e7d6af33

If this is the issue, a periodic manual VACUUM of the master table should help for now. When 9.1.8 is released you might want to upgrade without a lot of delay, and see if you are then OK without the manual vacuums.

-Kevin
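The interim manual vacuum could be as simple as a crontab entry for the postgres user along these lines (database and table names are the ones from this thread; the five-minute interval is a guess to match the autovacuum cadence; note that VACUUM on the parent does not recurse into the children):

*/5 * * * * psql -d fleet -c 'VACUUM vehicle_position;'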
On Fri, Feb 1, 2013 at 2:40 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> You can set autovacuum_analyze_threshold to an artificially high value
> to mitigate the problem and reduce the frequency of auto-analyze on the
> table, or see if you can completely avoid insert/delete on the parent
> table.

Great insight, Pavan. I see this change in ANALYZE has been documented since 9.0:

> If the table being analyzed has one or more children, ANALYZE will gather
> statistics twice: once on the rows of the parent table only, and a second
> time on the rows of the parent table with all of its children. The
> autovacuum daemon, however, will only consider inserts or updates on the
> parent table when deciding whether to trigger an automatic analyze. If that
> table is rarely inserted into or updated, the inheritance statistics will
> not be up to date unless you run ANALYZE manually.

We will try setting a higher threshold for autovacuum on the parent table.

As for the design, we could have gone this way or used a rule to redirect inserts from master to children (sketched below). My understanding is that performance for rules is not so good (especially since we're doing mostly single INSERTs). Unfortunately, the normal trigger approach will not work for us at the moment.

I saw your proposal on -hackers; I will keep an eye on it!

Thank you,
Vlad
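For completeness, the rule-based alternative would look roughly like this, one rule per partition (a sketch only: the ts column and the date bounds are assumptions, and a conditional rule like this cannot carry a RETURNING clause, which is the other reason it doesn't fit the Hibernate use case):

CREATE RULE vehicle_position_insert_y2013m01 AS
    ON INSERT TO vehicle_position
    WHERE NEW.ts >= DATE '2013-01-01' AND NEW.ts < DATE '2013-02-01'
    DO INSTEAD
        -- Redirect the row straight into the partition; nothing ever
        -- lands in the parent, so no dead tuples accumulate there.
        INSERT INTO vehicle_position_y2013_m01 VALUES (NEW.*);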