Discussion: [PG 8.1.0 / AIX 5.3] Vacuum processes freezing
Hello everybody,

We're using PostgreSQL 8.1.0 on AIX 5.3 through NFS (a Netapp filer hosts the database files), and we're encountering some issues with vacuums. The PostgreSQL binaries are built with xlc 6 (C for AIX Compiler 6.0.0.6) on AIX 5.2 (yes, I know, building on 5.2 and running on 5.3 is not the best way to avoid bugs...).

We have strong performance constraints with this database, so we scheduled vacuums with a crontab:

- Frequent VACUUM ANALYZE on some heavily-updated tables (few rows, but a lot of INSERT/UPDATE/DELETE). The frequency varies between 5 and 15 minutes.
- A nightly (not FULL) vacuum of the entire database.

We don't use autovacuum or FULL vacuum because of the high availability needed for this database. We prefer to keep it under control.

For some weeks now, the amount of data hosted by the database has been growing, and, some nights, the database vacuum seems to "freeze" during its execution. In verbose mode, the logs show the vacuum cleaning up a table (not always the same table), and... no more news. The system shows a vacuum process which seems to be sleeping (no CPU used, no memory consumption...). In addition, the logs of our application show that database transactions seem to be slower.

For some internal reasons, the only workaround available to us is to shut down the application and the database. After a full restart, things are OK.

Some questions:

1) During the nightly database vacuum, some other vacuums run concurrently (the vacuums on the heavily-updated tables). Can this concurrency cause deadlocks? We're planning to patch our shell scripts to avoid this concurrency.

2) I believed that the poor performance during the vacuum freeze was due to obsolete data statistics. But after a full restart of the database, performance is good. Does PostgreSQL rebuild its statistics during startup?

3) Can the freeze be explained by a bad database configuration? For instance, PostgreSQL running out of connections, or whatever, causing the vacuum process to wait for free resources?

4) This morning, just before the database vacuum froze, the logs showed this error:

<2007-06-13 03:20:35 DFT%>ERROR: could not open relation 16391/16394/107937: A system call received an interrupt.
<2007-06-13 03:20:35 DFT%>CONTEXT: writing block 2 of relation 16391/16394/107937
<2007-06-13 03:20:40 DFT%>LOG: could not fsync segment 0 of relation 16392/16394/107925: A system call received an interrupt.
<2007-06-13 03:20:40 DFT%>ERROR: storage sync failed on magnetic disk: A system call received an interrupt.

This is the first time we've encountered this error. Can it be a cause of the vacuum freeze?

Regards,

--
Loic Restoux
Capgemini Telecom & Media / ITDR
tel : 02 99 27 82 30
e-mail : loic.restoux@capgemini.com

This message contains information that may be privileged or confidential and is the property of the Capgemini Group. It is intended only for the person to whom it is addressed. If you are not the intended recipient, you are not authorized to read, print, retain, copy, disseminate, distribute, or use this message or any part thereof. If you receive this message in error, please notify the sender immediately and delete all copies of this message.
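Regarding question 1, one quick way to check whether the cron-driven table vacuums actually overlap the nightly run, before patching the shell scripts, is to query pg_stat_activity from a separate session. This is only a sketch against the 8.1 catalogs (where the columns are procpid and current_query; stats_command_string must be on for current_query to be populated):

```sql
-- List every other backend currently running a VACUUM
-- (requires stats_command_string = on in postgresql.conf).
SELECT procpid, usename, query_start, current_query
FROM pg_stat_activity
WHERE current_query ILIKE '%vacuum%'
  AND procpid <> pg_backend_pid();
```

If this regularly returns rows while the nightly vacuum is running, the per-table crontab jobs are indeed overlapping it.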
RESTOUX, Loïc <loic.restoux@capgemini.com> writes:
> Since some weeks, the amount of data hosted by the database grows, and, some nights, the database vacuum seems to "freeze" during his execution. In verbose mode, the logs show that the vacuum clean up a table (not always the same table), and... no more news. The system shows a vacuum process, which seems to be sleeping (no CPU used, no memory consumption...).

Have you looked into pg_locks to see if it's blocked on someone else's lock?

			regards, tom lane
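Tom's suggestion can be turned into a query. A sketch against the 8.1 catalogs (pg_locks.pid, pg_stat_activity.procpid), pairing each ungranted lock request with the sessions already holding a granted lock on the same relation:

```sql
-- Run from a separate session while the vacuum appears hung.
SELECT w.pid                  AS waiting_pid,
       w.relation::regclass   AS locked_table,
       w.mode                 AS wanted_mode,
       h.pid                  AS holding_pid,
       h.mode                 AS held_mode,
       a.current_query
FROM pg_locks w
JOIN pg_locks h
  ON h.relation = w.relation
 AND h.database = w.database
 AND h.granted
JOIN pg_stat_activity a ON a.procpid = h.pid
WHERE NOT w.granted;
```

An empty result while the vacuum process sits idle would point away from lock waits and toward an I/O-level hang (e.g. the NFS interrupt errors shown in the logs).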
On Wed, 2007-06-13 at 18:33 +0200, RESTOUX, Loïc wrote:
> 2) I believed that the poor performance during the vacuum freeze was due to obsolete data statistics. But after a full restart of the database, performance is good. Does PostgreSQL rebuild its statistics during startup?

You probably don't need to run VACUUM FREEZE. VACUUM FREEZE will thrash the disks much more than a normal VACUUM. We're improving that somewhat in 8.3, but the basic issue is that VACUUM FREEZE cleans out more dead rows and so will dirty more data blocks.

Are you concurrently running DDL, TRUNCATE or CLUSTER? That will interfere with the operation of VACUUM.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
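One way to answer Simon's question is to watch for granted AccessExclusiveLock entries, the lock level that DDL, TRUNCATE and CLUSTER take. Again, a sketch for the 8.1 catalogs:

```sql
-- Sessions holding AccessExclusiveLock; any of these would
-- block a concurrent VACUUM of the same table.
SELECT l.pid,
       l.relation::regclass AS locked_table,
       a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE l.mode = 'AccessExclusiveLock'
  AND l.granted;
```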
Hi Tom, thanks for your reply,

> Have you looked into pg_locks to see if it's blocked on
> someone else's lock?

Yes, we looked into pg_locks, and the vacuumdb process wasn't blocked. The table showed four locks for the vacuum, all with granted = true.

In fact, we found that a similar bug was fixed in 8.1.1:

> # Fix bgwriter problems after recovering from errors (Tom)
> The background writer was found to leak buffer pins after write errors.
> While not fatal in itself, this might lead to mysterious blockages of later VACUUM commands.

( http://www.postgresql.org/docs/8.1/static/release-8-1-1.html )

Can anyone confirm that the symptoms of this bug correspond to our problem? We saw some logs like:

<2007-06-11 12:44:04 DFT%>LOG: could not fsync segment 0 of relation 16391/16394/107912: A system call received an interrupt.
<2007-06-11 12:44:04 DFT%>ERROR: storage sync failed on magnetic disk: A system call received an interrupt.

Or:

<2007-06-16 12:25:45 DFT%>ERROR: could not open relation 16393/16394/107926: A system call received an interrupt.
<2007-06-16 12:25:45 DFT%>CONTEXT: writing block 3 of relation 16393/16394/107926

But we can't see a correlation between the fsync errors and the vacuum blockages. After an fsync error, sometimes the vacuum works fine, sometimes it hangs.

Is there any way to reproduce this bug manually, in order to confirm that it causes our problem and that it has been fixed for sure in 8.1.9? How can I find the patch for this bug in the source code?

Regards,

--
Loic Restoux
Capgemini Telecom & Media / ITDR
tel : 02 99 27 82 30
e-mail : loic.restoux@capgemini.fr