Thread: PANIC during VACUUM

PANIC during VACUUM

From: German Becker

Date: April 29, 2013, 19:59:02

Hi,

I am testing version 9.1.9 before putting it in production. One of my tests involved deleting the contents of a big table (~13 GB in size) and then VACUUMing it. During the VACUUM, the server PANICs. Here is the message:

PANIC: corrupted item pointer: offset = 8128, size = 80

I hit the error a couple of times, always during VACUUM after deleting the context of the same big table (after re-populating it, of course).

The error message is always *exactly* the same, i.e. the same offset and size.

When this happens the backend gets restarted and if I issue the same VACUUM command, I get the same error.

I also tried triggering the backup server (hot standby with streaming replication) and running the VACUUM there (to see whether it might be a hardware problem on the primary), and got the same issue.

What might be causing this? Should I report it as a bug? Thanks!

Germán
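
A note on the numbers in the PANIC message: the log excerpt later in this thread places the error in PageIndexMultiDelete (bufpage.c), which rejects any item whose extent does not fit inside the page. The sketch below restates that kind of bounds check in Python. It is not the server's code. It assumes the default 8192-byte block size and 8-byte alignment, and the function name and the pd_upper placeholder are hypothetical, since the real page header values are not given in the thread.

```python
# Sketch of an item-pointer bounds check, under stated assumptions:
# default 8192-byte block size and 8-byte MAXALIGN. pd_special can never
# exceed the block size, so BLCKSZ is used as the most permissive bound.
BLCKSZ = 8192
MAXALIGN = 8


def item_pointer_problems(offset, size, pd_upper, pd_special=BLCKSZ):
    """Return the reasons an item pointer would be considered corrupted."""
    problems = []
    if offset < pd_upper:
        problems.append("item starts before pd_upper")
    if offset + size > pd_special:
        problems.append("item ends past pd_special")
    if offset % MAXALIGN != 0:
        problems.append("offset is not MAXALIGNed")
    return problems


# Values from the PANIC message. pd_upper is unknown here, so 0 is only a
# placeholder that keeps the first check from firing.
problems = item_pointer_problems(offset=8128, size=80, pd_upper=0)
print(problems)  # ['item ends past pd_special']
```

Under these assumptions, 8128 + 80 = 8208, which is 16 bytes past the end of an 8192-byte page, so the item cannot be valid whatever the page type. That points at the page contents being wrong on disk or in WAL, but says nothing about how they got that way.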

Re: PANIC during VACUUM

From: Albe Laurenz

Date: April 30, 2013, 07:34:49

German Becker wrote:
> I am testing version 9.1.9 before putting it in production. One of my tests involved deleting the
> contents of a big table (~13 GB in size) and then VACUUMing it. During the VACUUM, the server PANICs.
> Here is the message:
>
> PANIC:  corrupted item pointer: offset = 8128, size = 80
>
>
> I hit the error a couple of times, always during VACUUM after deleting the context of the same big
> table (after re-populating it, of course).
>
> The error message is always *exactly* the same, i.e. the same offset and size.
>
> When this happens the backend gets restarted and if I issue the same VACUUM command, I get the same
> error.
>
> I also tried triggering the backup server (hot standby with streaming replication) and running the
> VACUUM there (to see whether it might be a hardware problem on the primary), and got the same issue.
>
> What might be causing this? Should I report it as a bug? Thanks!

If you mess with the database files, errors like this are to be expected.
The PANIC and restart happen because the error occurred during a sensitive phase.

This is not a bug.

Yours,
Laurenz Albe

Re: PANIC during VACUUM

From: German Becker

Date: April 30, 2013, 11:36:47

Thanks for your reply. In which sense did I mess with the database files?

Re: PANIC during VACUUM

From: German Becker

Date: April 30, 2013, 11:46:30

Just in case there are some errors in my first email: where it says "after deleting the context of the same big table", it should say "after deleting the contents of the same big table". In essence, what I did is:

DELETE FROM table;

VACUUM table;

And I got the error.

Re: PANIC during VACUUM

From: Kevin Grittner

Date: April 30, 2013, 11:51:31

[please don't top-post]

German Becker <german.becker@gmail.com> wrote:
> Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
>> German Becker wrote:

>>> I am testing version 9.1.9 before putting it in production. One
>>> of my tests involved deleting a the contents of a big table ( ~
>>> 13 GB size) and then VACUUMing it. During VACUUM PANICS.

>> If you mess with the database files, errors like this are to be
>> expected.

> Thanks for your reply. In which sense did I mess with the
> database files?

You didn't say how you deleted the contents of that big table, and
it appears that Albe assumed you deleted or truncated the
underlying disk file rather than using the DELETE or TRUNCATE SQL
statement.

In any event, more details would help people come up with ideas on
what might be wrong.

http://wiki.postgresql.org/wiki/Guide_to_reporting_problems

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: PANIC during VACUUM

From: Albe Laurenz

Date: April 30, 2013, 12:08:29

German Becker wrote:
> Just in case there are some errors in my first email: where it says "after deleting the context of the
> same big table", it should say "after deleting the contents of the same big table". In essence, what I
> did is:
>
> DELETE FROM table;
> VACUUM table;
>
> And I got the error.

>> I am testing version 9.1.9 before putting it in production. One of my tests involved deleting the
>> contents of a big table (~13 GB in size) and then VACUUMing it. During the VACUUM, the server PANICs.
>> Here is the message:
>>
>> PANIC:  corrupted item pointer: offset = 8128, size = 80

Sorry for misunderstanding you.
A DELETE should definitely not cause such an error.

Can you provide a reproducible test case?
Is that a new database or could it be some prior corruption?

Yours,
Laurenz Albe

Re: PANIC during VACUUM

From: German Becker

Date: April 30, 2013, 12:26:18

OK, I apologise for the lack of clarity in my first message. Let me summarize the steps that led me to the error.

I have 2 servers running Ubuntu 12.04 on which I am testing Postgres 9.1.9. I set up streaming replication between them (no synchronous replication).

Both servers have 4 SATA hard drives with ext3 file systems, set up as follows:

sda --> / (main OS and the database files, except for the ones defined below)

sdb --> pg_xlog directory

sdc --> one tablespace where heavy-transaction tables are stored

sdd --> another tablespace where big historic tables are stored

Archiving mode is on and the archive location is on sda (and from there to the hot-standby server).

For testing, I populate the database with the data currently in production (currently on Postgres 8.3).

Then I run several load tests, etc.

To tune and improve the archiving process I needed to generate a large amount of WAL. To do so I just deleted the contents of one big table and then VACUUMed it, like this:

DELETE FROM bigtable;

VACUUM bigtable;

And I got the error reported above.

I repeated the whole process (creating a new cluster, populating it with data (always the same data), setting up replication) a couple of times after that and hit the error again about 90% of the time. I tried deleting a big portion of the table and the error did not appear; it only appears after deleting ALL rows. Also, in some cases I didn't run the VACUUM command manually, and the error occurred during autovacuum.

My last test, in case there was a hardware problem on the primary, was to trigger the standby server and try the VACUUM there, with the same results.

Here is a chunk of the log:

2013-04-29 17:02:21 ART [12024]: [32-1] PANIC: XX001: corrupted item pointer: offset = 8128, size = 80
2013-04-29 17:02:21 ART [12024]: [33-1] LOCATION: PageIndexMultiDelete, bufpage.c:779
2013-04-29 17:02:21 ART [12024]: [34-1] STATEMENT: VACUUM callshopcdrs ;
2013-04-29 17:02:21 ART [23787]: [8-1] LOG: server process (PID 12024) was terminated by signal 6: Aborte
2013-04-29 17:02:21 ART [23787]: [9-1] LOG: terminating any other active server processes
2013-04-29 17:02:21 ART [7300]: [2-1] WARNING: terminating connection because of crash of another server process
2013-04-29 17:02:21 ART [7300]: [3-1] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2013-04-29 17:02:21 ART [7300]: [4-1] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2013-04-29 17:02:21 ART [30304]: [1-1] FATAL: the database system is in recovery mode
2013-04-29 17:02:21 ART [23787]: [10-1] LOG: archiver process (PID 7301) exited with exit code 1
2013-04-29 17:02:21 ART [23787]: [11-1] LOG: all server processes terminated; reinitializing
2013-04-29 17:02:21 ART [30305]: [1-1] LOG: database system was interrupted; last known up at 2013-04-29 16:59:01 ART
2013-04-29 17:02:21 ART [30305]: [2-1] LOG: database system was not properly shut down; automatic recovery in progress
2013-04-29 17:02:21 ART [30305]: [3-1] LOG: redo starts at 11/497D4338
2013-04-29 17:02:21 ART [30305]: [4-1] LOG: invalid magic number 0000 in log file 17, segment 73, offset 8216576
2013-04-29 17:02:21 ART [30305]: [5-1] LOG: redo done at 11/497D4440
2013-04-29 17:02:22 ART [30308]: [1-1] LOG: autovacuum launcher started
2013-04-29 17:02:22 ART [23787]: [12-1] LOG: database system is ready to accept connections
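
A note on reading the recovery lines above: the "invalid magic number" message gives a log file, segment and offset, while "redo done at" gives a hexadecimal WAL position, so it is not obvious how close the two are. The following sketch converts between them. It assumes the 9.1-era WAL layout with the default 16 MB segments and 8 KB WAL pages; the helper names are made up for illustration.

```python
# Sketch under stated assumptions: PostgreSQL 9.1 WAL addressing with the
# default 16 MB segment size and 8 KB WAL page size.
XLOG_SEG_SIZE = 16 * 1024 * 1024
XLOG_BLCKSZ = 8192


def parse_lsn(text):
    """Split 'logid/recoff' (both hex) into (log file, segment, offset in segment)."""
    logid_hex, recoff_hex = text.split("/")
    recoff = int(recoff_hex, 16)
    return int(logid_hex, 16), recoff // XLOG_SEG_SIZE, recoff % XLOG_SEG_SIZE


def next_page_start(offset):
    """Offset of the first WAL page boundary after the given offset."""
    return (offset // XLOG_BLCKSZ + 1) * XLOG_BLCKSZ


# Position taken from the "redo done at 11/497D4440" log line.
log_file, segment, offset = parse_lsn("11/497D4440")
print(log_file, segment, offset)  # 17 73 8209472
print(next_page_start(offset))    # 8216576
```

Under these assumptions the invalid magic number sits at the start of the WAL page immediately after the last replayed record, in the same log file and segment. That is what one would expect if recovery simply ran into not-yet-written WAL, so this line probably marks the normal end of crash recovery and not a second corruption.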

A core file was generated; it is 7 GB in size:

$ file core

core: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'postgres: postgres tvoip3 [local] VACUUM'

Many thanks for your help, and let me know if any extra information would be useful.

German
