Обсуждение: Wierd panic with 7.4.7

Поиск
Список
Период
Сортировка

Wierd panic with 7.4.7

От
"Joshua D. Drake"
Дата:
Hello,

Any thoughts on the below? Specifically the PANIC? A customer
was performing a full vacuum when it happen. This is running 7.4.7
on ES 3.0. We run daily vacuums and analyzes as well.

2005-06-27 16:35:02 LOG:  recycled transaction log file "0000004D0000006F"
2005-06-27 16:35:02 LOG:  recycled transaction log file "0000004D00000070"
2005-06-27 16:35:02 LOG:  recycled transaction log file "0000004D00000071"
2005-06-27 16:35:02 LOG:  recycled transaction log file "0000004D00000072"
2005-06-27 16:35:02 LOG:  recycled transaction log file "0000004D00000073"
2005-06-27 16:35:02 LOG:  recycled transaction log file "0000004D00000074"
2005-06-27 16:35:02 LOG:  recycled transaction log file "0000004D00000075"
2005-06-27 16:35:02 LOG:  recycled transaction log file "0000004D00000076"
2005-06-27 16:36:23 LOG:  incomplete startup packet
2005-06-27 16:37:53 ERROR:  could not send data to client: Broken pipe
2005-06-27 16:37:53 PANIC:  cannot abort transaction 146017848, it was 
already committed
2005-06-27 16:37:53 LOG:  server process (PID 6931) was terminated by 
signal 6
2005-06-27 16:37:53 LOG:  terminating any other active server processes
2005-06-27 16:37:53 WARNING:  terminating connection because of crash of 
another server process
DETAIL:  The postmaster has commanded this server process to roll back 
the current transaction and exit, because another server process exited 
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and 
repeat your command.
2005-06-27 16:37:53 WARNING:  terminating connection because of crash of 
another server process
DETAIL:  The postmaster has commanded this server process to roll back 
the current transaction and exit, because another server process exited 
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and 
repeat your command.

Sincerely,

Joshua D. Drake



-- 
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/


Re: Wierd panic with 7.4.7

От
Tom Lane
Дата:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> 2005-06-27 16:37:53 ERROR:  could not send data to client: Broken pipe
> 2005-06-27 16:37:53 PANIC:  cannot abort transaction 146017848, it was 
> already committed

A reasonable guess as to what happened there is:

1. Client process dies just as server is committing a transaction on its  behalf.

2. For some reason, server tries to send a message to client while it's  doing post-commit cleanup (before it gets to
thepoint of resetting  its state to show that it's not in the transaction anymore).
 

3. Kernel rejects message, causing elog(ERROR), causing entry to  AbortTransaction, causing above panic.

There are a couple of big problems with this theory, though.  In the
first place, there aren't any messages sent to the client during
post-commit; unless possibly it's an error message due to a failure
during post-commit, and that should have shown up in the server log.
In the second place, we don't treat communication failures as ERRORs,
so how did step 3 happen?

Do you know what the dead client was doing?  Can you reproduce this?
        regards, tom lane


Re: Wierd panic with 7.4.7

От
Tom Lane
Дата:
"Joshua D. Drake" <jd@commandprompt.com> writes:
>> Do you know what the dead client was doing?

> Unfortunately I don't. We didn't have PID logging turned on so I can't 
> tell which process it was. The only thing I was told was,
> "I am running a Full Vacuum, CRAP the server just died ;)"

Hmm ... VACUUM FULL is a tad weird because it marks itself committed
before it does the last stage of the operation (ie truncating the
relation).  I suppose it is possible that an error during that last
stage would have this symptom.  There's still the question of why
the postmaster log doesn't show the hypothetical initial error, though.

(Also, it's likely that there's some defense against this scenario in
VACUUM FULL, otherwise we'd hear this kind of report more often, I
should think.  I don't have time to go looking right now though.)
        regards, tom lane


Re: Wierd panic with 7.4.7

От
"Joshua D. Drake"
Дата:
> 
> There are a couple of big problems with this theory, though.  In the
> first place, there aren't any messages sent to the client during
> post-commit; unless possibly it's an error message due to a failure
> during post-commit, and that should have shown up in the server log.
> In the second place, we don't treat communication failures as ERRORs,
> so how did step 3 happen?
> 
> Do you know what the dead client was doing?

Unfortunately I don't. We didn't have PID logging turned on so I can't 
tell which process it was. The only thing I was told was,

"I am running a Full Vacuum, CRAP the server just died ;)"

>  Can you reproduce this?

Let me see if the client has a dev database we can try to reproduce with.

Sincerely,

Joshua D. Drake


> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
> 
>                http://www.postgresql.org/docs/faq


-- 
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/


Re: Wierd panic with 7.4.7

От
Greg Stark
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> In the second place, we don't treat communication failures as ERRORs,
> so how did step 3 happen?

You probably realize this, but just in case: "Broken Pipe" probably means the
backend received SIGPIPE, not just that some file operation syscall returned
-1.

-- 
greg