Обсуждение: PostgreSQL Server Crash using plPHP or PL/Perl

Поиск
Список
Период
Сортировка

PostgreSQL Server Crash using plPHP or PL/Perl

От
"Carl M. Nasal II"
Дата:
We are writing a multi-master replication process for our Electronic
Medical Records product.  We have written triggers in plPHP and then in
PL/Perl to keep an audit trail of the changes as well as flags so the data
can be replicated.  We started with plPHP, but then server started
crashing, which reset all connections to the database (requiring our
application to be restarted).  We then tried to rewrite the code using
PL/Perl, but the same problem has occurred.  The code for the triggers are
available at:

http://medical.bmaenterprises.com/audit.plphp
http://medical.bmaenterprises.com/audit.plperl

We create the triggers by running the follow SQL statement for each table:

CREATE TRIGGER config_audit AFTER INSERT OR UPDATE OR DELETE ON config FOR
EACH ROW EXECUTE PROCEDURE audit();

Any ideas of what is causing the server to crash will be helpful.

Below are the lines from the PostgreSQL serverlog file when the crash occurs:
----------------------------------------------------------------------------
LOG:  server process (PID 29153) exited with exit code 255
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2006-07-11 16:01:32 EDT
LOG:  checkpoint record is at 1/F413F26C
LOG:  redo record is at 1/F413F26C; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 7628670; next OID: 693120
LOG:  next MultiXactId: 1; next MultiXactOffset: 0
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  redo starts at 1/F413F2B0
LOG:  record with zero length at 1/F4186D3C
LOG:  redo done at 1/F4186D14
LOG:  database system is ready
LOG:  transaction ID wrap limit is 2147484146, limited by database "postgres"


Thank you,
Carl M. Nasal II
BMA Enterprises, Inc.

Re: PostgreSQL Server Crash using plPHP or PL/Perl

От
Agent M
Дата:
Are you certain that it is the trigger that is crashing the process? If
that is true, then there may be a bug in plperl.

To debug, you could use gdb, but try this first:
Use the strict pragma. To do this in plperl (instead of plperlu), use:
    BEGIN { strict->import(); }
    or set strict mode to on in postgresql.conf [I don't understand why
this isn't the default.]
    You will need to declare all your variables using my $var. [You are
already half-way there because you declare a lot of empty strings.]

You already pepper your code with elog(NOTICE,"") so you can tell us
how far the code gets right? You can use more elogs to hone in on the
line that crashes.

Are you aware that your code will be very costly to execute?

On Jul 11, 2006, at 5:43 PM, Carl M. Nasal II wrote:

> We are writing a multi-master replication process for our Electronic
> Medical Records product.  We have written triggers in plPHP and then
> in PL/Perl to keep an audit trail of the changes as well as flags so
> the data can be replicated.  We started with plPHP, but then server
> started crashing, which reset all connections to the database
> (requiring our application to be restarted).  We then tried to rewrite
> the code using PL/Perl, but the same problem has occurred.  The code
> for the triggers are available at:
>
> http://medical.bmaenterprises.com/audit.plphp
> http://medical.bmaenterprises.com/audit.plperl
>
> We create the triggers by running the follow SQL statement for each
> table:
>
> CREATE TRIGGER config_audit AFTER INSERT OR UPDATE OR DELETE ON config
> FOR EACH ROW EXECUTE PROCEDURE audit();
>
> Any ideas of what is causing the server to crash will be helpful.
>
> Below are the lines from the PostgreSQL serverlog file when the crash
> occurs:
> -----------------------------------------------------------------------
> -----
> LOG:  server process (PID 29153) exited with exit code 255
> LOG:  terminating any other active server processes
> WARNING:  terminating connection because of crash of another server
> process
> DETAIL:  The postmaster has commanded this server process to roll back
> the current transaction and exit, because another server process
> exited
> HINT:  In a moment you should be able to reconnect to the database and
> repeat your command.
> LOG:  all server processes terminated; reinitializing
> LOG:  database system was interrupted at 2006-07-11 16:01:32 EDT
> LOG:  checkpoint record is at 1/F413F26C
> LOG:  redo record is at 1/F413F26C; undo record is at 0/0; shutdown
> FALSE
> LOG:  next transaction ID: 7628670; next OID: 693120
> LOG:  next MultiXactId: 1; next MultiXactOffset: 0
> LOG:  database system was not properly shut down; automatic recovery
> in progress
> LOG:  redo starts at 1/F413F2B0
> LOG:  record with zero length at 1/F4186D3C
> LOG:  redo done at 1/F4186D14
> LOG:  database system is ready
> LOG:  transaction ID wrap limit is 2147484146, limited by database
> "postgres"
>
>
> Thank you,
> Carl M. Nasal II
> BMA Enterprises, Inc.
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
AgentM
agentm@themactionfaction.com
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬


Re: PostgreSQL Server Crash using plPHP or PL/Perl

От
"Ian Barwick"
Дата:
2006/7/11, Carl M. Nasal II <cnasal@bmaenterprises.com>:
(...)
> Any ideas of what is causing the server to crash will be helpful.
>
> Below are the lines from the PostgreSQL serverlog file when the crash occurs:
> ----------------------------------------------------------------------------
> LOG:  server process (PID 29153) exited with exit code 255
> LOG:  terminating any other active server processes
> WARNING:  terminating connection because of crash of another server process
(...)

It would be helpful if you provided the PostgreSQL version you're
using as well as details of the OS and possibly the hardware.

I've seen this kind of error in connection with hardware errors
(typically bad RAM or severe hard disk errors). Have you attempted
replicating this problem on another system to confirm / exclude this
as a possible cause?

Ian Barwick

--
http://sql-info.de/index.html

Re: PostgreSQL Server Crash using plPHP or PL/Perl

От
"Carl M. Nasal II"
Дата:
Thanks for the tips.  I will try to use the strict pragma as you suggested
and see if it helps.

The code runs fine most of the time.  The issue is as time elapses, the
server ends up crashing.  Generally, it will crash within a day of use
(with only 3 clients hitting it).  The reason we believe it is the triggers
is that once we remove the triggers, the server works fine for weeks and
months with no problems.  Once we re-enable the triggers, the server
crashes within 1 day.

Yes, we have seen quickly the code is very costly to execute, especially on
queries that update many rows.  We have this home-grown solution because we
need the ability to have two sites have their own copy of the data to
improve performance.  We tried to access data over T1 lines, but it was too
slow, and we would prefer not to be out of service if the T1 is down.

Thanks for your quick response and assistance.


Thank you,
Carl M. Nasal II
BMA Enterprises, Inc.


Agent M wrote:
> Are you certain that it is the trigger that is crashing the process? If
> that is true, then there may be a bug in plperl.
>
> To debug, you could use gdb, but try this first:
> Use the strict pragma. To do this in plperl (instead of plperlu), use:
>     BEGIN { strict->import(); }
>     or set strict mode to on in postgresql.conf [I don't understand why
> this isn't the default.]
>     You will need to declare all your variables using my $var. [You are
> already half-way there because you declare a lot of empty strings.]
>
> You already pepper your code with elog(NOTICE,"") so you can tell us how
> far the code gets right? You can use more elogs to hone in on the line
> that crashes.
>
> Are you aware that your code will be very costly to execute?
>
> On Jul 11, 2006, at 5:43 PM, Carl M. Nasal II wrote:
>
>> We are writing a multi-master replication process for our Electronic
>> Medical Records product.  We have written triggers in plPHP and then
>> in PL/Perl to keep an audit trail of the changes as well as flags so
>> the data can be replicated.  We started with plPHP, but then server
>> started crashing, which reset all connections to the database
>> (requiring our application to be restarted).  We then tried to rewrite
>> the code using PL/Perl, but the same problem has occurred.  The code
>> for the triggers are available at:
>>
>> http://medical.bmaenterprises.com/audit.plphp
>> http://medical.bmaenterprises.com/audit.plperl
>>
>> We create the triggers by running the follow SQL statement for each
>> table:
>>
>> CREATE TRIGGER config_audit AFTER INSERT OR UPDATE OR DELETE ON config
>> FOR EACH ROW EXECUTE PROCEDURE audit();
>>
>> Any ideas of what is causing the server to crash will be helpful.
>>
>> Below are the lines from the PostgreSQL serverlog file when the crash
>> occurs:
>> ----------------------------------------------------------------------------
>>
>> LOG:  server process (PID 29153) exited with exit code 255
>> LOG:  terminating any other active server processes
>> WARNING:  terminating connection because of crash of another server
>> process
>> DETAIL:  The postmaster has commanded this server process to roll back
>> the current transaction and exit, because another server process exited
>> HINT:  In a moment you should be able to reconnect to the database and
>> repeat your command.
>> LOG:  all server processes terminated; reinitializing
>> LOG:  database system was interrupted at 2006-07-11 16:01:32 EDT
>> LOG:  checkpoint record is at 1/F413F26C
>> LOG:  redo record is at 1/F413F26C; undo record is at 0/0; shutdown FALSE
>> LOG:  next transaction ID: 7628670; next OID: 693120
>> LOG:  next MultiXactId: 1; next MultiXactOffset: 0
>> LOG:  database system was not properly shut down; automatic recovery
>> in progress
>> LOG:  redo starts at 1/F413F2B0
>> LOG:  record with zero length at 1/F4186D3C
>> LOG:  redo done at 1/F4186D14
>> LOG:  database system is ready
>> LOG:  transaction ID wrap limit is 2147484146, limited by database
>> "postgres"
>>
>>
>> Thank you,
>> Carl M. Nasal II
>> BMA Enterprises, Inc.
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>       subscribe-nomail command to majordomo@postgresql.org so that your
>>       message can get through to the mailing list cleanly
>>
> ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
> AgentM
> agentm@themactionfaction.com
> ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

Re: PostgreSQL Server Crash using plPHP or PL/Perl

От
"Carl M. Nasal II"
Дата:
I apologize for not including the server information in the last e-mail.  I
knew I would forget something.  :)

PostgreSQL 8.1.4 (compiled from source)
Debian GNU/Linux 3.1
Intel(R) Pentium(R) D CPU 3.00GHz
1GB RAM

We have not be able to test the triggers anywhere else that has any load on
it.  If we run it on a test set of data, it works fine because it is not
being hit by that many users.  However, if we disable the triggers,
everything on the live server appears to work fine and PostgreSQL never dies.

Also, we are using the PostgreSQL ODBC driver from a Windows application we
have developed.  It seems that if the user tries to access the server while
it restarts after this error, they get an error.  However, if the user is
not doing anything when the server restarts and then tries to access data
later, it appears to automatically do the reconnect and everything is fine.

Thanks again for your assistance.


Thank you,
Carl M. Nasal II
BMA Enterprises, Inc.



Ian Barwick wrote:
> 2006/7/11, Carl M. Nasal II <cnasal@bmaenterprises.com>:
> (...)
>> Any ideas of what is causing the server to crash will be helpful.
>>
>> Below are the lines from the PostgreSQL serverlog file when the crash
>> occurs:
>> ----------------------------------------------------------------------------
>>
>> LOG:  server process (PID 29153) exited with exit code 255
>> LOG:  terminating any other active server processes
>> WARNING:  terminating connection because of crash of another server
>> process
> (...)
>
> It would be helpful if you provided the PostgreSQL version you're
> using as well as details of the OS and possibly the hardware.
>
> I've seen this kind of error in connection with hardware errors
> (typically bad RAM or severe hard disk errors). Have you attempted
> replicating this problem on another system to confirm / exclude this
> as a possible cause?
>
> Ian Barwick
>

Re: PostgreSQL Server Crash using plPHP or PL/Perl

От
Scott Marlowe
Дата:
On Wed, 2006-07-12 at 07:32, Carl M. Nasal II wrote:
> I apologize for not including the server information in the last e-mail.  I
> knew I would forget something.  :)
>
> PostgreSQL 8.1.4 (compiled from source)
> Debian GNU/Linux 3.1
> Intel(R) Pentium(R) D CPU 3.00GHz
> 1GB RAM
>
> We have not be able to test the triggers anywhere else that has any load on
> it.  If we run it on a test set of data, it works fine because it is not
> being hit by that many users.  However, if we disable the triggers,
> everything on the live server appears to work fine and PostgreSQL never dies.
>
> Also, we are using the PostgreSQL ODBC driver from a Windows application we
> have developed.  It seems that if the user tries to access the server while
> it restarts after this error, they get an error.  However, if the user is
> not doing anything when the server restarts and then tries to access data
> later, it appears to automatically do the reconnect and everything is fine.

a poor man's way to apply lots of parallel load is to build a few web
pages that can individually apply a bit of load, and use apache's ab
program to beat on the web server and, by extension, the database.