Обсуждение: PostgreSQL Server Crash using plPHP or PL/Perl
We are writing a multi-master replication process for our Electronic Medical Records product. We have written triggers in plPHP and then in PL/Perl to keep an audit trail of the changes as well as flags so the data can be replicated. We started with plPHP, but then server started crashing, which reset all connections to the database (requiring our application to be restarted). We then tried to rewrite the code using PL/Perl, but the same problem has occurred. The code for the triggers are available at: http://medical.bmaenterprises.com/audit.plphp http://medical.bmaenterprises.com/audit.plperl We create the triggers by running the follow SQL statement for each table: CREATE TRIGGER config_audit AFTER INSERT OR UPDATE OR DELETE ON config FOR EACH ROW EXECUTE PROCEDURE audit(); Any ideas of what is causing the server to crash will be helpful. Below are the lines from the PostgreSQL serverlog file when the crash occurs: ---------------------------------------------------------------------------- LOG: server process (PID 29153) exited with exit code 255 LOG: terminating any other active server processes WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited HINT: In a moment you should be able to reconnect to the database and repeat your command. LOG: all server processes terminated; reinitializing LOG: database system was interrupted at 2006-07-11 16:01:32 EDT LOG: checkpoint record is at 1/F413F26C LOG: redo record is at 1/F413F26C; undo record is at 0/0; shutdown FALSE LOG: next transaction ID: 7628670; next OID: 693120 LOG: next MultiXactId: 1; next MultiXactOffset: 0 LOG: database system was not properly shut down; automatic recovery in progress LOG: redo starts at 1/F413F2B0 LOG: record with zero length at 1/F4186D3C LOG: redo done at 1/F4186D14 LOG: database system is ready LOG: transaction ID wrap limit is 2147484146, limited by database "postgres" Thank you, Carl M. Nasal II BMA Enterprises, Inc.
Are you certain that it is the trigger that is crashing the process? If that is true, then there may be a bug in plperl. To debug, you could use gdb, but try this first: Use the strict pragma. To do this in plperl (instead of plperlu), use: BEGIN { strict->import(); } or set strict mode to on in postgresql.conf [I don't understand why this isn't the default.] You will need to declare all your variables using my $var. [You are already half-way there because you declare a lot of empty strings.] You already pepper your code with elog(NOTICE,"") so you can tell us how far the code gets right? You can use more elogs to hone in on the line that crashes. Are you aware that your code will be very costly to execute? On Jul 11, 2006, at 5:43 PM, Carl M. Nasal II wrote: > We are writing a multi-master replication process for our Electronic > Medical Records product. We have written triggers in plPHP and then > in PL/Perl to keep an audit trail of the changes as well as flags so > the data can be replicated. We started with plPHP, but then server > started crashing, which reset all connections to the database > (requiring our application to be restarted). We then tried to rewrite > the code using PL/Perl, but the same problem has occurred. The code > for the triggers are available at: > > http://medical.bmaenterprises.com/audit.plphp > http://medical.bmaenterprises.com/audit.plperl > > We create the triggers by running the follow SQL statement for each > table: > > CREATE TRIGGER config_audit AFTER INSERT OR UPDATE OR DELETE ON config > FOR EACH ROW EXECUTE PROCEDURE audit(); > > Any ideas of what is causing the server to crash will be helpful. > > Below are the lines from the PostgreSQL serverlog file when the crash > occurs: > ----------------------------------------------------------------------- > ----- > LOG: server process (PID 29153) exited with exit code 255 > LOG: terminating any other active server processes > WARNING: terminating connection because of crash of another server > process > DETAIL: The postmaster has commanded this server process to roll back > the current transaction and exit, because another server process > exited > HINT: In a moment you should be able to reconnect to the database and > repeat your command. > LOG: all server processes terminated; reinitializing > LOG: database system was interrupted at 2006-07-11 16:01:32 EDT > LOG: checkpoint record is at 1/F413F26C > LOG: redo record is at 1/F413F26C; undo record is at 0/0; shutdown > FALSE > LOG: next transaction ID: 7628670; next OID: 693120 > LOG: next MultiXactId: 1; next MultiXactOffset: 0 > LOG: database system was not properly shut down; automatic recovery > in progress > LOG: redo starts at 1/F413F2B0 > LOG: record with zero length at 1/F4186D3C > LOG: redo done at 1/F4186D14 > LOG: database system is ready > LOG: transaction ID wrap limit is 2147484146, limited by database > "postgres" > > > Thank you, > Carl M. Nasal II > BMA Enterprises, Inc. > > ---------------------------(end of > broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ AgentM agentm@themactionfaction.com ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
2006/7/11, Carl M. Nasal II <cnasal@bmaenterprises.com>: (...) > Any ideas of what is causing the server to crash will be helpful. > > Below are the lines from the PostgreSQL serverlog file when the crash occurs: > ---------------------------------------------------------------------------- > LOG: server process (PID 29153) exited with exit code 255 > LOG: terminating any other active server processes > WARNING: terminating connection because of crash of another server process (...) It would be helpful if you provided the PostgreSQL version you're using as well as details of the OS and possibly the hardware. I've seen this kind of error in connection with hardware errors (typically bad RAM or severe hard disk errors). Have you attempted replicating this problem on another system to confirm / exclude this as a possible cause? Ian Barwick -- http://sql-info.de/index.html
Thanks for the tips. I will try to use the strict pragma as you suggested and see if it helps. The code runs fine most of the time. The issue is as time elapses, the server ends up crashing. Generally, it will crash within a day of use (with only 3 clients hitting it). The reason we believe it is the triggers is that once we remove the triggers, the server works fine for weeks and months with no problems. Once we re-enable the triggers, the server crashes within 1 day. Yes, we have seen quickly the code is very costly to execute, especially on queries that update many rows. We have this home-grown solution because we need the ability to have two sites have their own copy of the data to improve performance. We tried to access data over T1 lines, but it was too slow, and we would prefer not to be out of service if the T1 is down. Thanks for your quick response and assistance. Thank you, Carl M. Nasal II BMA Enterprises, Inc. Agent M wrote: > Are you certain that it is the trigger that is crashing the process? If > that is true, then there may be a bug in plperl. > > To debug, you could use gdb, but try this first: > Use the strict pragma. To do this in plperl (instead of plperlu), use: > BEGIN { strict->import(); } > or set strict mode to on in postgresql.conf [I don't understand why > this isn't the default.] > You will need to declare all your variables using my $var. [You are > already half-way there because you declare a lot of empty strings.] > > You already pepper your code with elog(NOTICE,"") so you can tell us how > far the code gets right? You can use more elogs to hone in on the line > that crashes. > > Are you aware that your code will be very costly to execute? > > On Jul 11, 2006, at 5:43 PM, Carl M. Nasal II wrote: > >> We are writing a multi-master replication process for our Electronic >> Medical Records product. We have written triggers in plPHP and then >> in PL/Perl to keep an audit trail of the changes as well as flags so >> the data can be replicated. We started with plPHP, but then server >> started crashing, which reset all connections to the database >> (requiring our application to be restarted). We then tried to rewrite >> the code using PL/Perl, but the same problem has occurred. The code >> for the triggers are available at: >> >> http://medical.bmaenterprises.com/audit.plphp >> http://medical.bmaenterprises.com/audit.plperl >> >> We create the triggers by running the follow SQL statement for each >> table: >> >> CREATE TRIGGER config_audit AFTER INSERT OR UPDATE OR DELETE ON config >> FOR EACH ROW EXECUTE PROCEDURE audit(); >> >> Any ideas of what is causing the server to crash will be helpful. >> >> Below are the lines from the PostgreSQL serverlog file when the crash >> occurs: >> ---------------------------------------------------------------------------- >> >> LOG: server process (PID 29153) exited with exit code 255 >> LOG: terminating any other active server processes >> WARNING: terminating connection because of crash of another server >> process >> DETAIL: The postmaster has commanded this server process to roll back >> the current transaction and exit, because another server process exited >> HINT: In a moment you should be able to reconnect to the database and >> repeat your command. >> LOG: all server processes terminated; reinitializing >> LOG: database system was interrupted at 2006-07-11 16:01:32 EDT >> LOG: checkpoint record is at 1/F413F26C >> LOG: redo record is at 1/F413F26C; undo record is at 0/0; shutdown FALSE >> LOG: next transaction ID: 7628670; next OID: 693120 >> LOG: next MultiXactId: 1; next MultiXactOffset: 0 >> LOG: database system was not properly shut down; automatic recovery >> in progress >> LOG: redo starts at 1/F413F2B0 >> LOG: record with zero length at 1/F4186D3C >> LOG: redo done at 1/F4186D14 >> LOG: database system is ready >> LOG: transaction ID wrap limit is 2147484146, limited by database >> "postgres" >> >> >> Thank you, >> Carl M. Nasal II >> BMA Enterprises, Inc. >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 1: if posting/reading through Usenet, please send an appropriate >> subscribe-nomail command to majordomo@postgresql.org so that your >> message can get through to the mailing list cleanly >> > ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ > AgentM > agentm@themactionfaction.com > ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster >
I apologize for not including the server information in the last e-mail. I knew I would forget something. :) PostgreSQL 8.1.4 (compiled from source) Debian GNU/Linux 3.1 Intel(R) Pentium(R) D CPU 3.00GHz 1GB RAM We have not be able to test the triggers anywhere else that has any load on it. If we run it on a test set of data, it works fine because it is not being hit by that many users. However, if we disable the triggers, everything on the live server appears to work fine and PostgreSQL never dies. Also, we are using the PostgreSQL ODBC driver from a Windows application we have developed. It seems that if the user tries to access the server while it restarts after this error, they get an error. However, if the user is not doing anything when the server restarts and then tries to access data later, it appears to automatically do the reconnect and everything is fine. Thanks again for your assistance. Thank you, Carl M. Nasal II BMA Enterprises, Inc. Ian Barwick wrote: > 2006/7/11, Carl M. Nasal II <cnasal@bmaenterprises.com>: > (...) >> Any ideas of what is causing the server to crash will be helpful. >> >> Below are the lines from the PostgreSQL serverlog file when the crash >> occurs: >> ---------------------------------------------------------------------------- >> >> LOG: server process (PID 29153) exited with exit code 255 >> LOG: terminating any other active server processes >> WARNING: terminating connection because of crash of another server >> process > (...) > > It would be helpful if you provided the PostgreSQL version you're > using as well as details of the OS and possibly the hardware. > > I've seen this kind of error in connection with hardware errors > (typically bad RAM or severe hard disk errors). Have you attempted > replicating this problem on another system to confirm / exclude this > as a possible cause? > > Ian Barwick >
On Wed, 2006-07-12 at 07:32, Carl M. Nasal II wrote: > I apologize for not including the server information in the last e-mail. I > knew I would forget something. :) > > PostgreSQL 8.1.4 (compiled from source) > Debian GNU/Linux 3.1 > Intel(R) Pentium(R) D CPU 3.00GHz > 1GB RAM > > We have not be able to test the triggers anywhere else that has any load on > it. If we run it on a test set of data, it works fine because it is not > being hit by that many users. However, if we disable the triggers, > everything on the live server appears to work fine and PostgreSQL never dies. > > Also, we are using the PostgreSQL ODBC driver from a Windows application we > have developed. It seems that if the user tries to access the server while > it restarts after this error, they get an error. However, if the user is > not doing anything when the server restarts and then tries to access data > later, it appears to automatically do the reconnect and everything is fine. a poor man's way to apply lots of parallel load is to build a few web pages that can individually apply a bit of load, and use apache's ab program to beat on the web server and, by extension, the database.