Discussion: Ideas for easier debugging of backend problems
1. Move the test for strange memory alloc sizes to the palloc macros so
   that on error, it points at the palloc call rather than mcxt.c. Sure,
   it only attacks a small set of problems, but still.

2. Add either a GUC or a command line switch or PGOPTION switch to call
   setrlimit to set the core size to something bigger. Most places only
   soft limit the core size, not hard limit.

3. Add either a GUC or a command line switch or PGOPTION switch to
   automatically invoke and attach gdb on certain types of error.
   Obviously you can only do this where stdin, stdout and stderr have
   not been redirected.

However, for this to be useful we need to distinguish between "internal"
errors (like palloc) and user errors (like tablename not found). There
doesn't appear to be a way to distinguish them. Mind you, distinguishing
the errors would be useful anyway, say by adding err_internal() to the
ones that are not supposed to happen and can be fixed (i.e. not disk
errors). We could then add a standard message about "Please report this
as a bug".

Still, something like:

    /* needs <stdio.h> and <unistd.h> */
    if (isatty(0) && isatty(1) && isatty(2))
    {
        pid_t child = fork();

        if (child > 0)          /* Parent: wait for gdb to attach */
        {
            while (1)
                pause();
        }
        else if (child == 0)    /* Child: becomes gdb */
        {
            char exe[64], pid[32];

            snprintf(exe, sizeof(exe), "/proc/%d/exe", (int) getppid());
            snprintf(pid, sizeof(pid), "%d", (int) getppid());
            execlp("gdb", "gdb", exe, pid, (char *) NULL);
            _exit(1);           /* Oops, exec failed */
        }
        /* fork() failed: carry on with normal error handling */
    }

might be useful for getting bug reports out of people who can't work out
how to get corefiles...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
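The err_internal() idea in the message above could look roughly like the sketch below. It is plain C, not the backend's actual error-reporting code: the DemoErrorClass enum, demo_format_error() and demo_wants_debugger() are hypothetical names invented for the illustration. Only the internal/user distinction and the "Please report this as a bug" wording come from the message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification; not part of any real PostgreSQL API. */
typedef enum
{
    DEMO_ERR_USER,      /* e.g. table name not found */
    DEMO_ERR_INTERNAL   /* "can't happen" conditions, e.g. a bad alloc size */
} DemoErrorClass;

/*
 * Format an error message into buf. Internal errors get a standard
 * bug-report hint appended; user errors are passed through unchanged.
 * Returns what snprintf() returns.
 */
int
demo_format_error(char *buf, size_t buflen, DemoErrorClass cls, const char *msg)
{
    assert(buf != NULL && msg != NULL);

    if (cls == DEMO_ERR_INTERNAL)
        return snprintf(buf, buflen,
                        "ERROR: %s\nHINT: Please report this as a bug.", msg);
    return snprintf(buf, buflen, "ERROR: %s", msg);
}

/* Should this class of error trigger extra debugging (core dump, gdb)? */
int
demo_wants_debugger(DemoErrorClass cls)
{
    return cls == DEMO_ERR_INTERNAL;
}
```

With a flag like this available, the gdb-attach logic would only need to run when demo_wants_debugger() says so, which keeps ordinary user errors out of the way.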
Martijn van Oosterhout <kleptog@svana.org> writes:
> 1. Move the test for strange memory alloc sizes to the palloc macros so
> that on error, it points at the palloc call rather than mcxt.c.

What would that accomplish other than bloating the backend? We can't do
it anyway, because of double-evaluation risk.

> 2. Add either a GUC or a command line switch or PGOPTION switch to call
> setrlimit to set the core size to something bigger. Most places only
> soft limit the core size, not hard limit.

> 3. Add either a GUC or a command line switch or PGOPTION switch to
> automatically invoke and attach gdb on certain types of error.
> Obviously you can only do this where stdin, stdout and stderr have not
> been redirected.

Both of these presume you have a programmer running the database, or at
least someone who's not scared of gdb.

			regards, tom lane
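For readers unfamiliar with the double-evaluation risk: a size check written inside a macro mentions its argument twice, so an argument with side effects runs twice. The sketch below shows the problem and one way around it using the GCC/Clang statement-expression extension (non-standard C). DEMO_MAX_ALLOC, demo_report() and both macros are hypothetical stand-ins for illustration, not the real palloc or mcxt.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define DEMO_MAX_ALLOC ((size_t) 0x3fffffff)    /* hypothetical limit */

/* Reports the call site of a rejected request and yields NULL. */
static void *
demo_report(const char *file, int line)
{
    fprintf(stderr, "invalid alloc request at %s:%d\n", file, line);
    return NULL;
}

/* Naive: "sz" is expanded twice, so an argument such as n++ is
 * evaluated twice whenever the size passes the check. */
#define NAIVE_CHECKED_ALLOC(sz) \
    (((sz) > DEMO_MAX_ALLOC) ? demo_report(__FILE__, __LINE__) : malloc(sz))

/* GCC/Clang statement expression: the argument is evaluated once into
 * a temporary, and the error still names the caller's file and line. */
#define GCC_CHECKED_ALLOC(sz) \
    ({ size_t demo_sz_ = (sz); \
       (demo_sz_ > DEMO_MAX_ALLOC) ? demo_report(__FILE__, __LINE__) \
                                   : malloc(demo_sz_); })

/* n++ stands in for any argument with side effects. */
void
demo_double_evaluation(void)
{
    size_t  n = 8;
    void   *p = NAIVE_CHECKED_ALLOC(n++);

    assert(n == 10);            /* incremented twice */
    free(p);

    n = 8;
    p = GCC_CHECKED_ALLOC(n++);
    assert(n == 9);             /* incremented once */
    free(p);
}
```

Either way the check is duplicated at every call site, which is the bloat concern raised above.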
On Thu, Oct 27, 2005 at 10:41:08AM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > 1. Move the test for strange memory alloc sizes to the palloc macros so
> > that on error, it points at the palloc call rather than mcxt.c.
>
> What would that accomplish other than bloating the backend? We can't do
> it anyway, because of double-evaluation risk.

The double-evaluation is avoidable (on GCC at least). I was thinking
about when you --enable-cassert or something. But you're right, the
test in palloc is always on, so you don't save much.

> > 2. Add either a GUC or a command line switch or PGOPTION switch to call
> > setrlimit to set the core size to something bigger. Most places only
> > soft limit the core size, not hard limit.
>
> > 3. Add either a GUC or a command line switch or PGOPTION switch to
> > automatically invoke and attach gdb on certain types of error.
> > Obviously you can only do this where stdin, stdout and stderr have not
> > been redirected.
>
> Both of these presume you have a programmer running the database, or at
> least someone who's not scared of gdb.

I've been thinking more along the lines that gdb is scriptable. We
could invoke gdb in a way that immediately dumps a backtrace, the local
variables and some of the globals to a file and display a message to
the user: please attach this to your bug report...

Like today, having all this info upfront would save some time, because
then we wouldn't have to teach people how to use gdb or what variables
are important. Of course, we'd really need to distinguish between
run-of-the-mill errors and actual serious problems.

The second option would help us where users are stymied by the system
trying to change the core size ulimit, why not make it easier?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
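The "gdb is scriptable" idea can be sketched as below: run gdb in batch mode against a given PID and capture a full backtrace, including local variables, in a file. The function names and the output file are assumptions made for the example. The gdb options used (-batch, -ex, -p) are those of current gdb releases and are not something the thread specifies. On systems that restrict ptrace, the attach may be refused.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fill argv (at least 7 slots) with a gdb command line that attaches to
 * "pid", prints a full backtrace and exits. pidbuf must outlive argv.
 * Returns the number of arguments, not counting the NULL terminator.
 */
int
demo_build_gdb_argv(pid_t pid, char *pidbuf, size_t pidbuflen, const char **argv)
{
    int n = 0;

    assert(pidbuf != NULL && argv != NULL);
    snprintf(pidbuf, pidbuflen, "%ld", (long) pid);

    argv[n++] = "gdb";
    argv[n++] = "-batch";       /* run the given commands, then exit */
    argv[n++] = "-ex";
    argv[n++] = "bt full";      /* backtrace plus local variables */
    argv[n++] = "-p";
    argv[n++] = pidbuf;         /* process to attach to */
    argv[n] = NULL;
    return n;
}

/*
 * Run gdb against "pid", writing its output to "outfile".
 * Returns gdb's exit status, or -1 if gdb could not be run.
 */
int
demo_dump_backtrace(pid_t pid, const char *outfile)
{
    char        pidbuf[32];
    const char *argv[7];
    pid_t       child;
    int         status;

    demo_build_gdb_argv(pid, pidbuf, sizeof(pidbuf), argv);

    child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
    {
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0600);

        if (fd < 0)
            _exit(127);
        dup2(fd, STDOUT_FILENO);
        dup2(fd, STDERR_FILENO);
        close(fd);
        execvp("gdb", (char *const *) argv);
        _exit(127);             /* exec failed, e.g. gdb not installed */
    }
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 127 ? -1 : WEXITSTATUS(status);
}
```

A failing process could call demo_dump_backtrace(getpid(), "/tmp/backtrace.txt") and then tell the user to attach that file to the bug report. Printing selected globals would just be additional -ex arguments.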
Tom Lane wrote:
> Both of these presume you have a programmer running the database, or at
> least someone who's not scared of gdb.

I think you have the set relationship the wrong way around ;-)

Personally, I only use gdb in extremis, and I am sure the average DBA
would be less familiar with it than I am.

cheers

andrew
On Thu, Oct 27, 2005 at 05:11:41PM +0200, Martijn van Oosterhout wrote:
> I've been thinking more along the lines that gdb is scriptable. We
> could invoke gdb in a way that immediately dumps a backtrace, the local
> variables and some of the globals to a file and display a message to
> the user: please attach this to your bug report...
>
> Like today, having all this info upfront would save some time, because
> then we wouldn't have to teach people how to use gdb or what variables
> are important. Of course, we'd really need to distinguish between
> run-of-the-mill errors and actual serious problems.

Perhaps have a flag that people can turn on when they're having issues?
Still get a good amount of bloat, but at least not all the time...

> The second option would help us where users are stymied by the system
> trying to change the core size ulimit, why not make it easier?

It would also be very good if there was a way to verify that the backend
should be able to generate a core, such as being able to see what
ulimits the backend is running under. This would have saved me some pain
yesterday. It would also be useful to be able to force the backend to
dump core so you can see if it's actually working (granted, I know you
can end up hitting the ulimit depending on how much memory is being
consumed). Maybe there is a way to do this already and I just don't know
it...

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
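Seeing which limits a process is actually running under takes only a getrlimit() call. The sketch below formats the soft and hard core-size limits of the calling process. How a backend would expose this to a client is left open, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Render one limit value; RLIM_INFINITY is shown as "unlimited". */
static void
demo_format_limit(rlim_t value, char *buf, size_t buflen)
{
    if (value == RLIM_INFINITY)
        snprintf(buf, buflen, "unlimited");
    else
        snprintf(buf, buflen, "%llu", (unsigned long long) value);
}

/*
 * Describe the calling process's core file size limits in buf.
 * Returns 0 on success, -1 if getrlimit() fails.
 */
int
demo_describe_core_limit(char *buf, size_t buflen)
{
    struct rlimit lim;
    char          soft[32];
    char          hard[32];

    assert(buf != NULL);
    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    demo_format_limit(lim.rlim_cur, soft, sizeof(soft));
    demo_format_limit(lim.rlim_max, hard, sizeof(hard));
    snprintf(buf, buflen, "core size limit: soft=%s hard=%s (bytes)",
             soft, hard);
    return 0;
}
```

A soft limit of 0 with a non-zero hard limit is the common case mentioned earlier in the thread: core dumps are off, but the process itself is allowed to turn them on.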
"Jim C. Nasby" <jnasby@pervasive.com> writes: > It would also be useful to be able to force the backend to > dump core so you can see if it's actually working kill -ABRT backend-PID regards, tom lane
"Jim C. Nasby" <jnasby@pervasive.com> writes: > It would also be useful to be able to force the backend to > dump core so you can see if it's actually working (granted, I know you > can end up hitting the ulimit depending on how much memory is being > consumed). Maybe there is a way to do this already and I just don't know > it... SIGABRT? -Doug
On Thu, Oct 27, 2005 at 01:13:37PM -0400, Douglas McNaught wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
>
> > It would also be useful to be able to force the backend to
> > dump core so you can see if it's actually working (granted, I know you
> > can end up hitting the ulimit depending on how much memory is being
> > consumed). Maybe there is a way to do this already and I just don't know
> > it...
>
> SIGABRT?

Still doesn't do anything if core dumps are disabled. The point is to
enable the user to enable core dumps from the frontend, rather than
having to fiddle the backend environment directly...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On Thu, Oct 27, 2005 at 07:20:57PM +0200, Martijn van Oosterhout wrote:
> On Thu, Oct 27, 2005 at 01:13:37PM -0400, Douglas McNaught wrote:
> > "Jim C. Nasby" <jnasby@pervasive.com> writes:
> >
> > > It would also be useful to be able to force the backend to
> > > dump core so you can see if it's actually working (granted, I know you
> > > can end up hitting the ulimit depending on how much memory is being
> > > consumed). Maybe there is a way to do this already and I just don't know
> > > it...
> >
> > SIGABRT?
>
> Still doesn't do anything if core dumps are disabled. The point is to
> enable the user to enable core dumps from the frontend, rather than
> having to fiddle the backend environment directly...

Actually, SIGABRT is what I was looking for... I wanted a way to force a
backend to dump core so I could see if a corefile would actually be
generated.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
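The same check can be made without touching a live backend. The sketch below forks a throwaway child that calls abort() (which raises SIGABRT) and then asks the kernel whether a core was dumped. demo_probe_core_dump() is a hypothetical name. WCOREDUMP is not in POSIX, so it is guarded, and if dumps are enabled the probe really does leave a core file behind wherever the system is configured to write one.

```c
#define _DEFAULT_SOURCE     /* glibc: expose WCOREDUMP under strict -std= modes */

#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that aborts, and report what happened to it.
 * Returns 1 if the child died of SIGABRT and a core dump was reported,
 * 0 if it died of SIGABRT without one, -1 on any other outcome.
 */
int
demo_probe_core_dump(void)
{
    pid_t child = fork();
    int   status;

    if (child < 0)
        return -1;
    if (child == 0)
    {
        abort();                /* raises SIGABRT in the child only */
        _exit(1);               /* not reached */
    }
    assert(child > 0);

    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFSIGNALED(status) || WTERMSIG(status) != SIGABRT)
        return -1;
#ifdef WCOREDUMP
    return WCOREDUMP(status) ? 1 : 0;
#else
    return 0;                   /* this platform cannot tell us */
#endif
}
```

Because the child inherits the parent's resource limits, a result of 0 suggests that a real crash of the parent would not produce a core file either.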
On Thu, Oct 27, 2005 at 11:44:24AM -0500, Jim C. Nasby wrote:
> > The second option would help us where users are stymied by the system
> > trying to change the core size ulimit, why not make it easier?
>
> It would also be very good if there was a way to verify that the backend
> should be able to generate a core, such as being able to see what
> ulimits the backend is running under. This would have saved me some pain

<snip>

Well, I've sent something to -patches that allows you to set an option
so you get one of the following messages:

NOTICE:  Core dumps hard disabled by admin
NOTICE:  Core dumps already enabled by admin (size)
NOTICE:  Core limit successfully changed to (size)

You use it like:

$ PGOPTIONS=-C ./psql test
NOTICE:  Core limit successfully changed to (unlimited)
Welcome to psql 8.1beta2, the PostgreSQL interactive terminal.
<snip>

I think a GUC would be a waste of space. It's not like you want to skip
the first three segfaults and dump on the fourth. It shouldn't be a
global option. It shouldn't be easy to enable, but the option should be
there. This way doesn't require any changes to clients, as it can be
controlled by the environment.

Bloat, I don't know, maybe. I think the gain outweighs the costs, but
I'll leave it to TPTB to decide that.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
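The three notices map naturally onto a getrlimit()/setrlimit() sequence. The sketch below is a guess at that logic, not the patch that was sent to -patches: the enum, the function names and the exact conditions (in particular, treating "soft limit equals hard limit" as "already enabled") are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

typedef enum
{
    DEMO_CORE_HARD_DISABLED,    /* hard limit is 0: nothing we can do */
    DEMO_CORE_ALREADY_ENABLED,  /* soft limit already equals hard limit */
    DEMO_CORE_CHANGED,          /* soft limit can be (or was) raised */
    DEMO_CORE_FAILED            /* getrlimit() or setrlimit() failed */
} DemoCoreResult;

/* Pure decision step, kept separate so it can be checked without
 * touching the real limits of the process. */
DemoCoreResult
demo_classify_core_limit(const struct rlimit *lim)
{
    assert(lim != NULL);

    if (lim->rlim_max == 0)
        return DEMO_CORE_HARD_DISABLED;
    if (lim->rlim_cur == lim->rlim_max)
        return DEMO_CORE_ALREADY_ENABLED;
    return DEMO_CORE_CHANGED;
}

/* Raise the soft core limit to the hard limit where possible and needed. */
DemoCoreResult
demo_enable_core_dumps(void)
{
    struct rlimit  lim;
    DemoCoreResult result;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return DEMO_CORE_FAILED;

    result = demo_classify_core_limit(&lim);
    if (result == DEMO_CORE_CHANGED)
    {
        /* An unprivileged process may raise its soft limit up to,
         * but not beyond, the hard limit. */
        lim.rlim_cur = lim.rlim_max;
        if (setrlimit(RLIMIT_CORE, &lim) != 0)
            return DEMO_CORE_FAILED;
    }
    return result;
}
```

Each of the first three enum values would correspond to one of the NOTICE lines above, with the size taken from the limit that ends up in effect.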
Martijn van Oosterhout wrote:
> 3. Add either a GUC or a command line switch or PGOPTION switch to
> automatically invoke and attach gdb on certain types of error.
> Obviously you can only do this where stdin, stdout and stderr have
> not been redirected.

Samba has a configuration parameter that allows you to set an arbitrary
command as a "panic action" script. This can then be used to gather
debugging information or prepare a bug report (see other thread). This
also has the added flexibility that binary packagers can add extra
information specific to their environment. It may be worthwhile to
research whether we could do something similar.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
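A "panic action" hook in this spirit could be as small as the sketch below: run a site-configured shell command and tell it which process is in trouble. This is neither Samba's implementation nor an existing PostgreSQL feature. The function name and the DEMO_PANIC_PID environment variable are made up for the example.

```c
#define _DEFAULT_SOURCE     /* glibc: expose setenv() under strict -std= modes */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the configured panic command, if any. The PID of the failing
 * process is exported as DEMO_PANIC_PID so the script can attach a
 * debugger or collect logs. Returns the command's exit status, or -1
 * if no command is configured or it could not be run.
 */
int
demo_run_panic_action(const char *command, pid_t pid)
{
    char pidbuf[32];
    int  status;

    if (command == NULL || command[0] == '\0')
        return -1;              /* feature switched off */
    assert(pid > 0);

    snprintf(pidbuf, sizeof(pidbuf), "%ld", (long) pid);
    if (setenv("DEMO_PANIC_PID", pidbuf, 1) != 0)
        return -1;

    status = system(command);   /* executed through the shell */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Leaving the command to the administrator or packager is what gives the flexibility described above: the script might call gdb in batch mode, gather version information, or simply mail someone.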
On Tue, Nov 01, 2005 at 12:33:39PM +0100, Peter Eisentraut wrote:
> Martijn van Oosterhout wrote:
> > 3. Add either a GUC or a command line switch or PGOPTION switch to
> > automatically invoke and attach gdb on certain types of error.
> > Obviously you can only do this where stdin, stdout and stderr have
> > not been redirected.
>
> Samba has a configuration parameter that allows you to set an arbitrary
> command as a "panic action" script. This can then be used to gather
> debugging information or prepare a bug report (see other thread). This
> also has the added flexibility that binary packagers can add extra
> information specific to their environment. It may be worthwhile to
> research whether we could do something similar.

TODO?

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
What about the Google Core Dumper? :)

http://sourceforge.net/projects/goog-coredumper/

Chris

Peter Eisentraut wrote:
> Martijn van Oosterhout wrote:
>
>> 3. Add either a GUC or a command line switch or PGOPTION switch to
>> automatically invoke and attach gdb on certain types of error.
>> Obviously you can only do this where stdin, stdout and stderr have
>> not been redirected.
>
> Samba has a configuration parameter that allows you to set an arbitrary
> command as a "panic action" script. This can then be used to gather
> debugging information or prepare a bug report (see other thread). This
> also has the added flexibility that binary packagers can add extra
> information specific to their environment. It may be worthwhile to
> research whether we could do something similar.
Added to TODO:

	* Add GUC variable to run a command on database panic or
	  smart/fast/immediate shutdown

---------------------------------------------------------------------------

Peter Eisentraut wrote:
> Martijn van Oosterhout wrote:
> > 3. Add either a GUC or a command line switch or PGOPTION switch to
> > automatically invoke and attach gdb on certain types of error.
> > Obviously you can only do this where stdin, stdout and stderr have
> > not been redirected.
>
> Samba has a configuration parameter that allows you to set an arbitrary
> command as a "panic action" script. This can then be used to gather
> debugging information or prepare a bug report (see other thread). This
> also has the added flexibility that binary packagers can add extra
> information specific to their environment. It may be worthwhile to
> research whether we could do something similar.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073