Обсуждение: maximum number of backtrace frames logged by backtrace_functions
Hi, The maximum number of backtrace frames logged by backtrace_functions is 100. Isn't it better to document this informationso that users can understand not all backtrace always can be logged? Patch attached. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Вложения
On 01.02.22 18:04, Fujii Masao wrote: > The maximum number of backtrace frames logged by backtrace_functions is > 100. Isn't it better to document this information so that users can > understand not all backtrace always can be logged? Patch attached. The 100 was chosen as more than you'll ever need. If you are encountering cases with more than that, let's increase the limit.
On 2022/02/02 16:20, Peter Eisentraut wrote: > On 01.02.22 18:04, Fujii Masao wrote: >> The maximum number of backtrace frames logged by backtrace_functions is 100. Isn't it better to document this informationso that users can understand not all backtrace always can be logged? Patch attached. > > The 100 was chosen as more than you'll ever need. If you are encountering cases with more than that, let's increase thelimit. I encountered the "more than 100 backtrace frames" case when investigating the bug of pg_log_query_plan() patch [1]. Sincethe function that the patch added can be called repeatedly during call to that function, the backtrace became largerthan 100. I think this is not general case, so basically 100 sounds enough limit size to me. OTOH I think it's helpful if the limit is documented when I occasionally encounter the case and want to understand why allbacktrace frames are not logged. [1] https://commitfest.postgresql.org/37/3142/ Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 03.02.22 06:33, Fujii Masao wrote: > I encountered the "more than 100 backtrace frames" case when > investigating the bug of pg_log_query_plan() patch [1]. Since the > function that the patch added can be called repeatedly during call to > that function, the backtrace became larger than 100. I think this is not > general case, so basically 100 sounds enough limit size to me. > > OTOH I think it's helpful if the limit is documented when I occasionally > encounter the case and want to understand why all backtrace frames are > not logged. How about we issue a message when the backtrace is cut off. Then it's immediately visible to the user, instead of hidden away somewhere in the documentation. Something like this (untested): diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index 7402696986..3777dff030 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -967,6 +967,8 @@ set_backtrace(ErrorData *edata, int num_skip) for (int i = num_skip; i < nframes; i++) appendStringInfo(&errtrace, "\n%s", strfrms[i]); free(strfrms); + if (nframes >= lengthof(buf)) + appendStringInfo(&errtrace, "\n(backtrace limited to %zu frames)", lengthof(buf)); } #else appendStringInfoString(&errtrace,
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > How about we issue a message when the backtrace is cut off. Then it's > immediately visible to the user, instead of hidden away somewhere in the > documentation. Something like this (untested): +1 for idea (I didn't test it either). Is "nframes" useful enough to include in the report? regards, tom lane
On 2022/02/03 23:48, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: >> How about we issue a message when the backtrace is cut off. Then it's >> immediately visible to the user, instead of hidden away somewhere in the >> documentation. Something like this (untested): > > +1 for idea (I didn't test it either). +1. I made this change to the patch and confirmed that it worked fine. For the record, I tested that by setting backtrace_functions to 'pg_wal_replay_pause' and executing it. This function willreport an error when it's called not during recovery. In the test, "SELECT pg_wal_replay_pause()" caused backtrace but "backtrace limited to %zu frames" message was not outputbecause the number of backtrace frames was less than 100. Then I created the function "foo" as follows and executed "SELECT foo(0)" to generate more than 100 backtrace frames. Inthis case "backtrace limited to %zu frames" message was successfully output. CREATE FUNCTION foo (x int) RETURNS void AS $$ BEGIN IF x = 10 THEN PERFORM pg_wal_replay_pause(); ELSE PERFORM foo(x + 1); END IF; END; $$ LANGUAGE plpgsql; > Is "nframes" useful enough to > include in the report? Probably No because "nframes" is equal to "lengthof(buf)" in the case where more than 100 frames are generated. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Вложения
On 2022/02/04 9:48, Fujii Masao wrote: > > > On 2022/02/03 23:48, Tom Lane wrote: >> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: >>> How about we issue a message when the backtrace is cut off. Then it's >>> immediately visible to the user, instead of hidden away somewhere in the >>> documentation. Something like this (untested): >> >> +1 for idea (I didn't test it either). > > +1. I made this change to the patch and confirmed that it worked fine. Barring any objection, I will commit the patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 07.02.22 06:29, Fujii Masao wrote: >> >> On 2022/02/03 23:48, Tom Lane wrote: >>> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: >>>> How about we issue a message when the backtrace is cut off. Then it's >>>> immediately visible to the user, instead of hidden away somewhere in >>>> the >>>> documentation. Something like this (untested): >>> >>> +1 for idea (I didn't test it either). >> >> +1. I made this change to the patch and confirmed that it worked fine. > > Barring any objection, I will commit the patch. This change looks good to me. There is also backtrace code in assert.c that might want the same treatment.
On 2022/02/08 1:12, Peter Eisentraut wrote: > This change looks good to me. There is also backtrace code in assert.c that might want the same treatment. Yeah, that's good idea! The attached patch also adds the same treatment into assert.c. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Вложения
On 07.02.22 17:42, Fujii Masao wrote: > On 2022/02/08 1:12, Peter Eisentraut wrote: >> This change looks good to me. There is also backtrace code in >> assert.c that might want the same treatment. > > Yeah, that's good idea! The attached patch also adds the same treatment > into assert.c. I don't know if using write_stderr() is the right thing here. Since backtrace_symbols_fd() writes directly to stderr in any case, the whole Windows-specific eventlog dance in write_stderr() wouldn't make sense even if this feature supported Windows. So I'd just do a straight fprintf(stderr) there.
On 2022/02/18 16:07, Peter Eisentraut wrote: > On 07.02.22 17:42, Fujii Masao wrote: >> On 2022/02/08 1:12, Peter Eisentraut wrote: >>> This change looks good to me. There is also backtrace code in assert.c that might want the same treatment. >> >> Yeah, that's good idea! The attached patch also adds the same treatment into assert.c. > > I don't know if using write_stderr() is the right thing here. Since backtrace_symbols_fd() writes directly to stderr inany case, the whole Windows-specific eventlog dance in write_stderr() wouldn't make sense even if this feature supportedWindows. So I'd just do a straight fprintf(stderr) there. Yeah, maybe. Or even backtrace should be logged by write_stderr() so that it's written to eventlog if necessary? I just wonder why backtrace_symbols_fd()is used only in ExceptionalCondition(). Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 18.02.22 09:24, Fujii Masao wrote: > Or even backtrace should be logged by write_stderr() so that it's > written to eventlog if necessary? I just wonder why > backtrace_symbols_fd() is used only in ExceptionalCondition(). Probably because it was simpler. It would also make sense to convert the whole thing to use write_stderr() consistently.
On 2022/02/18 19:59, Peter Eisentraut wrote: > On 18.02.22 09:24, Fujii Masao wrote: >> Or even backtrace should be logged by write_stderr() so that it's written to eventlog if necessary? I just wonder whybacktrace_symbols_fd() is used only in ExceptionalCondition(). > > Probably because it was simpler. It would also make sense to convert the whole thing to use write_stderr() consistently. +1 Attached is the updated version of the patch that uses write_stderr() to log the backtrace in assertion failure case. + if (nframes >= lengthof(buf)) + appendStringInfo(&errtrace, "\n(backtrace limited to %zu frames)", + lengthof(buf)); I found this doesn't work on FreeBSD, at least FreeBSD 13 that cfbot uses on Cirrus CI. When the backtrace is larger than100, on FreeBSD, backtrace() seems to write the *99* (not 100) most recent function calls to the buffer. That is, thevariable "nframes" is 99 while lengthof(buf) indicates 100. So the above message about backtrace limit will never be loggedon FreeBSD. OTOH, on Linux and MacOS, backtrace() writes the 100 most recent function calls. I'm not sure if such abehavior on FreeBSD is expected or a bug. To issue the message whatever OS is, probably we need to modify set_backtrace() as follows, for example. But is this overkill?Thought? - void *buf[100]; +#define BACKTRACE_MAX_FRAMES 100 + void *buf[BACKTRACE_MAX_FRAMES + 1]; int nframes; char **strfrms; nframes = backtrace(buf, lengthof(buf)); + if (nframes > BACKTRACE_MAX_FRAMES) + nframes = BACKTRACE_MAX_FRAMES; strfrms = backtrace_symbols(buf, nframes); if (strfrms == NULL) return; for (int i = num_skip; i < nframes; i++) appendStringInfo(&errtrace, "\n%s", strfrms[i]); + if (nframes >= BACKTRACE_MAX_FRAMES) + appendStringInfo(&errtrace, "\n(backtrace limited to %d frames)", + BACKTRACE_MAX_FRAMES); Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Вложения
On Sat, 19 Feb 2022 01:05:04 +0900 Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > > On 2022/02/18 19:59, Peter Eisentraut wrote: > > On 18.02.22 09:24, Fujii Masao wrote: > >> Or even backtrace should be logged by write_stderr() so that it's written to eventlog if necessary? I just wonder whybacktrace_symbols_fd() is used only in ExceptionalCondition(). > > > > Probably because it was simpler. It would also make sense to convert the whole thing to use write_stderr() consistently. > +1 > Attached is the updated version of the patch that uses write_stderr() to log the backtrace in assertion failure case. > > + if (nframes >= lengthof(buf)) > + appendStringInfo(&errtrace, "\n(backtrace limited to %zu frames)", > + lengthof(buf)); > > I found this doesn't work on FreeBSD, at least FreeBSD 13 that cfbot uses on Cirrus CI. When the backtrace is larger than100, on FreeBSD, backtrace() seems to write the *99* (not 100) most recent function calls to the buffer. That is, thevariable "nframes" is 99 while lengthof(buf) indicates 100. So the above message about backtrace limit will never be loggedon FreeBSD. OTOH, on Linux and MacOS, backtrace() writes the 100 most recent function calls. I'm not sure if such abehavior on FreeBSD is expected or a bug. > > To issue the message whatever OS is, probably we need to modify set_backtrace() as follows, for example. But is this overkill?Thought? > > > - void *buf[100]; > +#define BACKTRACE_MAX_FRAMES 100 > + void *buf[BACKTRACE_MAX_FRAMES + 1]; > int nframes; > char **strfrms; > > nframes = backtrace(buf, lengthof(buf)); > + if (nframes > BACKTRACE_MAX_FRAMES) > + nframes = BACKTRACE_MAX_FRAMES; > strfrms = backtrace_symbols(buf, nframes); > if (strfrms == NULL) > return; > > for (int i = num_skip; i < nframes; i++) > appendStringInfo(&errtrace, "\n%s", strfrms[i]); > + if (nframes >= BACKTRACE_MAX_FRAMES) > + appendStringInfo(&errtrace, "\n(backtrace limited to %d frames)", > + BACKTRACE_MAX_FRAMES); I think it would be better that users can get the same results in different OS as as possible. I have another comment. When the backtrace is output from elog, two rows are skipped by num_skip and only 98 frames will appear even though it is said "backtrace limited to 100 frames". So, how about reporting the limiting number subtracted by num_skip, like this: if (nframes >= lengthof(buf)) appendStringInfo(&errtrace, "\n(backtrace limited to %zu frames)", lengthof(buf) - num_skip); or if (nframes >= BACKTRACE_MAX_FRAMES) appendStringInfo(&errtrace, "\n(backtrace limited to %d frames)", BACKTRACE_MAX_FRAMES - num_skip); Regards, Yugo Nagata -- Yugo NAGATA <nagata@sraoss.co.jp>
On 18.02.22 17:05, Fujii Masao wrote: >>> Or even backtrace should be logged by write_stderr() so that it's >>> written to eventlog if necessary? I just wonder why >>> backtrace_symbols_fd() is used only in ExceptionalCondition(). >> >> Probably because it was simpler. It would also make sense to convert >> the whole thing to use write_stderr() consistently. > +1 > Attached is the updated version of the patch that uses write_stderr() to > log the backtrace in assertion failure case. > > + if (nframes >= lengthof(buf)) > + appendStringInfo(&errtrace, "\n(backtrace limited to %zu > frames)", > + lengthof(buf)); I think this should actually print nframes, because that's the actual number of frames returned. > I found this doesn't work on FreeBSD, at least FreeBSD 13 that cfbot > uses on Cirrus CI. When the backtrace is larger than 100, on FreeBSD, > backtrace() seems to write the *99* (not 100) most recent function calls > to the buffer. That is, the variable "nframes" is 99 while lengthof(buf) > indicates 100. So the above message about backtrace limit will never be > logged on FreeBSD. OTOH, on Linux and MacOS, backtrace() writes the 100 > most recent function calls. I'm not sure if such a behavior on FreeBSD > is expected or a bug. Well, the API is that you ask for up to N frames, and it gives you X frames back. There is no requirement that X is as close to N as possible. But that also means that there is no portable API for finding out how many X it could give you if you had enough room (other than by looping and doubling the buffer dynamically etc.). Maybe we can catch this particular case if we made the condition if (nframes >= lengthof(buf) - 1) That might give slightly misleading information if you have exactly 99 or 100 frames, but overall it might still be better.