Обсуждение: More consistency for some file-related error message
Hi all, While looking at the source code for more consistency work with error messages, I have bumped into a couple of messages which could be simplified, as those include in the name of the file manipulated basically the same information as the context added. I have finished with the attached. Note that for most of the messages, I think that the context can be useful, like for example the stats temporary file which could be user-defined, so those are left out. There are also other cases, say the reorder buffer spill file, where we may not know the path worked on, so the messages are kept consistent as per HEAD. That's a bit less work to do for translators, particularly with pg_stat_statements. Thoughts? -- Michael
Вложения
Michael Paquier <michael@paquier.xyz> writes: > While looking at the source code for more consistency work with error > messages, I have bumped into a couple of messages which could be > simplified, as those include in the name of the file manipulated > basically the same information as the context added. > I have finished with the attached. Note that for most of the messages, > I think that the context can be useful, like for example the stats > temporary file which could be user-defined, so those are left out. > There are also other cases, say the reorder buffer spill file, where we > may not know the path worked on, so the messages are kept consistent as > per HEAD. > That's a bit less work to do for translators, particularly with > pg_stat_statements. +1. Another thing I noticed is that we now have a fair amount of code like this example in xlog.c: errno = 0; pgstat_report_wait_start(WAIT_EVENT_WAL_COPY_READ); r = read(srcfd, buffer, nread); if (r != nread) { if (r < 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not read file \"%s\": %m", path))); else ereport(ERROR, (errmsg("could not read file \"%s\": read %d of %zu", path, r, (Size) nread))); } pgstat_report_wait_end(); } The short-read ereport has no errcode() call, meaning it will report XX000, which seems like it's against project policy for foreseeable errors. In this example ERRCODE_DATA_CORRUPTED seems better. BTW, isn't the initial "errno = 0" dead code now? regards, tom lane
On Wed, Jul 18, 2018 at 10:57:16AM -0400, Tom Lane wrote: > +1. Okay, thanks. I can always get that pushed first if there are no objections. More can be always done, but that's already a nice cut. > The short-read ereport has no errcode() call, meaning it will report > XX000, which seems like it's against project policy for foreseeable > errors. In this example ERRCODE_DATA_CORRUPTED seems better. Okay, my mistake then. All the error code paths for read really out to not fail, so ERRCODE_DATA_CORRUPTED seems adapted for all of them to me instead of ERRCODE_INTERNAL_ERROR. Or would we want to create new sub categories for corruptions? I can personally live with ERRCODE_DATA_CORRUPTED but.. By the way, the original PANIC message in StartupReplicationOrigin also lacked an errcode, so fixed on the way. > BTW, isn't the initial "errno = 0" dead code now? Hm. I have not bothered touching those as it could be possible that read() may not initialize errno to 0, so errno would remain set to any previous value when less bytes than expected are read, no? It seems to me that the current coding is more careful. Attached are two patches: - 0001 is the previous patch with a commit messages. - 0002 is an answer to Tom's remark about errcodes in the new error messages for read(). I have left out the error messages for 2PC, which are WARNINGs (those should actually be ERROR, promoted to FATAL in the startup process, see here: https://commitfest.postgresql.org/19/1717/). -- Michael
Вложения
Michael Paquier <michael@paquier.xyz> writes: > On Wed, Jul 18, 2018 at 10:57:16AM -0400, Tom Lane wrote: >> BTW, isn't the initial "errno = 0" dead code now? > Hm. I have not bothered touching those as it could be possible that > read() may not initialize errno to 0, so errno would remain set to any > previous value when less bytes than expected are read, no? It seems to > me that the current coding is more careful. read() is required by spec to set errno when returning a negative result. I think the previous coding paid attention to errno regardless of the sign of the result, which would justify pre-zeroing it ... but the new coding definitely doesn't. regards, tom lane
On Wed, Jul 18, 2018 at 11:24:05PM -0400, Tom Lane wrote: > read() is required by spec to set errno when returning a negative result. > I think the previous coding paid attention to errno regardless of the sign > of the result, which would justify pre-zeroing it ... but the new coding > definitely doesn't. Yes, my point is a bit different though.. Do you think that we need to bother about the case where errno is not 0 before calling read(), in the case where it returns a positive result? This would mean that errno would still have a previous errno set, still it returned a number of bytes read. For the code paths discussed here that visibly does not matter so you are right, we could remove them, still patterns get easily copy-pasted around... -- Michael
Вложения
Hello. At Thu, 19 Jul 2018 12:33:30 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20180719033330.GH3411@paquier.xyz> > On Wed, Jul 18, 2018 at 11:24:05PM -0400, Tom Lane wrote: > > read() is required by spec to set errno when returning a negative result. > > I think the previous coding paid attention to errno regardless of the sign > > of the result, which would justify pre-zeroing it ... but the new coding > > definitely doesn't. > > Yes, my point is a bit different though.. Do you think that we need to > bother about the case where errno is not 0 before calling read(), in the > case where it returns a positive result? This would mean that errno > would still have a previous errno set, still it returned a number of > bytes read. For the code paths discussed here that visibly does not > matter so you are right, we could remove them, still patterns get easily > copy-pasted around... Agreed to it's not necessary and a developer ought to know about the errno behavior. However, I can sympathize with Michael. Meawhile I found the following code in xlog.c. | r = fread(labelfile, statbuf.st_size, 1, lfp); | labelfile[statbuf.st_size] = '\0'; | | /* | * Close and remove the backup label file | */ | if (r != 1 || ferror(lfp) || FreeFile(lfp)) | ereport(ERROR, | (errcode_for_file_access(), | errmsg("could not read file \"%s\": %m", | BACKUP_LABEL_FILE))); fread() and ferror() don't set errno so the errcode_for_file_access() gives a bogus result. The same found in basebackup.c and genfile.c. Also found in bootscanner.c but it doesn't come from .l file.. CopyGetData has a variant of it. | bytesread = fread(databuf, 1, maxread, cstate->copy_file); | if (ferror(cstate->copy_file)) | ereport(ERROR, | (errcode_for_file_access(), ereport(..(errcode_for_file_access() gets bogus errno here. Might be trivial but the following message is emitted for AllocateFile failure and AllocateFile feilure in other places gets a different message "could not *open* file". The differece leads to a slightly confusing message which doesn't harm so much like "could not read file: File name too long".. | - errmsg("could not read pg_stat_statement file \"%s\": %m", | + errmsg("could not read file \"%s\": %m", And I see other variants of this like the follows. "could not read from hash-join temporary file: %m" "could not read two-phase state file \"%s\": %m" "could not read from COPY file: %m" I'm not sure it's a good thing to elimiate all specific file naem from all of these but I don't find a criteria whether we use generic or specific message in each case. About the following and similars, there's two variants which has "to" and not has it. | - errmsg("could not write pg_stat_statement file \"%s\": %m", | + errmsg("could not write file \"%s\": %m", | - errmsg("could not write to control file: %m"))); | + errmsg("could not write to file \"%s\": %m", regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, Jul 19, 2018 at 02:05:51PM +0900, Kyotaro HORIGUCHI wrote: > Agreed to it's not necessary and a developer ought to know about > the errno behavior. However, I can sympathize with Michael. I am fine to remove them if folks here push for that. > CopyGetData has a variant of it. > > | bytesread = fread(databuf, 1, maxread, cstate->copy_file); > | if (ferror(cstate->copy_file)) > | ereport(ERROR, > | (errcode_for_file_access(), > > ereport(..(errcode_for_file_access() gets bogus errno here. Yeah, I saw those but did not really bother much about them yet, errno should be set to the return result of ferror(). If you have a patch, please feel free to send one. > And I see other variants of this like the follows. > > "could not read from hash-join temporary file: %m" > "could not read from COPY file: %m" This is intentional. You may not be able to guess the context based on the file name. > "could not read two-phase state file \"%s\": %m" You need to refresh your tree, that does not show up on HEAD anymore. See 811b6e3. > I'm not sure it's a good thing to elimiate all specific file naem > from all of these but I don't find a criteria whether we use > generic or specific message in each case. I would say that we can remove any context-specific message if one can guess easily based on the file name what the context is. For two phase files, that's for example easy as pg_twophase/ is involved. > About the following and similars, there's two variants which has > "to" and not has it. > > | - errmsg("could not write pg_stat_statement file \"%s\": %m", > | + errmsg("could not write file \"%s\": %m", > > | - errmsg("could not write to control file: %m"))); > | + errmsg("could not write to file \"%s\": %m", It seems to me that it is important to make the distinction between a file which gets fully written and a file which exists already and gets written more. That's this difference I understand from these two error messages, so that's my intention to not change them. -- Michael
Вложения
On Thu, Jul 19, 2018 at 12:33:30PM +0900, Michael Paquier wrote: > On Wed, Jul 18, 2018 at 11:24:05PM -0400, Tom Lane wrote: >> read() is required by spec to set errno when returning a negative result. >> I think the previous coding paid attention to errno regardless of the sign >> of the result, which would justify pre-zeroing it ... but the new coding >> definitely doesn't. > > Yes, my point is a bit different though.. Do you think that we need to > bother about the case where errno is not 0 before calling read(), in the > case where it returns a positive result? This would mean that errno > would still have a previous errno set, still it returned a number of > bytes read. For the code paths discussed here that visibly does not > matter so you are right, we could remove them, still patterns get easily > copy-pasted around... Okay, I just looked again at this point, and among the new messages only what's in XLogFileCopy has been bothering setting errno to 0 (see 811b6e3), so let's remove it in this case. Thoughts about the previous patch set? -- Michael
Вложения
On Fri, Jul 20, 2018 at 03:41:08PM +0900, Michael Paquier wrote: > Okay, I just looked again at this point, and among the new messages only > what's in XLogFileCopy has been bothering setting errno to 0 (see > 811b6e3), so let's remove it in this case. So, I have been through the patch set once again and pushed the patch to make more error messages consistent, as well as the patch to set up proper errcodes for new error messages. There are perhaps more improvements which could be done for the error messages, but that's not absolutely clear either as the context used is actually useful in those remaining. -- Michael