Re: [GENERAL] psql weird behaviour with charset encodings
От | Michael Paquier |
---|---|
Тема | Re: [GENERAL] psql weird behaviour with charset encodings |
Дата | |
Msg-id | CAB7nPqTSwtVDXEsjB0QJgQRazVtRVSzPYBi0VfdyvyhnF8D+jQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [GENERAL] psql weird behaviour with charset encodings (Noah Misch <noah@leadboat.com>) |
Список | pgsql-hackers |
On Thu, Jun 18, 2015 at 9:47 AM, Noah Misch <noah@leadboat.com> wrote: > On Wed, Jun 03, 2015 at 05:25:45PM +0900, Michael Paquier wrote: >> On Tue, Jun 2, 2015 at 4:19 PM, Michael Paquier <michael.paquier@gmail.com> wrote: >> > On Sun, May 24, 2015 at 2:43 AM, Noah Misch <noah@leadboat.com> wrote: >> > > It would be good to purge the code of precisions on "s" conversion specifiers, >> > > then Assert(!pointflag) in fmtstr() to catch new introductions. I won't plan >> > > to do it myself, but it would be a nice little defensive change. >> > >> > This sounds like a good protection idea, but as it impacts existing >> > backend code relying in sprintf's port version we should only do the >> > assertion in HEAD in my opinion, and mention it in the release notes of the >> > next major version at the time a patch in this area is applied. I guess > > Adding the assertion would be master-only. We don't necessarily release-note > C API changes. Cool. So we are on the same page. >> > that we had better backpatch the places using .*s though on back-branches. > > I would tend to back-patch only the ones that cause interesting bugs. For > example, we should not reach the read.c elog() calls anyway, so it's not a big > deal if the GNU libc bug makes them a bit less helpful in back branches. > (Thanks for the list of code sites; it was more complete than anything I had.) > So far, only tar.c looks harmed enough to back-patch. > >> Attached is a patch purging a bunch of places from using %.*s, this will >> make the code more solid when facing non-ASCII strings. I let pg_upgrade >> and pg_basebackup code paths alone as it reduces the code lisibility by >> moving out of this separator. We may want to fix them though if file path >> names have non-ASCII characters, but it seems less critical. > > To add the assertion, we must of course fix all uses. Sure. > Having seen the patch I requested, I don't like the result as much as I > expected to like it. The patched code is definitely harder to read and write: > >> @@ -1534,7 +1541,10 @@ parseNodeString(void) >> return_value = _readDeclareCursorStmt(); >> else >> { >> - elog(ERROR, "badly formatted node string \"%.32s\"...", token); >> + char buf[33]; >> + memcpy(buf, token, 32); >> + buf[33] = '\0'; >> + elog(ERROR, "badly formatted node string \"%s\"...", buf); >> return_value = NULL; /* keep compiler quiet */ >> } We could spread what the first patch did in readfuncs.c by having some more macros doing the duplicated work. Not that it would improve the code readability of those macros.. > (Apropos, that terminator belongs in buf[32], not buf[33].) Indeed. > Perhaps we're better off setting aside the whole idea, At least on OSX (10.8), I am seeing that no more than the number of bytes defined by the precision is written. So it looks that we are safe there. So yes thinking long-term this looks the better alternative. And I am wondering about the potential breakage that this could actually have with Postgres third-part tools using src/port's snprintf. > or forcing use of snprintf.c on configurations having the bug? I am less sure about that. It doesn't seem worth it knowing that the tendance is to evaluate the precision in terms in bytes and not characters. -- Michael
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Noah MischДата:
Сообщение: Re: [GENERAL] psql weird behaviour with charset encodings
Следующее
От: Peter EisentrautДата:
Сообщение: Re: 9.5 make world failing due to sgml tools missing