Discussion: Getting the red out (of the buildfarm)
We have a number of buildfarm members that have been failing on HEAD
consistently for some time. It looks from here that the following
actions need to be taken:

tapir, cardinal: need a newer version of "flex" installed.

wombat, eukaryote, chinchilla: these are all failing with

LOG:  could not bind IPv4 socket: Address already in use
HINT:  Is another postmaster already running on port 5678?
If not, wait a few seconds and retry.

This presumably indicates that a postmaster is hanging around from a
previous test and needs to be killed manually. The fact that this
started to happen about ten days ago on all three machines suggests a
generic failure-to-shut-down problem in the buildfarm script. I wonder
how up-to-date their scripts are.

comet_moth, gothic_moth: these are failing the new plpython_unicode
test in locale cs_CZ.ISO8859-2. Somebody needs to do something about
that. If it's left to me I'll probably just remove the test that has
multiple results.

regards, tom lane

On Wed, 2009-09-23 at 10:20 -0400, Tom Lane wrote:
> comet_moth, gothic_moth: these are failing the new plpython_unicode
> test in locale cs_CZ.ISO8859-2. Somebody needs to do something about
> that. If it's left to me I'll probably just remove the test that has
> multiple results.

This is, at first glance, not a valid variant result. It's a genuine
failure that needs investigation. I can't reproduce the problem with
the equivalent locale on Linux, so Zdenek might need to look into it.

* Tom Lane wrote:

> wombat, eukaryote, chinchilla: these are all failing with
> ...
> I wonder how up-to-date their scripts are.

chinchilla's was ancient, until five minutes ago. Thanks for the
prodding. I'm running a --test HEAD now.

-- 
Christian Ullrich

Peter Eisentraut <peter_e@gmx.net> writes:
> On Wed, 2009-09-23 at 10:20 -0400, Tom Lane wrote:
>> comet_moth, gothic_moth: these are failing the new plpython_unicode
>> test in locale cs_CZ.ISO8859-2. Somebody needs to do something about
>> that. If it's left to me I'll probably just remove the test that has
>> multiple results.

> This is, at first glance, not a valid variant result. It's a genuine
> failure that needs investigation. I can't reproduce the problem with
> the equivalent locale on Linux, so Zdenek might need to look into it.

Uh, I can reproduce it just fine on Fedora 11, and OS X too. These are
running python 2.6 and 2.6.1 respectively ... maybe the behavior is
python version dependent?

As far as I can tell, PLyObject_ToDatum is invoking PLyUnicode_Str and
then PyString_AsString, and what it gets back from the latter is (in C
string notation) "\200\0". Possibly what this means is that python
thinks that that is the correct LATIN2 representation of \u0080.

regards, tom lane

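That reading can be checked directly against Python's own codecs; here
is a minimal sketch, assuming nothing beyond the standard 'iso8859-2'
codec that ships with CPython:

# U+0080 is a C1 control character, and the ISO 8859-2 codec maps it
# to the single byte 0x80 -- i.e. "\200\0" in C string notation.
s = u'\u0080'.encode('iso8859-2')
print(repr(s))    # '\x80' on Python 2, b'\x80' on Python 3

So the byte Tom sees is exactly what Python's codec produces.
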
On Sat, 2009-10-03 at 00:42 -0400, Tom Lane wrote:
> As far as I can tell, PLyObject_ToDatum is invoking PLyUnicode_Str and
> then PyString_AsString, and what it gets back from the latter is (in C
> string notation) "\200\0". Possibly what this means is that python
> thinks that that is the correct LATIN2 representation of \u0080.

Well, \u0080 is \x80 in LATIN2, which is "\200\0" as a C string. So
far so good. But that does not equate to the Euro sign, which the
buildfarm result shows. So something is screwing up beyond this point.

Peter Eisentraut <peter_e@gmx.net> writes:
> On Sat, 2009-10-03 at 00:42 -0400, Tom Lane wrote:
>> As far as I can tell, PLyObject_ToDatum is invoking PLyUnicode_Str and
>> then PyString_AsString, and what it gets back from the latter is (in C
>> string notation) "\200\0". Possibly what this means is that python
>> thinks that that is the correct LATIN2 representation of \u0080.

> Well, \u0080 is \x80 in LATIN2, which is "\200\0" as a C string. So
> far so good. But that does not equate to the Euro sign, which the
> buildfarm result shows. So something is screwing up beyond this point.

Well, there are assorted Windows code pages in which 0x80 *is* supposed
to map to the Euro sign. I suspect some confusion somewhere in
Solaris-land about the definition of LATIN2. But the main point here
is that what is coming out, on my machines as well as Zdenek's, is the
single byte "\200", not the "\\u0080" representation that the test
seems to expect. Where exactly are you expecting the latter string to
get substituted in?

regards, tom lane

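For reference, the Central European Windows code page does make exactly
that substitution; a minimal sketch contrasting the two mappings, using
CPython's standard 'cp1250' and 'iso8859-2' codecs:

# The same 0x80 byte decodes differently depending on the encoding:
print(b'\x80'.decode('cp1250'))       # U+20AC EURO SIGN (Windows-1250)
print(b'\x80'.decode('iso8859-2'))    # U+0080, a C1 control character
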
On Sat, 2009-10-03 at 11:21 -0400, Tom Lane wrote:
> Well, there are assorted Windows code pages in which 0x80 *is* supposed
> to map to the Euro sign. I suspect some confusion somewhere in
> Solaris-land about the definition of LATIN2. But the main point here
> is that what is coming out, on my machines as well as Zdenek's, is the
> single byte "\200", not the "\\u0080" representation that the test
> seems to expect. Where exactly are you expecting the latter string to
> get substituted in?

The way I understand it, the \uxxxx comes from psql, mbprint.c. So
this may depend on exactly what locale psql, as run by pg_regress,
thinks it is in.

Peter Eisentraut <peter_e@gmx.net> writes:
> On Sat, 2009-10-03 at 11:21 -0400, Tom Lane wrote:
>> Where exactly are you expecting the latter string to get
>> substituted in?

> The way I understand it, the \uxxxx comes from psql, mbprint.c. So
> this may depend on exactly what locale psql, as run by pg_regress,
> thinks it is in.

[ looks at psql code ... ] Ah, I think actually the key question is
what the client_encoding is. It looks to me like the \u0080 is only
likely to come out if psql is working in UTF8 encoding. In particular,
in LATIN2 it is *guaranteed* to think that 0x80 is a displayable
character, because wchar.c will tell it so (look at pg_latin1_dsplen).
So plpython_unicode.out is in fact assuming UTF8 encoding is used.
The results from the _moth buildfarm machines suggest that the
prevailing locale is something Windows-ish, or maybe that's just an
artifact introduced somewhere between the actual test and the web page.

I am inclined to think that we should add another expected file
showing the single-byte \200 result. What that might get displayed as
on the local system isn't really our concern. Alternatively, maybe we
should change pg_latin1_dsplen so that it reports 0x80-0x9F as control
characters; but that would have consequences far beyond this one
regression test.

regards, tom lane

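To make that branching concrete, here is an illustrative model of the
display decision; this is not the actual mbprint.c code, and the
function below is invented purely for exposition:

# Hypothetical sketch of psql's choice for the test's U+0080 character.
# Under a UTF8 client encoding, psql sees the code point, decides it is
# not printable, and escapes it; under LATIN2, pg_latin1_dsplen reports
# every byte as displayable, so the raw 0x80 byte passes through.
def psql_display(client_encoding):
    if client_encoding == 'UTF8':
        return '\\u%04x' % 0x80    # -> the string "\u0080"
    else:                          # LATIN2 and similar single-byte cases
        return b'\x80'             # -> the raw byte, shown as \200

print(psql_display('UTF8'))      # \u0080
print(psql_display('LATIN2'))    # b'\x80'
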
On Sat, 2009-10-03 at 12:20 -0400, Tom Lane wrote:
> I am inclined to think that we should add another expected file
> showing the single-byte \200 result. What that might get displayed
> as on the local system isn't really our concern.

OK, the reason I couldn't reproduce this for the life of me is that I
had PGCLIENTENCODING=UTF8 in the environment of the server(!). Once I
unset that, I could reproduce the problem.

This could be made a bit more well-defined if we ran pg_regress with
--multibyte=something, although that is then liable to fail in
encodings that don't have an equivalent of \u0080. Same with your
suggestion above: it will only work for some encodings.

Peter Eisentraut <peter_e@gmx.net> writes:
> OK, the reason I couldn't reproduce this for the life of me is that I
> had PGCLIENTENCODING=UTF8 in the environment of the server(!). Once I
> unset that, I could reproduce the problem. This could be made a bit
> more well-defined if we ran pg_regress with --multibyte=something,
> although that is then liable to fail in encodings that don't have an
> equivalent of \u0080. Same with your suggestion above: it will only
> work for some encodings.

I'm back to wondering why we need a regression test for this at all.
Wouldn't it be just as useful to be testing a character code that is
well-defined everywhere? Or just drop this test altogether? It's
already got way too many expected files for my taste.

regards, tom lane

On Sat, 2009-10-03 at 13:40 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > OK, the reason I couldn't reproduce this for the life of me is that I
> > had PGCLIENTENCODING=UTF8 in the environment of the server(!). Once I
> > unset that, I could reproduce the problem.
>
> I'm back to wondering why we need a regression test for this at all.
> Wouldn't it be just as useful to be testing a character code that
> is well-defined everywhere? Or just drop this test altogether?
> It's already got way too many expected files for my taste.

Note that I didn't write this test; it has been there for ages. It
used to prove that you couldn't process non-ASCII Unicode characters
in PL/Python at all (for some value of "at all" ...), and after I
implemented Unicode support it now shows that you can. So it has
served a real purpose, and changing it to use an ASCII character code
(which is presumably the only thing that is "well-defined everywhere")
wouldn't have done the same thing. (In that case I probably would have
had to write the test case myself.)

I understand the annoyance, but I think we do need to have an
organized way to test non-ASCII data, and in particular UTF8 data,
because there are an increasing number of special code paths for
those. Perhaps we could have a naming convention for test files like
testname.utf8.sql, so that they only get run in the appropriate
environment. Any scheme like that has the disadvantage, however, that
the proper rejection of non-ASCII data in ASCII environments isn't
tested. (That's what all these alternative result files for the
plpython_unicode test are for, by the way.)

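A minimal sketch of how a test driver could apply that naming
convention; this is entirely hypothetical (pg_regress has no such rule,
and select_tests() is an invented helper):

import os

def select_tests(test_dir, client_encoding):
    # Hypothetical rule: "testname.ENCODING.sql" runs only when the
    # client encoding matches; plain "testname.sql" runs always.
    selected = []
    for name in sorted(os.listdir(test_dir)):
        if not name.endswith('.sql'):
            continue
        parts = name.split('.')
        if len(parts) == 3 and parts[1].lower() != client_encoding.lower():
            continue
        selected.append(name)
    return selected

# e.g., plpython_unicode.utf8.sql would be selected only when running
# with client encoding "utf8".
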
Peter Eisentraut <peter_e@gmx.net> writes:
> I understand the annoyance, but I think we do need to have an
> organized way to test non-ASCII data, and in particular UTF8 data,
> because there are an increasing number of special code paths for
> those.

Well, if you want to keep the test, we should put in the variant with
\200, because it is now clear that that is in fact the right answer
in a nontrivial number of environments (arguably *more* cases than
those in which "\u0080" is correct).

regards, tom lane

On Sun, 2009-10-04 at 10:48 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > I understand the annoyance, but I think we do need to have an
> > organized way to test non-ASCII data, and in particular UTF8 data,
> > because there are an increasing number of special code paths for
> > those.
>
> Well, if you want to keep the test, we should put in the variant with
> \200, because it is now clear that that is in fact the right answer
> in a nontrivial number of environments (arguably *more* cases than
> those in which "\u0080" is correct).

I put in a new variant file. Let's see if it works.

On Thu, 2009-10-15 at 00:43 +0300, Peter Eisentraut wrote:
> On Sun, 2009-10-04 at 10:48 -0400, Tom Lane wrote:
> > Well, if you want to keep the test, we should put in the variant with
> > \200, because it is now clear that that is in fact the right answer
> > in a nontrivial number of environments (arguably *more* cases than
> > those in which "\u0080" is correct).
>
> I put in a new variant file. Let's see if it works.

[http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/pl/plpython/expected/plpython_unicode_0.out]

Actually, what I committed was really the output I got. Now with your
commit my tests started failing again. The difference turns out to be
caused by glibc: when you print an invalid UTF-8 byte sequence using
"%.*s" while LC_CTYPE is a UTF-8 locale (e.g., en_US.utf8), glibc
prints nothing. Presumably it gets confused counting the characters
for the precision. Test program:

#include <locale.h>
#include <stdio.h>

int
main()
{
	/* Adopt LC_CTYPE etc. from the environment. */
	setlocale(LC_ALL, "");

	/*
	 * A lone 0x80 byte is an invalid UTF-8 sequence; with a
	 * precision on %s, glibc emits nothing for it in a UTF-8 locale.
	 */
	printf("%.*s", 1, "\200");

	return 0;
}

This prints nothing (check with od) when LC_CTYPE is en_US.utf8. I
think this can be filed under trouble caused by mismatching LC_CTYPE
and client encoding and doesn't need further fixing, but it's good to
keep in mind.

Let's see what the Solaris builds say now.

Peter Eisentraut <peter_e@gmx.net> writes:
> Actually, what I committed was really the output I got. Now with your
> commit my tests started failing again.

Huh --- what I committed is what I got on a Fedora 11 machine. Maybe
we need both variants?

> Let's see what the Solaris builds say now.

We'll know for sure in a couple of hours, but it looks to me like
their results are matching mine.

regards, tom lane

On Fri, 2009-10-16 at 15:14 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Actually, what I committed was really the output I got. Now with your
> > commit my tests started failing again.
>
> Huh --- what I committed is what I got on a Fedora 11 machine. Maybe
> we need both variants?

It depends on what LC_CTYPE is set to on the client side.

Peter Eisentraut <peter_e@gmx.net> writes:
> On Fri, 2009-10-16 at 15:14 -0400, Tom Lane wrote:
>> Huh --- what I committed is what I got on a Fedora 11 machine. Maybe
>> we need both variants?

> It depends on what LC_CTYPE is set to on the client side.

I was testing the same case as the problematic Solaris tests, i.e.,
LANG=cs_CZ.iso88592 ... [ thinks ... ] although I don't remember if
psql was seeing that value too. I might've just initdb'd with that and
then run psql in my usual C locale.

regards, tom lane