Thread: dikkop seems unhappy because of openssl stuff (FreeBSD 14-BETA1)
Hi,

it seems dikkop is unhappy again, this time because of some OpenSSL stuff. I'm not sure it's our problem - it might be an issue with the other packages, or maybe something FreeBSD-specific, not sure.

We did some investigation of an unrelated issue on dikkop about a month ago [1], so it wasn't doing/reporting the buildfarm stuff for a while. After that I had to power off and move the machine, and unfortunately it didn't boot after that - it's an rpi4, so maybe the SD card got damaged or something, not sure.

I used the opportunity to install the new 14-BETA1 (instead of the 14-current snapshot), but unfortunately it started having issues :-(

Both 11 and 12 failed with weird openssl segfaults in plpython tests, see [2] and [3]. And 13 is stuck in some openssl stuff in plpython tests, with 100% CPU usage (for ~30h now):

#0 0x00000000850e86c0 in OPENSSL_sk_insert () from /usr/local/lib/libcrypto.so.11
#1 0x00000000850a5848 in CRYPTO_set_ex_data () from /usr/local/lib/libcrypto.so.11
...

Full backtrace attached. I'm not sure what could possibly be causing this, except maybe something in FreeBSD? Or maybe there's some confusion about libraries? No idea.

The system is entirely new, there's only a handful of packages installed (full list attached), and I don't think I did anything strange or much different from the previous 14-current install.

Any ideas what might be causing this?

regards

[1] https://www.postgresql.org/message-id/b2bc5c16-899e-ca99-26ed-e623b4259ec7%40enterprisedb.com
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop&dt=2023-09-16%2021%3A10%3A45
[3] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop&dt=2023-09-17%2000%3A01%3A42

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> it seems dikkop is unhappy again, this time because of some OpenSSL
> stuff. I'm not sure it's our problem - it might be issues with the other
> packages, or maybe something FreeBSD specific, not sure.
> ...
> Both 11 and 12 failed with weird openssl segfaults in plpython tests,
> see [2] and [3]. And 13 is stuck in some openssl stuff in plpython
> tests, with 100% CPU usage (for ~30h now):

Even weirder, its latest REL_11 run got past that, and instead failed in pltcl [1]. I suppose in an hour or two we'll know if v12 also changed behavior.

The pltcl test case that is failing is annotated

-- Test usage of Tcl's "clock" command. In recent Tcl versions this
-- command fails without working "unknown" support, so it's a good canary
-- for initialization problems.

which is mighty suggestive, but I'm not sure what to look at exactly. Perhaps apply "ldd" or local equivalent to those languages' .so files and see if they link to the same versions of indirectly-required libraries as Postgres is linking to?

regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop&dt=2023-09-18%2013%3A59%3A40
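The "ldd" comparison suggested above could be scripted roughly as follows. This is only a sketch in Python: the helper names (parse_ldd, conflicting_libs) and the two sample ldd outputs are made up for illustration, and the sample paths are not taken from dikkop. It reports libraries that two binaries both depend on but resolve to different files.

```python
import re

# Matches dependency lines such as:
#   libcrypto.so.11 => /usr/local/lib/libcrypto.so.11 (0x850a0000)
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)")


def parse_ldd(output):
    """Map each library's base name (e.g. 'libcrypto') to its resolved path."""
    libs = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            soname, path = match.groups()
            libs[soname.split(".so")[0]] = path
    return libs


def conflicting_libs(first, second):
    """Return libraries used by both outputs but resolved to different files."""
    a, b = parse_ldd(first), parse_ldd(second)
    return {name: (a[name], b[name])
            for name in sorted(a.keys() & b.keys())
            if a[name] != b[name]}


# Hypothetical ldd output, invented for this example only.
POSTGRES_LDD = """\
postgres:
\tlibssl.so.30 => /usr/lib/libssl.so.30 (0x1000)
\tlibcrypto.so.30 => /lib/libcrypto.so.30 (0x2000)
\tlibc.so.7 => /lib/libc.so.7 (0x3000)
"""
OTHER_LDD = """\
some_module.so:
\tlibcrypto.so.11 => /usr/local/lib/libcrypto.so.11 (0x4000)
\tlibc.so.7 => /lib/libc.so.7 (0x3000)
"""

if __name__ == "__main__":
    for name, paths in conflicting_libs(POSTGRES_LDD, OTHER_LDD).items():
        print(name, paths)
```

In practice the two strings would come from running ldd on the postgres binary and on the language's shared object; anything the function reports is a candidate for the kind of mixed-library confusion discussed in this thread.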
IDK, but I tried installing tcl87 as you showed in packages.txt, and REL_11_STABLE said:

checking for tclsh... no
checking for tcl... no
checking for tclsh8.6... no
checking for tclsh86... no
checking for tclsh8.5... no
checking for tclsh85... no
checking for tclsh8.4... no
checking for tclsh84... no
configure: error: Tcl shell not found

It seems like our configure stuff knows only about older tcl, so how did you get past that?

The other thing that springs to mind, without any particular theory, is that FreeBSD 14 switched to OpenSSL 3 (but hadn't done so yet in your old current snapshot).
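To see what that search would find on a given box, something like the following Python sketch could be used. It only mimics the order of names shown in the configure output above; it is not PostgreSQL's configure logic, and find_tclsh is a made-up helper name.

```python
import shutil

# Names taken from the configure output quoted above, in the order shown.
CANDIDATES = ["tclsh", "tcl", "tclsh8.6", "tclsh86",
              "tclsh8.5", "tclsh85", "tclsh8.4", "tclsh84"]


def find_tclsh(candidates=CANDIDATES, which=shutil.which):
    """Return (name, path) for the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path:
            return name, path
    return None


if __name__ == "__main__":
    found = find_tclsh()
    print(found if found else "Tcl shell not found")
    # A shell installed only as tclsh8.7 is not in the list above,
    # so look for it separately.
    print("tclsh8.7:", shutil.which("tclsh8.7"))
```

If only tclsh8.7 shows up, the search above comes back empty, which matches the "Tcl shell not found" error.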
On Mon, Sep 18, 2023 at 03:11:27PM +0200, Tomas Vondra wrote: > Both 11 and 12 failed with a weird openssl segfaults in plpython tests, > see [2] and [3]. And 13 is stuck in some openssl stuff in plpython > tests, with 100% CPU usage (for ~30h now): > > #0 0x00000000850e86c0 in OPENSSL_sk_insert () > from /usr/local/lib/libcrypto.so.11 > #1 0x00000000850a5848 in CRYPTO_set_ex_data () > from /usr/local/lib/libcrypto.so.11 > ... > > Full backtrace attached. I'm not sure what could possibly be causing > this, except maybe something in FreeBSD? Or maybe there's some confusion > about libraries? No idea. FWIW, I've seen such corrupted and time-sensitive stacks in the past in the plpython tests in builds when python linked to a SSL library different than what's linked with the backend. So that smells like a packaging issue to me. -- Michael
On Tue, Sep 19, 2023 at 2:04 PM Michael Paquier <michael@paquier.xyz> wrote: > On Mon, Sep 18, 2023 at 03:11:27PM +0200, Tomas Vondra wrote: > > Both 11 and 12 failed with a weird openssl segfaults in plpython tests, > > see [2] and [3]. And 13 is stuck in some openssl stuff in plpython > > tests, with 100% CPU usage (for ~30h now): > > > > #0 0x00000000850e86c0 in OPENSSL_sk_insert () > > from /usr/local/lib/libcrypto.so.11 > > #1 0x00000000850a5848 in CRYPTO_set_ex_data () > > from /usr/local/lib/libcrypto.so.11 > > ... > > > > Full backtrace attached. I'm not sure what could possibly be causing > > this, except maybe something in FreeBSD? Or maybe there's some confusion > > about libraries? No idea. > > FWIW, I've seen such corrupted and time-sensitive stacks in the past > in the plpython tests in builds when python linked to a SSL library > different than what's linked with the backend. So that smells like a > packaging issue to me. Could it be confusion due to the presence of OpenSSL 3.0 in the FreeBSD base system (/usr/include, /usr/lib) combined with the presence of OpenSSL 1.1.1 installed with "pkg install openssl" (/usr/local/include, /usr/local/lib)? Tomas, does it help if you "pkg remove openssl"?
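One cheap data point for the mixed-OpenSSL theory is which OpenSSL the installed Python itself reports. A minimal sketch using only the standard ssl module (the helper name openssl_major_minor is made up here); it says nothing about what the Postgres backend is linked against, so it covers only one side of the comparison.

```python
import ssl


def openssl_major_minor(version_info=ssl.OPENSSL_VERSION_INFO):
    """Return (major, minor) of the OpenSSL library used by Python's ssl module."""
    return version_info[0], version_info[1]


if __name__ == "__main__":
    # Full version string of the library the interpreter loaded.
    print(ssl.OPENSSL_VERSION)
    print("major.minor: %d.%d" % openssl_major_minor())
```

A 1.1 result on a system whose base libraries are 3.0 would be consistent with the confusion suspected above.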
On 9/19/23 04:25, Thomas Munro wrote:
> Could it be confusion due to the presence of OpenSSL 3.0 in the
> FreeBSD base system (/usr/include, /usr/lib) combined with the
> presence of OpenSSL 1.1.1 installed with "pkg install openssl"
> (/usr/local/include, /usr/local/lib)? Tomas, does it help if you "pkg
> remove openssl"?

Oh! That might be it - I didn't realize FreeBSD already includes openssl 3.0 in the base system, so perhaps installing 1.1.1v leads to some serious confusion ...

After some off-list discussion with Alvaro I tried removing the 1.1.1v package and installed the openssl31 package, which apparently resolved this (at which point it ran into the unrelated tcl issue).

Still, this confusion seems rather unexpected, and I'm not sure if having both 3.0 (from base) and 3.1 (from package) could lead to the same confusion / crashes. Not sure if it's "our" problem ...

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 9/18/23 20:52, Tom Lane wrote:
> Even weirder, its latest REL_11 run got past that, and instead failed
> in pltcl [1]. I suppose in an hour or two we'll know if v12 also
> changed behavior.

Oh, yeah. Sorry for not mentioning this yesterday ...

I tried removing openssl-1.1.1v and installing 3.1 instead, which apparently allowed it to pass the plpython tests. I guess it's due to some sort of confusion with the openssl-3.0 included in FreeBSD base (which I didn't realize is there).

> The pltcl test case that is failing is annotated
>
> -- Test usage of Tcl's "clock" command. In recent Tcl versions this
> -- command fails without working "unknown" support, so it's a good canary
> -- for initialization problems.
>
> which is mighty suggestive, but I'm not sure what to look at exactly.
> Perhaps apply "ldd" or local equivalent to those languages' .so files
> and see if they link to the same versions of indirectly-required
> libraries as Postgres is linking to?

I have no experience with tcl, but I tried this in the two tclsh versions installed on the system (8.6 and 8.7):

bsd@freebsd:~ $ tclsh8.7
% clock scan "1/26/2010"
time value too large/small to represent

bsd@freebsd:~ $ tclsh8.6
% clock scan "1/26/2010"
time value too large/small to represent

AFAIK this is what the tcl_date_week(2010,1,26) translates to.

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> I have no experience with tcl, but I tried this in the two tclsh
> versions installed on the system (8.6 and 8.7):

> bsd@freebsd:~ $ tclsh8.7
> % clock scan "1/26/2010"
> time value too large/small to represent

> bsd@freebsd:~ $ tclsh8.6
> % clock scan "1/26/2010"
> time value too large/small to represent

> AFAIK this is what the tcl_date_week(2010,1,26) translates to.

Oh, interesting. On my FreeBSD 13.1 arm64 system, it works:

$ tclsh8.6
% clock scan "1/26/2010"
1264482000

I am now suspicious that there's some locale effect that we have not observed before (though why not?). What is the result of the "locale" command on your box? Mine gives

$ locale
LANG=C.UTF-8
LC_CTYPE="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_TIME="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=

regards, tom lane
On 9/19/23 18:45, Tom Lane wrote:
> Oh, interesting. On my FreeBSD 13.1 arm64 system, it works:
>
> $ tclsh8.6
> % clock scan "1/26/2010"
> 1264482000
>
> I am now suspicious that there's some locale effect that we have
> not observed before (though why not?). What is the result of
> the "locale" command on your box? Mine gives
>
> $ locale
> LANG=C.UTF-8
> LC_CTYPE="C.UTF-8"
> LC_COLLATE="C.UTF-8"
> LC_TIME="C.UTF-8"
> LC_NUMERIC="C.UTF-8"
> LC_MONETARY="C.UTF-8"
> LC_MESSAGES="C.UTF-8"
> LC_ALL=

bsd@freebsd:~ $ locale
LANG=C.UTF-8
LC_CTYPE="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_TIME="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=

bsd@freebsd:~ $ tclsh8.6
% clock scan "1/26/2010"
time value too large/small to represent

However, I wonder if there's something wrong with tcl itself, considering this:

% clock format 1360558800 -format %D
02/11/2013
% clock scan 02/11/2013 -format %D
time value too large/small to represent

That's a bit strange - it seems tcl can format a timestamp, but then can't read it back in for some reason ...

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> bsd@freebsd:~ $ tclsh8.6
> % clock scan "1/26/2010"
> time value too large/small to represent

In hopes of replicating this, I tried installing FreeBSD 14-BETA2 aarch64 on my Pi 3B. This test case works fine:

$ tclsh8.6
% clock scan "1/26/2010"
1264482000

$ tclsh8.7
% clock scan "1/26/2010"
1264482000

and unsurprisingly, pltcl's regression tests pass. I surmise that something is broken in BETA1 that they fixed in BETA2.

plpython works too, with the python 3.9 package (and no older python).

However, all is not peachy, because plperl doesn't work. Trying to CREATE EXTENSION either plperl or plperlu leads to a libperl panic:

pl_regression=# create extension plperl;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

with this in the postmaster log:

panic: pthread_key_create failed

That message is certainly not ours, so it must be coming out of libperl.

Another thing that seemed strange is that ecpg's preproc.o takes O(forever) to compile. I killed the build after observing that the compiler had gotten to 40 minutes of CPU time, and redid that step with PROFILE=-O0, which allowed it to compile in 20 seconds or so. (I also tried -O1, but gave up after a few minutes.) This machine can compile the main backend grammar in a minute or two, so there is something very odd there.

I'm coming to the conclusion that 14-BETA is, well, beta grade. I'll be interested to see if you get the same results when you update to BETA2.

regards, tom lane
On 9/20/23 01:24, Tom Lane wrote:
> I'm coming to the conclusion that 14-BETA is, well, beta grade.
> I'll be interested to see if you get the same results when you
> update to BETA2.

Thanks, I'll try that when I'm at the office next week.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 9/20/23 19:59, Tomas Vondra wrote:
> On 9/20/23 01:24, Tom Lane wrote:
>> However, all is not peachy, because plperl doesn't work.
>> Trying to CREATE EXTENSION either plperl or plperlu leads
>> to a libperl panic:
>> ...
>> I'll be interested to see if you get the same results when you
>> update to BETA2.
>
> Thanks, I'll try that when I'll be at the office next week.

FWIW when I disabled tcl, the tests pass (it's running with --nostatus --nosend, so it's not visible on the buildfarm site). Including the plperl stuff:

============== running regression test queries ==============
test plperl                       ... ok          397 ms
test plperl_lc                    ... ok          152 ms
test plperl_trigger               ... ok          374 ms
test plperl_shared                ... ok          163 ms
test plperl_elog                  ... ok          184 ms
test plperl_util                  ... ok          210 ms
test plperl_init                  ... ok          150 ms
test plperlu                      ... ok          117 ms
test plperl_array                 ... ok          228 ms
test plperl_call                  ... ok          189 ms
test plperl_transaction           ... ok          412 ms
test plperl_plperlu               ... ok          238 ms

======================
 All 12 tests passed.
======================

I wonder if this got broken between BETA1 and BETA2.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 9/20/23 20:09, Tomas Vondra wrote:
> On 9/20/23 01:24, Tom Lane wrote:
>> and unsurprisingly, pltcl's regression tests pass. I surmise
>> that something is broken in BETA1 that they fixed in BETA2.
>
> FWIW when I disabled tcl, the tests pass (it's running with --nostatus
> --nosend, so it's not visible on the buildfarm site). Including the
> plperl stuff:
> ...
> I wonder if this got broken between BETA1 and BETA2.

Hmmm, I got to install BETA2 yesterday, but I still see the tcl failure:

 select tcl_date_week(2010,1,26);
- tcl_date_week
----------------
- 04
-(1 row)
-
+ERROR: time value too large/small to represent
+CONTEXT: time value too large/small to represent
+    while executing
+"ConvertLocalToUTC $date[set date {}] $TZData($timezone) 2361222"
+    (procedure "FreeScan" line 86)
+    invoked from within
+"FreeScan $string $base $timezone $locale"
+    (procedure "::tcl::clock::scan" line 68)
+    invoked from within
+"::tcl::clock::scan 1/26/2010"
+    ("uplevel" body line 1)
+    invoked from within
+"uplevel 1 [info level 0]"
+    (procedure "::tcl::clock::scan" line 4)
+    invoked from within
+"clock scan "$2/$3/$1""
+    (procedure "__PLTcl_proc_55335" line 3)
+    invoked from within
+"__PLTcl_proc_55335 2010 1 26"
+in PL/Tcl function "tcl_date_week"
 select tcl_date_week(2001,10,24);

I wonder what's the difference between the systems ... All I did was write the BETA2 image to the SD card and install a couple of packages:

pkg install xml2c libxslt gettext-tools ccache tcl tcl87 \
    p5-Test-Harness p5-IPC-Run gmake htop bash screen \
    python tcl86 nano p5-Test-LWP-UserAgent \
    p5-LWP-Protocol-https

And then

perl ./run_branches.pl --run-all --nosend --nostatus --verbose

with the buildfarm config used by dikkop.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> Hmmm, I got to install BETA2 yesterday, but I still see the tcl failure:

Huh. I'm baffled as to what's up there. Is it possible that this is actually a hardware-based difference? I didn't think there was much difference between Pi 3B and Pi 4, but we're running out of other explanations.

> I wonder what's the difference between the systems ... All I did was
> writing the BETA2 image to SD card, and install a couple packages:

I reinstalled BETA3, since that's out now, but see no change in behavior.

I did discover that plperl works for me after adding --with-openssl to the configure options. Not sure if it's worth digging any further than that.

regards, tom lane
On 9/26/23 23:50, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>> Hmmm, I got to install BETA2 yesterday, but I still see the tcl failure:
>
> Huh. I'm baffled as to what's up there. Is it possible that this is
> actually a hardware-based difference? I didn't think there was much
> difference between Pi 3B and Pi 4, but we're running out of other
> explanations.

Hmm, yeah. Which FreeBSD image did you install? armv7 or aarch64?

>> I wonder what's the difference between the systems ... All I did was
>> writing the BETA2 image to SD card, and install a couple packages:
>
> I reinstalled BETA3, since that's out now, but see no change in
> behavior.
>
> I did discover that plperl works for me after adding --with-openssl
> to the configure options. Not sure if it's worth digging any further
> than that.

No idea. Seems broken, but no time to investigate further at the moment.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes: > On 9/26/23 23:50, Tom Lane wrote: >> Huh. I'm baffled as to what's up there. Is it possible that this is >> actually a hardware-based difference? I didn't think there was much >> difference between Pi 3B and Pi 4, but we're running out of other >> explanations. > Hmm, yeah. Which FreeBSD image did you install? armv7 or aarch64? https://download.freebsd.org/releases/arm64/aarch64/ISO-IMAGES/14.0/FreeBSD-14.0-BETA3-arm64-aarch64-RPI.img.xz regards, tom lane
On 9/27/23 15:38, Tom Lane wrote: > Tomas Vondra <tomas.vondra@enterprisedb.com> writes: >> On 9/26/23 23:50, Tom Lane wrote: >>> Huh. I'm baffled as to what's up there. Is it possible that this is >>> actually a hardware-based difference? I didn't think there was much >>> difference between Pi 3B and Pi 4, but we're running out of other >>> explanations. > >> Hmm, yeah. Which FreeBSD image did you install? armv7 or aarch64? > > https://download.freebsd.org/releases/arm64/aarch64/ISO-IMAGES/14.0/FreeBSD-14.0-BETA3-arm64-aarch64-RPI.img.xz > Thanks, that's the image I've used. This is really strange ... regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes: > On 9/27/23 15:38, Tom Lane wrote: >> Tomas Vondra <tomas.vondra@enterprisedb.com> writes: >>> Hmm, yeah. Which FreeBSD image did you install? armv7 or aarch64? >> https://download.freebsd.org/releases/arm64/aarch64/ISO-IMAGES/14.0/FreeBSD-14.0-BETA3-arm64-aarch64-RPI.img.xz > Thanks, that's the image I've used. This is really strange ... I've now laid my hands on a Pi 4B, and with that exact same SD card plugged in, I get the same results I did with the 3B+: pltcl regression tests pass, and so does the manual check with tclsh8.[67]. So it seems like the "different CPU" theory doesn't survive contact with reality either. I'm completely baffled, but I do notice that "clock scan" without a -format option is deprecated according to the Tcl man page. Maybe we should stop relying on deprecated behavior and put in a -format option? regards, tom lane
Does the image lack a /etc/localtime file/link, but perhaps one of you did something to create it? This came up with the CI image: https://www.postgresql.org/message-id/flat/20230731191510.pebqeiuo2sbmlcfh%40awork3.anarazel.de Also mentioned at: https://wiki.tcl-lang.org/page/clock+scan
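A quick way to tell which state a machine is in, as a sketch in Python using only the standard library. The helper name timezone_sources is made up for illustration; it just reports whether TZ is set and whether /etc/localtime exists.

```python
import os


def timezone_sources(localtime_path="/etc/localtime", environ=os.environ):
    """Report the two usual sources of the local timezone on a Unix system."""
    is_link = os.path.islink(localtime_path)
    return {
        # TZ, when set, overrides the system-wide default.
        "TZ": environ.get("TZ"),
        # Missing on images where no timezone was ever selected.
        "localtime_exists": os.path.exists(localtime_path),
        # Where the symlink points, if it is a symlink at all.
        "localtime_target": os.path.realpath(localtime_path) if is_link else None,
    }


if __name__ == "__main__":
    for key, value in timezone_sources().items():
        print(key, "=", value)
```

A result with no TZ and no /etc/localtime corresponds to the situation asked about above.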
Thomas Munro <thomas.munro@gmail.com> writes: > Does the image lack a /etc/localtime file/link, but perhaps one of you > did something to create it? Hah! I thought it had to be some sort of locale effect, but I failed to think of that as a contributor :-(. My installation does have /etc/localtime, and removing it duplicates Tomas' syndrome. I also find that if I add "-gmt 1" to the clock invocation, it's happy with or without /etc/localtime. So I think we should modify the test case to use that to reduce its environmental sensitivity. Will go make it so. regards, tom lane
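To illustrate why a free-form date scan depends on the machine's timezone setup, here is the same arithmetic in Python rather than Tcl. The -5 hour offset is an assumption, inferred only from the 1264482000 value quoted earlier in the thread, and midnight_epoch is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone


def midnight_epoch(year, month, day, utc_offset_hours):
    """Seconds since the Unix epoch for midnight of the given date at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return int(datetime(year, month, day, tzinfo=tz).timestamp())


if __name__ == "__main__":
    # Midnight interpreted as UTC, independent of any local timezone data.
    print(midnight_epoch(2010, 1, 26, 0))    # 1264464000
    # Midnight interpreted five hours behind UTC (assumed offset).
    print(midnight_epoch(2010, 1, 26, -5))   # 1264482000
```

The second value matches the working "clock scan" output shown earlier, so that result already had a local timezone baked into it. Pinning the conversion to UTC, which is what "-gmt 1" does on the Tcl side, removes that dependency.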
Thomas Munro <thomas.munro@gmail.com> writes: > This came up with the CI image: > https://www.postgresql.org/message-id/flat/20230731191510.pebqeiuo2sbmlcfh%40awork3.anarazel.de BTW, after re-reading that thread, I think the significant difference is that these FreeBSD images don't force you to select a timezone during setup, unlike what I recall seeing when installing x86_64 FreeBSD. You're not forced to run bsdconfig at all, and even if you do it doesn't make you enter the sub-menu where you can pick a timezone. I recall that I did do that while setting mine up, but I'll bet Tomas skipped it. I'm not sure at this point whether FreeBSD changed behavior since 13.x, or this is a difference between their preferred installation processes for x86 vs. ARM. But in any case, it's clearly easier to get into the no-/etc/localtime state with these systems than I thought before. regards, tom lane
On 9/30/23 01:57, Tom Lane wrote: > Thomas Munro <thomas.munro@gmail.com> writes: >> Does the image lack a /etc/localtime file/link, but perhaps one of you >> did something to create it? > > Hah! I thought it had to be some sort of locale effect, but I failed > to think of that as a contributor :-(. My installation does have > /etc/localtime, and removing it duplicates Tomas' syndrome. > > I also find that if I add "-gmt 1" to the clock invocation, it's happy > with or without /etc/localtime. So I think we should modify the test > case to use that to reduce its environmental sensitivity. Will > go make it so. > FWIW I've defined the timezone (copying it into /etc/localtime), and that seems to have resolved the issue (well, maybe it's the "-gmt 1" tweak, not sure). I wonder how come it worked with the earlier image - I don't recall defining the timezone (AFAIK I only did the bare minimum to get it working), but maybe I did. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company