Discussion: PostgreSQL for VAX on NetBSD/OpenBSD
Hello VAX Enthusiasts: PostgreSQL currently has some very minimal code to support the VAX architecture. We have never supported OpenVMS, so this code would only be used if someone were to compile PostgreSQL for VAX on an operating system that we *do* support, such as NetBSD or OpenBSD. However, we don't know of anyone who has tried to do this in a very long time, and are therefore considering removing the remaining support for the VAX platform. Has anyone tried to build PostgreSQL for VAX lately? If so, did it compile? Did you have to use --disable-spinlocks to get it to compile? If it did compile, can you actually run it, and does it pass the regression tests and work as expected? Would you be willing to work with the PostgreSQL to ensure continuing support for this platform, or does that seem not worthwhile for whatever reason? Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jun 23, 2014 at 3:09 PM, Robert Haas <robertmhaas@gmail.com> wrote: > However, we don't know of anyone who has tried to do this in a very > long time, and are therefore considering removing the remaining > support for the VAX platform. Has anyone tried to build PostgreSQL > for VAX lately? Actually I tried a while ago but got stuck configuring the network on simh so I could get all the tools. I can try again if there's interest but we don't necessarily need to keep a port just because there's a simulator for it. -- greg
On Mon, Jun 23, 2014 at 6:58 PM, Greg Stark <stark@mit.edu> wrote: > On Mon, Jun 23, 2014 at 3:09 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> However, we don't know of anyone who has tried to do this in a very >> long time, and are therefore considering removing the remaining >> support for the VAX platform. Has anyone tried to build PostgreSQL >> for VAX lately? > > Actually I tried a while ago but got stuck configuring the network on > simh so I could get all the tools. I can try again if there's interest > but we don't necessarily need to keep a port just because there's a > simulator for it. That's really up to you. I'm not particularly interested in generating interest in maintaining this port if there wouldn't otherwise be any; I'm trying to figure out whether there is existing interest in it. For all I know, <whatever>BSD is shipping PostgreSQL binaries for VAX and every other platform they support in each new release and people are using them to get real work done. Then again, for all I know, it doesn't even compile on that platform, and if you did manage to get it to compile it wouldn't fit on the disk, and if you managed to fit it on the disk it wouldn't work because key system calls aren't supported. If someone is still interested in this, I'm hoping they'll help us figure out whether it's anywhere close to working, and maybe even contribute a buildfarm critter. If no one cares, then let's just rip it out and be done with it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Jun 24, 2014 at 7:45 AM, Sebastian Reitenbach <sebastia@l00-bugdead-prods.de> wrote: >> I'm building the vax packages for openbsd. What I can tell is that >> for 5.5 no postgresql packages were built. But that may be that >> due to the recent upgrade from gcc 2.95 to 3.3. >> I guess that not all dependencies to actually build postgresql >> are available for the vax, or may build successfully there. But I need >> to verify. Might need a few days, since I'm currently on vacation, >> with sparse Internet connectivity. ;) > > OK, that was easy: > > $ cd /usr/ports/databases/postgresql > $ make install > ===> postgresql-client-9.3.4p0 requires shared libraries . > > OpenBSD VAX is static only, so no postgresql on OpenBSD > VAX before shared libraries will ever be made working on it. Thanks very much; that's useful information. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 06/23/2014 06:58 PM, Greg Stark wrote:
> On Mon, Jun 23, 2014 at 3:09 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> However, we don't know of anyone who has tried to do this in a very
>> long time, and are therefore considering removing the remaining
>> support for the VAX platform.  Has anyone tried to build PostgreSQL
>> for VAX lately?
> 
> Actually I tried a while ago but got stuck configuring the network on
> simh so I could get all the tools. I can try again if there's interest
> but we don't necessarily need to keep a port just because there's a
> simulator for it.
 ...not to mention actual hardware.
              -Dave
-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
			
		On Tuesday, June 24, 2014 03:12 CEST, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Jun 23, 2014 at 6:58 PM, Greg Stark <stark@mit.edu> wrote: > > On Mon, Jun 23, 2014 at 3:09 PM, Robert Haas <robertmhaas@gmail.com> wrote: > >> However, we don't know of anyone who has tried to do this in a very > >> long time, and are therefore considering removing the remaining > >> support for the VAX platform. Has anyone tried to build PostgreSQL > >> for VAX lately? > > > > Actually I tried a while ago but got stuck configuring the network on > > simh so I could get all the tools. I can try again if there's interest > > but we don't necessarily need to keep a port just because there's a > > simulator for it. > > That's really up to you. I'm not particularly interested in > generating interest in maintaining this port if there wouldn't > otherwise be any; I'm trying to figure out whether there is existing > interest in it. For all I know, <whatever>BSD is shipping PostgreSQL > binaries for VAX and every other platform they support in each new > release and people are using them to get real work done. Then again, > for all I know, it doesn't even compile on that platform, and if you > did manage to get it to compile it wouldn't fit on the disk, and if > you managed to fit it on the disk it wouldn't work because key system > calls aren't supported. If someone is still interested in this, I'm > hoping they'll help us figure out whether it's anywhere close to > working, and maybe even contribute a buildfarm critter. If no one > cares, then let's just rip it out and be done with it. > I'm building the vax packages for openbsd. What I can tell is that for 5.5 no postgresql packages were built. But that may be that due to the recent upgrade from gcc 2.95 to 3.3. I guess that not all dependencies to actually build postgresql are available for the vax, or may build successfully there. But I need to verify. 
Might need a few days, since I'm currently on vacation, with sparse Internet connectivity. ;) Sebastian > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
On Tuesday, June 24, 2014 13:37 CEST, "Sebastian Reitenbach" <sebastia@l00-bugdead-prods.de> wrote: > On Tuesday, June 24, 2014 03:12 CEST, Robert Haas <robertmhaas@gmail.com> wrote: > > > On Mon, Jun 23, 2014 at 6:58 PM, Greg Stark <stark@mit.edu> wrote: > > > On Mon, Jun 23, 2014 at 3:09 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > >> However, we don't know of anyone who has tried to do this in a very > > >> long time, and are therefore considering removing the remaining > > >> support for the VAX platform. Has anyone tried to build PostgreSQL > > >> for VAX lately? > > > > > > Actually I tried a while ago but got stuck configuring the network on > > > simh so I could get all the tools. I can try again if there's interest > > > but we don't necessarily need to keep a port just because there's a > > > simulator for it. > > > > That's really up to you. I'm not particularly interested in > > generating interest in maintaining this port if there wouldn't > > otherwise be any; I'm trying to figure out whether there is existing > > interest in it. For all I know, <whatever>BSD is shipping PostgreSQL > > binaries for VAX and every other platform they support in each new > > release and people are using them to get real work done. Then again, > > for all I know, it doesn't even compile on that platform, and if you > > did manage to get it to compile it wouldn't fit on the disk, and if > > you managed to fit it on the disk it wouldn't work because key system > > calls aren't supported. If someone is still interested in this, I'm > > hoping they'll help us figure out whether it's anywhere close to > > working, and maybe even contribute a buildfarm critter. If no one > > cares, then let's just rip it out and be done with it. > > > > I'm building the vax packages for openbsd. What I can tell is that > for 5.5 no postgresql packages were built. But that may be that > due to the recent upgrade from gcc 2.95 to 3.3. 
> I guess that not all dependencies to actually build postgresql > are available for the vax, or may build successfully there. But I need > to verify. Might need a few days, since I'm currently on vacation, > with sparse Internet connectivity. ;) OK, that was easy: $ cd /usr/ports/databases/postgresql $ make install ===> postgresql-client-9.3.4p0 requires shared libraries . OpenBSD VAX is static only, so no postgresql on OpenBSD VAX before shared libraries will ever be made working on it. cheers, Sebastian > > Sebastian > > > > -- > > Robert Haas > > EnterpriseDB: http://www.enterprisedb.com > > The Enterprise PostgreSQL Company >
Well, the latest NetBSD/vax package build doesn't seem to include any PostgreSQL packages (http://ftp.netbsd.org/pub/pkgsrc/packages/NetBSD/vax/6.0_2014Q1/), but I don't know why. I'll try a quick (hah :) build this end to see what happens. David
On 24 June 2014 02:12, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Jun 23, 2014 at 6:58 PM, Greg Stark <stark@mit.edu> wrote: > > On Mon, Jun 23, 2014 at 3:09 PM, Robert Haas <robertmhaas@gmail.com> wrote: > >> However, we don't know of anyone who has tried to do this in a very > >> long time, and are therefore considering removing the remaining > >> support for the VAX platform. Has anyone tried to build PostgreSQL > >> for VAX lately? > > > > Actually I tried a while ago but got stuck configuring the network on > > simh so I could get all the tools. I can try again if there's interest > > but we don't necessarily need to keep a port just because there's a > > simulator for it. > > That's really up to you. I'm not particularly interested in > generating interest in maintaining this port if there wouldn't > otherwise be any; I'm trying to figure out whether there is existing > interest in it. For all I know, <whatever>BSD is shipping PostgreSQL > binaries for VAX and every other platform they support in each new > release and people are using them to get real work done. Then again, > for all I know, it doesn't even compile on that platform, and if you > did manage to get it to compile it wouldn't fit on the disk, and if > you managed to fit it on the disk it wouldn't work because key system > calls aren't supported. If someone is still interested in this, I'm > hoping they'll help us figure out whether it's anywhere close to > working, and maybe even contribute a buildfarm critter. If no one > cares, then let's just rip it out and be done with it. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company
"Sebastian Reitenbach" <sebastia@l00-bugdead-prods.de> writes:
> OK, that was easy:
> $ cd /usr/ports/databases/postgresql                                   
> $ make install
> ===>  postgresql-client-9.3.4p0  requires shared libraries .
> OpenBSD VAX is static only, so no postgresql on OpenBSD
> VAX before shared libraries will ever be made working on it.
Ouch.  We long ago passed the point of no return as far as requiring
shared library support: there's too much backend functionality that's
in separate shared libraries rather than being linked directly into
the core executable.  I doubt anyone will be interested in taking on
the task of supporting a parallel all-static build.
I think this means we can write off VAX on NetBSD/OpenBSD as a viable
platform for Postgres :-(.  I'm sad to hear it, but certainly have
not got the cycles personally to prevent it.
        regards, tom lane
			
		Dave McGuire <mcguire@neurotica.com> writes:
> On 06/24/2014 12:42 PM, Tom Lane wrote:
>> I think this means we can write off VAX on NetBSD/OpenBSD as a viable
>> platform for Postgres :-(.  I'm sad to hear it, but certainly have
>> not got the cycles personally to prevent it.
>   Nonono...NetBSD/vax has had shared library support for many years.
> It's only OpenBSD that has that limitation.
Ah, thanks for the clarification.
        regards, tom lane
			
Tom Lane wrote on 2014-06-24 18:42: > "Sebastian Reitenbach" <sebastia@l00-bugdead-prods.de> writes: >> OK, that was easy: >> $ cd /usr/ports/databases/postgresql >> $ make install >> ===> postgresql-client-9.3.4p0 requires shared libraries . >> OpenBSD VAX is static only, so no postgresql on OpenBSD >> VAX before shared libraries will ever be made working on it. > Ouch. We long ago passed the point of no return as far as requiring > shared library support: there's too much backend functionality that's > in separate shared libraries rather than being linked directly into > the core executable. I doubt anyone will be interested in taking on > the task of supporting a parallel all-static build. > > I think this means we can write off VAX on NetBSD/OpenBSD as a viable > platform for Postgres :-(. I'm sad to hear it, but certainly have > not got the cycles personally to prevent it. > OpenBSD/vax is static only. NetBSD/vax has dynamic libraries. -- Ragge
On Jun 24, 2014, at 9:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think this means we can write off VAX on NetBSD/OpenBSD as a viable > platform for Postgres :-(. I'm sad to hear it, but certainly have > not got the cycles personally to prevent it. Why? NetBSD/vax has supported shared libraries for a long long time.
On Jun 24, 2014, at 12:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Sebastian Reitenbach" <sebastia@l00-bugdead-prods.de> writes: >> OK, that was easy: > >> $ cd /usr/ports/databases/postgresql >> $ make install >> ===> postgresql-client-9.3.4p0 requires shared libraries . > >> OpenBSD VAX is static only, so no postgresql on OpenBSD >> VAX before shared libraries will ever be made working on it. > > Ouch. We long ago passed the point of no return as far as requiring > shared library support: there's too much backend functionality that's > in separate shared libraries rather than being linked directly into > the core executable. I doubt anyone will be interested in taking on > the task of supporting a parallel all-static build. > > I think this means we can write off VAX on NetBSD/OpenBSD as a viable > platform for Postgres :-(. I'm sad to hear it, but certainly have > not got the cycles personally to prevent it. NetBSD and OpenBSD are different systems. I don’t remember if NetBSD supports shared libraries on VAX, but that’s independent of the fact that OpenBSD doesn’t. paul
On 06/24/2014 12:42 PM, Tom Lane wrote:
> "Sebastian Reitenbach" <sebastia@l00-bugdead-prods.de> writes:
>> OK, that was easy:
> 
>> $ cd /usr/ports/databases/postgresql                                   
>> $ make install
>> ===>  postgresql-client-9.3.4p0  requires shared libraries .
> 
>> OpenBSD VAX is static only, so no postgresql on OpenBSD
>> VAX before shared libraries will ever be made working on it.
> 
> Ouch.  We long ago passed the point of no return as far as requiring
> shared library support: there's too much backend functionality that's
> in separate shared libraries rather than being linked directly into
> the core executable.  I doubt anyone will be interested in taking on
> the task of supporting a parallel all-static build.
> 
> I think this means we can write off VAX on NetBSD/OpenBSD as a viable
> platform for Postgres :-(.  I'm sad to hear it, but certainly have
> not got the cycles personally to prevent it.
 Nonono...NetBSD/vax has had shared library support for many years.
It's only OpenBSD that has that limitation.
               -Dave
-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
			
		Dave McGuire wrote: > On 06/24/2014 12:42 PM, Tom Lane wrote: > > I think this means we can write off VAX on NetBSD/OpenBSD as a viable > > platform for Postgres :-(. I'm sad to hear it, but certainly have > > not got the cycles personally to prevent it. > > Nonono...NetBSD/vax has had shared library support for many years. > It's only OpenBSD that has that limitation. So now we know that NetBSD/vax is free of the shared library limitation that plagues OpenBSD, but does Postgres work on NetBSD/vax otherwise? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Jun 24, 2014 at 10:16 PM, John Klos <john@ziaspace.com> wrote: >> Has anyone tried to build PostgreSQL for VAX lately? If so, did it >> compile? Did you have to use --disable-spinlocks to get it to compile? If >> it did compile, can you actually run it, and does it pass the regression >> tests and work as expected? Would you be willing to work with the >> PostgreSQL to ensure continuing support for this platform, or does that seem >> not worthwhile for whatever reason? > > I've compiled postgresql93-client and postgresql93-server from pkgsrc on a > VAX running NetBSD 6.1.4. The initial launch didn't like the default stack > limit: > > /etc/rc.d/pgsql start > Initializing PostgreSQL databases. > LOG: invalid value for parameter "max_stack_depth": 100 > DETAIL: "max_stack_depth" must not exceed 0kB. > HINT: Increase the platform's stack depth limit via "ulimit -s" or local > equivalent. > FATAL: failed to initialize max_stack_depth to 100 > child process exited with exit code 1 > initdb: removing data directory "/usr/local/pgsql/data" > pg_ctl: database system initialization failed > > I unlimited and tried again. The pgsql process showed it was using 146 > megabytes of memory while initializing, then got as far as: > > /etc/rc.d/pgsql start > Initializing PostgreSQL databases. What value did it select for shared_buffers? How much memory does a high-end VAX have? These days, we try to set shared_buffers = 128MB if the platform will support it, but it's supposed to fall back to smaller values if that doesn't work. It will try allocating that much though, at least for a moment, to see whether it can. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
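The fallback behaviour described here (try a large shared_buffers first, then progressively smaller values) can be sketched in shell. This is only an illustration and not initdb's actual code: `pick_shared_buffers`, `demo_probe`, and the candidate sizes are hypothetical names and values chosen for the example, and the probe stands in for whatever real check confirms that the server can start with a given setting.

```shell
#!/bin/sh
# Sketch of a "try big, fall back to smaller" selection loop.
# Not initdb's real implementation; all names here are hypothetical.

# pick_shared_buffers PROBE SIZE_MB...
# Runs PROBE with each candidate size (in MB) in the order given and
# prints the first size that PROBE accepts. Returns 1 if none fit.
pick_shared_buffers() {
    probe=$1
    shift
    for mb in "$@"; do
        if "$probe" "$mb"; then
            printf '%sMB\n' "$mb"
            return 0
        fi
    done
    return 1
}

# Stand-in probe for the demo: pretend only 16 MB or less can be allocated.
# A real probe would try to start the server with that setting.
demo_probe() {
    [ "$1" -le 16 ]
}

pick_shared_buffers demo_probe 128 64 32 16 8
```

With the stand-in probe, the call walks down from 128 and settles on 16MB. On a real machine the interesting question is the one asked above: which value the probe actually accepts.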
On Wed, Jun 25, 2014 at 5:30 AM, John Klos <john@ziaspace.com> wrote: > A high end VAX, such as a 4000 Model 108, can have 512 megs (as can an > 11/780, at least in theory), but most of the VAXen used here are VAXstations > such as the 4000/60 or 4000/90, 90a or 96, which have either 104 megs or 128 > megs. Hmm, not bad for old hardware. > and it launched fine. I then tried to run: > > gmake MAX_CONNECTIONS=5 installcheck > > in > /usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/test/regress, > but it failed with: > > ... > gmake[2]: Leaving directory > '/usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/backend' > gcc -O1 -fgcse -fstrength-reduce -fgcse-after-reload -I/usr/include > -I/usr/local/include -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute > -Wformat-security -fno-strict-aliasing -fwrapv -pthread -mt -D_REENTRANT > -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -I../../src/port -DFRONTEND > -I../../src/include -I/usr/include -I/usr/local/include -c -o thread.o > thread.c > cc1: error: unrecognized command line option "-mt" <builtin>: recipe for > target 'thread.o' failed > gmake[1]: *** [thread.o] Error 1 > gmake[1]: Leaving directory > '/usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/port' > ../../../src/Makefile.global:423: recipe for target 'submake-libpgport' > failed > gmake: *** [submake-libpgport] Error 2 > > That's all I have time for tonight. Is there an easier way to run a > testsuite? I think you're doing it right, but apparently configure is mis-identifying which flags are needed for thread-safety on your platform. It's possible configuring with --disable-thread-safety would help, or you could manually edit the Makefile. In any case I'm coming to the conclusion that there's little point in us keeping the VAX-specific code in our source tree, because in fact, this port is broken and doesn't work. 
Based on your results thus far, I doubt that it would be a huge amount of work to fix that, but unless somebody from the VAX community wants to volunteer to be a PostgreSQL maintainer for that platform, straighten out the things that have gotten broken since this port was originally added, and keep it working on an ongoing basis, it's probably not going to happen. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
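For anyone who prefers the "manually edit the Makefile" route over re-running configure with --disable-thread-safety, the edit amounts to dropping the one flag this gcc rejects. The sketch below shows only that filtering step. `strip_flag` is a hypothetical helper, and the flag list is copied from the failing compile line in the report; which Makefile variable carries these flags in a given build is not stated in the thread, so that part is deliberately left out.

```shell
#!/bin/sh
# strip_flag UNWANTED FLAG...
# Prints the given flags separated by spaces, with every occurrence of
# UNWANTED removed. Hypothetical helper, for illustration only.
strip_flag() {
    unwanted=$1
    shift
    out=""
    for flag in "$@"; do
        if [ "$flag" != "$unwanted" ]; then
            out="${out:+$out }$flag"
        fi
    done
    printf '%s\n' "$out"
}

# Thread-related flags from the failing gcc command line in the report.
strip_flag -mt -pthread -mt -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS
```

The remaining flags could then be pasted back in place of the original list. Whether the build succeeds afterwards is something only a test on the VAX itself can confirm.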
On Wed, Jun 25, 2014 at 1:05 PM, John Klos <john@ziaspace.com> wrote: >> In any case I'm coming to the conclusion that there's little point in >> us keeping the VAX-specific code in our source tree, because in fact, >> this port is broken and doesn't work. Based on your results thus far, >> I doubt that it would be a huge amount of work to fix that, but unless >> somebody from the VAX community wants to volunteer to be a PostgreSQL >> maintainer for that platform, straighten out the things that have >> gotten broken since this port was originally added, and keep it >> working on an ongoing basis, it's probably not going to happen. > > While I wouldn't be surprised if you remove the VAX code because not many > people are going to be running PostgreSQL, I'd disagree with the assessment > that this port is broken. It compiles, it initializes databases, it runs, et > cetera, albeit not with the default postgresql.conf. Well, the fact that initdb didn't produce a working configuration and that make installcheck failed to work properly are bad. But, yeah, it's not totally broken. > I'm actually rather impressed at how well PostgreSQL can be adjusted to > lower memory systems. I deploy a lot of embedded systems with 128 megs (a > lot for an embedded system, but nothing compared with what everyone else > assumes), so I'll be checking out PostgreSQL for other uses. I agree, that's cool. > NetBSD's VAX port does lots to help ensure code portability and code > correctness, so it's not going anywhere any time soon. It certainly is a > good sign that PostgreSQL can run on a VAX with only 20 MB or so of resident > memory. Yeah! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, > Has anyone tried to build PostgreSQL for VAX lately? If so, did it > compile? Did you have to use --disable-spinlocks to get it to compile? > If it did compile, can you actually run it, and does it pass the > regression tests and work as expected? Would you be willing to work > with the PostgreSQL to ensure continuing support for this platform, or > does that seem not worthwhile for whatever reason? I've compiled postgresql93-client and postgresql93-server from pkgsrc on a VAX running NetBSD 6.1.4. The initial launch didn't like the default stack limit: /etc/rc.d/pgsql start Initializing PostgreSQL databases. LOG: invalid value for parameter "max_stack_depth": 100 DETAIL: "max_stack_depth" must not exceed 0kB. HINT: Increase the platform's stack depth limit via "ulimit -s" or local equivalent. FATAL: failed to initialize max_stack_depth to 100 child process exited with exit code 1 initdb: removing data directory "/usr/local/pgsql/data" pg_ctl: database system initialization failed I unlimited and tried again. The pgsql process showed it was using 146 megabytes of memory while initializing, then got as far as: /etc/rc.d/pgsql start Initializing PostgreSQL databases. WARNING: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Starting pgsql. Then the machine paniced. The serial console showed: panic: usrptmap space leakage cpu0: Begin traceback... panic: usrptmap space leakage Stack traceback : Process is executing in user space. cpu0: End traceback... dump to dev 9,1 not possible It does compile and initialize, so the VAX code does work. However, considering how much memory it uses, I wonder how many people would actually use it. I did run Apache / MySQL / PHP on a VAXstation 4000/60 not long ago, but MySQL takes way too much memory, too. 
Don't even get me started on how much memory PHP uses - someone has to write some good weblog software in C one of these days... John
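The first failure above ("max_stack_depth" must not exceed 0kB) suggests the shell's stack limit was too small to leave any room after PostgreSQL's safety margin. A minimal sketch for checking this before running initdb follows. `safe_stack_depth_kb` is a hypothetical helper, and the 512 kB default margin is an assumption made for the example, not a figure taken from this thread.

```shell
#!/bin/sh
# safe_stack_depth_kb LIMIT_KB [MARGIN_KB]
# Given a stack limit in kB (as printed by `ulimit -s`), print how many kB
# remain for max_stack_depth after subtracting a safety margin.
# The 512 kB default margin is an assumption for this sketch.
safe_stack_depth_kb() {
    limit_kb=$1
    margin_kb=${2:-512}
    if [ "$limit_kb" = "unlimited" ]; then
        echo "unlimited"
        return 0
    fi
    depth=$((limit_kb - margin_kb))
    if [ "$depth" -lt 0 ]; then
        depth=0
    fi
    echo "$depth"
}

# Report the current shell's limit and what would be left over.
current=$(ulimit -s)
echo "stack limit: $current; room for max_stack_depth: $(safe_stack_depth_kb "$current")"
```

If the reported room is below the 100 kB minimum that the error message mentions, raising the limit with `ulimit -s` in the startup script, as the HINT suggests, would be the first thing to try.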
John Klos wrote on 2014-06-25 04:16: > Then the machine paniced. The serial console showed: > > panic: usrptmap space leakage > cpu0: Begin traceback... > panic: usrptmap space leakage > Stack traceback : > Process is executing in user space. > cpu0: End traceback... Hm, can you add info about this panic to PR #28379? I will try to hunt this down soon, so I need some test cases. -- Ragge
Hi, > What value did it select for shared_buffers? How much memory does a > high-end VAX have? These days, we try to set shared_buffers = 128MB > if the platform will support it, but it's supposed to fall back to > smaller values if that doesn't work. It will try allocating that much > though, at least for a moment, to see whether it can. A high end VAX, such as a 4000 Model 108, can have 512 megs (as can an 11/780, at least in theory), but most of the VAXen used here are VAXstations such as the 4000/60 or 4000/90, 90a or 96, which have either 104 megs or 128 megs. I was trying it just using the default postgresql.conf. I changed it: < max_connections = 10 # (change requires restart) > max_connections = 40 # (change requires restart) < shared_buffers = 16MB # min 128kB > shared_buffers = 128MB # min 128kB < temp_buffers = 2MB # min 800kB < max_prepared_transactions = 0 # zero disables the feature > #temp_buffers = 8MB # min 800kB > #max_prepared_transactions = 0 # zero disables the feature < maintenance_work_mem = 8MB # min 1MB < max_stack_depth = 1MB # min 100kB > #maintenance_work_mem = 16MB # min 1MB > #max_stack_depth = 2MB # min 100kB < max_files_per_process = 100 # min 25 > #max_files_per_process = 1000 # min 25 and it launched fine. I then tried to run: gmake MAX_CONNECTIONS=5 installcheck in /usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/test/regress, but it failed with: ... 
gmake[2]: Leaving directory '/usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/backend' gcc -O1 -fgcse -fstrength-reduce -fgcse-after-reload -I/usr/include -I/usr/local/include -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -pthread -mt -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -I../../src/port -DFRONTEND -I../../src/include -I/usr/include -I/usr/local/include -c -o thread.o thread.c cc1: error: unrecognized command line option "-mt" <builtin>: recipe for target 'thread.o' failed gmake[1]: *** [thread.o] Error 1 gmake[1]: Leaving directory '/usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/port' ../../../src/Makefile.global:423: recipe for target 'submake-libpgport' failed gmake: *** [submake-libpgport] Error 2 That's all I have time for tonight. Is there an easier way to run a testsuite? Thanks, John
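The reduced settings from the diff above can be collected into one block and appended to postgresql.conf; when a parameter appears more than once in that file, the last occurrence takes effect. The sketch below does this with a hypothetical helper, `write_lowmem_conf`. The values are the ones reported in this message for PostgreSQL 9.3.4 on NetBSD/vax, and the file path in the usage comment is an assumption based on the data directory shown in the earlier initdb output.

```shell
#!/bin/sh
# write_lowmem_conf FILE
# Appends the low-memory overrides reported in the thread to FILE.
# Hypothetical helper; the values come from the message, not from tuning advice.
write_lowmem_conf() {
    conf=$1
    cat >> "$conf" <<'EOF'
# Low-memory overrides reported to work on NetBSD/vax with PostgreSQL 9.3.4
max_connections = 10
shared_buffers = 16MB
temp_buffers = 2MB
max_prepared_transactions = 0
maintenance_work_mem = 8MB
max_stack_depth = 1MB
max_files_per_process = 100
EOF
}

# Usage (path is an assumption; adjust to the actual data directory):
#   write_lowmem_conf /usr/local/pgsql/data/postgresql.conf
```

The server needs a restart afterwards, since settings such as max_connections and shared_buffers are marked "change requires restart" in the file itself.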
On 24/06/14 10:16 PM, John Klos wrote: > Hi, > >> Has anyone tried to build PostgreSQL for VAX lately? If so, did it >> compile? Did you have to use --disable-spinlocks to get it to >> compile? If it did compile, can you actually run it, and does it pass >> the regression tests and work as expected? Would you be willing to >> work with the PostgreSQL to ensure continuing support for this >> platform, or does that seem not worthwhile for whatever reason? > > I've compiled postgresql93-client and postgresql93-server from pkgsrc on > a VAX running NetBSD 6.1.4. ... > It does compile and initialize, so the VAX code does work. However, > considering how much memory it uses, I wonder how many people would > actually use it. I did run Apache / MySQL / PHP on a VAXstation 4000/60 I guess I shan't expect to run PgSQL on a MicroVAX II (9MB), NetBSD 1.4.1. I did get Apache 1.3.x built on it. > not long ago, but MySQL takes way too much memory, too. Don't even get > me started on how memory PHP uses - someone has to write some good > weblog software in C one of these days... If only C and PHP weren't the only options! --T > > John >
On 25 June 2014 12:38, Toby Thain <toby@telegraphics.com.au> wrote: > On 24/06/14 10:16 PM, John Klos wrote: >> >> Hi, >> >>> Has anyone tried to build PostgreSQL for VAX lately? If so, did it >>> compile? Did you have to use --disable-spinlocks to get it to >>> compile? If it did compile, can you actually run it, and does it pass >>> the regression tests and work as expected? Would you be willing to >>> work with the PostgreSQL to ensure continuing support for this >>> platform, or does that seem not worthwhile for whatever reason? >> >> >> I've compiled postgresql93-client and postgresql93-server from pkgsrc on >> a VAX running NetBSD 6.1.4. ... >> >> It does compile and initialize, so the VAX code does work. However, >> considering how much memory it uses, I wonder how many people would >> actually use it. I did run Apache / MySQL / PHP on a VAXstation 4000/60 > > I guess I shan't expect to run PgSQL on a MicroVAX II (9MB), NetBSD 1.4.1. I > did get Apache 1.3.x built on it. I suspect it might technically be possible to run PgSQL on that hardware - probably best done with an app on another box (maybe a second uVAX II :) which is not in a particular hurry for query responses >> not long ago, but MySQL takes way too much memory, too. Don't even get >> me started on how memory PHP uses - someone has to write some good >> weblog software in C one of these days... > > If only C and PHP weren't the only options! Tsk, how could we forget VAX MACRO assembler :-p
On 06/25/2014 05:30 AM, John Klos wrote:
>> What value did it select for shared_buffers?  How much memory does a
>> high-end VAX have?  These days, we try to set shared_buffers = 128MB
>> if the platform will support it, but it's supposed to fall back to
>> smaller values if that doesn't work.  It will try allocating that much
>> though, at least for a moment, to see whether it can.
> 
> A high end VAX, such as a 4000 Model 108, can have 512 megs (as can an
> 11/780, at least in theory), but most of the VAXen used here are
> VAXstations such as the 4000/60 or 4000/90, 90a or 96, which have either
> 104 megs or 128 megs.
 My VAX-7000 has 1.5GB. B-)
           -Dave
-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
			
		>> That's all I have time for tonight. Is there an easier way to run a >> testsuite? > > I think you're doing it right, but apparently configure is > mis-identifying which flags are needed for thread-safety on your > platform. It's possible configuring with --disable-thread-safety > would help, or you could manually edit the Makefile. I'll play with it some more in a little bit. This is why I tend to trust the pkgsrc framework - it just works. > In any case I'm coming to the conclusion that there's little point in > us keeping the VAX-specific code in our source tree, because in fact, > this port is broken and doesn't work. Based on your results thus far, > I doubt that it would be a huge amount of work to fix that, but unless > somebody from the VAX community wants to volunteer to be a PostgreSQL > maintainer for that platform, straighten out the things that have > gotten broken since this port was originally added, and keep it > working on an ongoing basis, it's probably not going to happen. While I wouldn't be surprised if you remove the VAX code because not many people are going to be running PostgreSQL, I'd disagree with the assessment that this port is broken. It compiles, it initializes databases, it runs, et cetera, albeit not with the default postgresql.conf. I'm actually rather impressed at how well PostgreSQL can be adjusted to lower memory systems. I deploy a lot of embedded systems with 128 megs (a lot for an embedded system, but nothing compared with what everyone else assumes), so I'll be checking out PostgreSQL for other uses. NetBSD's VAX port does lots to help ensure code portability and code correctness, so it's not going anywhere any time soon. It certainly is a good sign that PostgreSQL can run on a VAX with only 20 MB or so of resident memory. Thanks, John
On Wed, Jun 25, 2014 at 10:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Well, the fact that initdb didn't produce a working configuration and
> that make installcheck failed to work properly are bad. But, yeah,
> it's not totally broken.

Yeah it seems to me that these kinds of autoconf and initdb tests
failing are different from a platform where the spinlock code doesn't
work. It's actually valuable to have a platform where people routinely
trigger those configuration values. If they're broken there's not much
value in carrying them.

-- 
greg
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jun 25, 2014 at 1:05 PM, John Klos <john@ziaspace.com> wrote:
>> While I wouldn't be surprised if you remove the VAX code because not many
>> people are going to be running PostgreSQL, I'd disagree with the assessment
>> that this port is broken. It compiles, it initializes databases, it runs, et
>> cetera, albeit not with the default postgresql.conf.

> Well, the fact that initdb didn't produce a working configuration and
> that make installcheck failed to work properly are bad. But, yeah,
> it's not totally broken.

>> I'm actually rather impressed at how well PostgreSQL can be adjusted to
>> lower memory systems. I deploy a lot of embedded systems with 128 megs (a
>> lot for an embedded system, but nothing compared with what everyone else
>> assumes), so I'll be checking out PostgreSQL for other uses.

> I agree, that's cool.

I think we'd be happy to keep the VAX port of PG going as long as we
get regular feedback on it, ie closed-loop maintenance not open-loop ;-)

Is there anyone in the NetBSD/VAX community who would be willing to
host a PG buildfarm member?
http://buildfarm.postgresql.org/index.html

The requirements for this beyond what it takes to build from source
are basically just working git and Perl (ccache helps a lot too),
and enough cycles to build the code at least once a day or so.
Once you've got the thing set up it seldom needs human attention.

If we had a buildfarm member to tell us when we break things, it
would be a lot easier to promise continued support.

        regards, tom lane
On Wed, Jun 25, 2014 at 1:58 PM, John Klos <john@ziaspace.com> wrote:
>> Well, the fact that initdb didn't produce a working configuration and
>> that make installcheck failed to work properly are bad. But, yeah,
>> it's not totally broken.
>
> I think it did create a working configuration (with the exception of
> postgresql.conf), because I can run psql and do stuff on the command line:

Yeah, but postgresql.conf should not have required manual tweaking...

> I don't know enough to really test this. Can you recommend a simple script
> to do some PostgreSQL testing?

Well, this is what 'make installcheck' is for...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> Well, the fact that initdb didn't produce a working configuration and
> that make installcheck failed to work properly are bad. But, yeah,
> it's not totally broken.

I think it did create a working configuration (with the exception of
postgresql.conf), because I can run psql and do stuff on the command line:

psql --username=pgsql postgres
psql (9.3.4)
Type "help" for help.

postgres=# CREATE DATABASE test;
CREATE DATABASE
postgres=# CREATE USER testuser WITH PASSWORD 'test';
CREATE ROLE
postgres=# GRANT ALL PRIVILEGES ON DATABASE test to testuser;
GRANT
postgres=# CREATE SCHEMA testschema;
CREATE SCHEMA
postgres=# CREATE TABLE testschema.testtable (testserial serial PRIMARY KEY, testchar varchar (100) NOT NULL);
CREATE TABLE

I don't know enough to really test this. Can you recommend a simple script
to do some PostgreSQL testing?

John
Dave McGuire <mcguire@neurotica.com> writes:
> On 06/25/2014 01:57 PM, Tom Lane wrote:
>> Is there anyone in the NetBSD/VAX community who would be willing to
>> host a PG buildfarm member?

> I could put together a simh-based machine (i.e., fast) on a vm, if
> nobody else has stepped up for this.

No other volunteers have emerged, so if you'd do that it'd be great.

Probably first we ought to fix whatever needs to be fixed to get a
standard build to go through. The one existing NetBSD machine in the
buildfarm, coypu, doesn't seem to be using any special configuration
hacks:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2014-06-29%2012%3A33%3A12
so I'm a bit confused as to what we need to change for VAX.

> Dave McGuire, AK4HZ/3
> New Kensington, PA

Hey, right up the river from here!

        regards, tom lane
On 2014-06-29 10:24:22 -0400, Tom Lane wrote:
> Dave McGuire <mcguire@neurotica.com> writes:
> > On 06/25/2014 01:57 PM, Tom Lane wrote:
> >> Is there anyone in the NetBSD/VAX community who would be willing to
> >> host a PG buildfarm member?
>
> > I could put together a simh-based machine (i.e., fast) on a vm, if
> > nobody else has stepped up for this.
>
> No other volunteers have emerged, so if you'd do that it'd be great.

Maybe I'm just not playful enough, but keeping a platform alive so we
can run postgres in simulator seems a bit, well, pointless. On the other
hand VAX on *BSD isn't causing many problems that I'm aware of though,
so, whatever.

I've had a quick look and it seems netbsd emulates atomics on vax for
its own purposes (_do_cas in
https://www.almos.fr/trac/netbsdtsar/browser/vendor/netbsd/5/src/sys/arch/vax/vax/lock_stubs.S)
using a hashed lock. Interestingly, either my nonexistent VAX knowledge is
betraying me (wouldn't be a surprise) or the algorithm doesn't test
whether the lock (bbssi) actually succeeded though...

So I don't think we'd be much worse off with the userland spinlock
protecting atomic ops.

> Probably first we ought to fix whatever needs to be fixed to get a
> standard build to go through. The one existing NetBSD machine in the
> buildfarm, coypu, doesn't seem to be using any special configuration
> hacks:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2014-06-29%2012%3A33%3A12
> so I'm a bit confused as to what we need to change for VAX.

That's probably something we should fix independently though. One of the
failures was:
> gmake[2]: Leaving directory
> '/usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/backend'
> gcc -O1 -fgcse -fstrength-reduce -fgcse-after-reload -I/usr/include
> -I/usr/local/include -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -pthread -mt -D_REENTRANT
> -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -I../../src/port -DFRONTEND
> -I../../src/include -I/usr/include -I/usr/local/include -c -o thread.o
> thread.c
> cc1: error: unrecognized command line option "-mt" <builtin>: recipe for
> target 'thread.o' failed
> gmake[1]: *** [thread.o] Error 1
> gmake[1]: Leaving directory
> '/usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/port'
> ../../../src/Makefile.global:423: recipe for target 'submake-libpgport'

which I don't really understand - we actually test all that in
acx_pthread.m4?

Greetings,

Andres Freund

-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
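For readers who have not met the idea, emulating compare-and-swap with a hashed lock table (as the message above describes for the NetBSD/vax kernel) can be sketched roughly as follows. This is a toy Python model and not the NetBSD code: `NUM_LOCKS`, `Cell`, `emulated_cas` and `atomic_increment` are hypothetical names, and `threading.Lock` stands in for the interlocked bit instruction.

```python
import threading

# Hypothetical table size for this sketch.
NUM_LOCKS = 16
_locks = [threading.Lock() for _ in range(NUM_LOCKS)]


class Cell:
    """A mutable word of memory, standing in for an address."""

    def __init__(self, value):
        self.value = value


def _lock_for(cell):
    # Hash the object's identity (its "address") onto one lock in the table.
    # The low bits are dropped because they tend to be equal due to alignment.
    return _locks[(id(cell) >> 4) % NUM_LOCKS]


def emulated_cas(cell, expected, new):
    """Store `new` in `cell` if it holds `expected`; return the old value."""
    # The `with` statement blocks until the lock is really held; the update
    # is only safe because acquisition is confirmed before touching the cell.
    with _lock_for(cell):
        old = cell.value
        if old == expected:
            cell.value = new
        return old


def atomic_increment(cell):
    # Classic CAS retry loop built on the emulated primitive.
    while True:
        seen = cell.value
        if emulated_cas(cell, seen, seen + 1) == seen:
            return seen + 1
```

Unrelated cells may share a lock, which costs some contention but not correctness. The emulation is only sound if every writer goes through the same lock table, and if the lock is actually obtained before the compare, which is the point being questioned above.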
Andres Freund <andres@2ndquadrant.com> writes:
> Maybe I'm just not playful enough, but keeping a platform alive so we
> can run postgres in simulator seems a bit, well, pointless.
True --- an actual machine would be more useful, even if slower.  Some
of the existing buildfarm critters are pretty slow already, so that's
not a huge hindrance AFAIK.
> That's probably something we should fix independently though. One of the
> failures was:
>> cc1: error: unrecognized command line option "-mt" <builtin>: recipe for
>> target 'thread.o' failed
> which I don't really understand - we actually test all that in
> acx_pthread.m4?
Yeah.  We'd need to look at the relevant part of config.log to be sure,
but my guess is that configure tried that switch, failed to recognize
that it wasn't actually working, and seized on it as what to use.
Maybe the test-for-workingness isn't quite right for this platform.
        regards, tom lane
			
		[ trimming the cc list since this isn't VAX-specific ]
I wrote:
> Yeah.  We'd need to look at the relevant part of config.log to be sure,
> but my guess is that configure tried that switch, failed to recognize
> that it wasn't actually working, and seized on it as what to use.
> Maybe the test-for-workingness isn't quite right for this platform.
BTW, it sure looks like the part of ACX_PTHREAD beginning with
     # Various other checks:
     if test "x$acx_pthread_ok" = xyes; then
(lines 163..210 in HEAD's acx_pthread.m4) is dead code.  One might
think that this runs if the previous loop found any working thread/
library combinations, but actually it runs only if the *last* switch
tried worked, which seems a bit improbable, and even if that was the
intention it's sure fragile as can be.
It looks like that section is mostly AIX-specific hacks, and given that
it's been awhile since there was any AIX in the buildfarm, I wonder if
that code is correct or needed at all.
        regards, tom lane
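The logic problem described above can be modelled in a few lines. This is only a sketch in Python, not the macro itself: it assumes, as the description implies, that the probing loop overwrites its status variable on every iteration and does not stop at the first success. `probe_flags` and the candidate list in the usage note are hypothetical.

```python
def probe_flags(flags, works):
    """Model a configure-style loop that tries candidate flags in order.

    `works` maps each flag to whether the (simulated) test program built.
    Returns (chosen_flag, ok_after_loop).
    """
    chosen = None
    ok = False
    for flag in flags:
        # The status variable is overwritten on every iteration ...
        ok = works.get(flag, False)
        if ok and chosen is None:
            # ... even though a working flag may already have been found.
            chosen = flag
    # After the loop, `ok` describes only the last flag tried.
    return chosen, ok
```

With candidates `["-pthread", "-mt"]` where only `-pthread` works, the function returns a usable flag together with a false status, so a follow-up block guarded by that status would be skipped even though probing succeeded.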
			
		I wrote:
> BTW, it sure looks like the part of ACX_PTHREAD beginning with
>      # Various other checks:
>      if test "x$acx_pthread_ok" = xyes; then
> (lines 163..210 in HEAD's acx_pthread.m4) is dead code.
On closer inspection, this has been broken since commit
e48322a6d6cfce1ec52ab303441df329ddbc04d1, which is just barely short of
its tenth birthday.  The reason we've not noticed is that Postgres makes
no use of PTHREAD_CREATE_JOINABLE, nor of PTHREAD_CC, nor of HAVE_PTHREAD,
nor of the success/failure options for ACX_PTHREAD.
I'm tempted to just rip out all the useless code rather than fix the
logic bug as such.  OTOH, that might complicate updating to more recent
versions of the original Autoconf macro.  On the third hand, we've not
bothered to do that in ten years either.
Thoughts?
        regards, tom lane
			
On 2014-06-29 12:20:02 -0400, Tom Lane wrote:
> I wrote:
> > BTW, it sure looks like the part of ACX_PTHREAD beginning with
> >      # Various other checks:
> >      if test "x$acx_pthread_ok" = xyes; then
> > (lines 163..210 in HEAD's acx_pthread.m4) is dead code.
>
> On closer inspection, this has been broken since commit
> e48322a6d6cfce1ec52ab303441df329ddbc04d1, which is just barely short of
> its tenth birthday. The reason we've not noticed is that Postgres makes
> no use of PTHREAD_CREATE_JOINABLE, nor of PTHREAD_CC, nor of HAVE_PTHREAD,
> nor of the success/failure options for ACX_PTHREAD.
>
> I'm tempted to just rip out all the useless code rather than fix the
> logic bug as such. OTOH, that might complicate updating to more recent
> versions of the original Autoconf macro. On the third hand, we've not
> bothered to do that in ten years either.

Rip it out, maybe leaving a comment like
/* upstream tests for stuff we don't need here */ in its place. Since
there have been a number of changes to the file, one large missing hunk
shouldn't make the task of syncing measurably more difficult.

Greetings,

Andres Freund

-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Dave McGuire <mcguire@neurotica.com> writes:
> On 06/29/2014 10:54 AM, Andres Freund wrote:
>> Maybe I'm just not playful enough, but keeping a platform alive so we
>> can run postgres in simulator seems a bit, well, pointless.
>   On the "in a simulator" matter: It's important to keep in mind that
> there are more VAXen out there than just simulated ones.  I'm offering
> up a simulated one here because I can spin it up in a dedicated VM on a
> VMware host that's already running and I already have power budget for.
>  I could just as easily run it on real hardware...there are, at last
> count, close to forty real-iron VAXen here, but only a few of those are
> running 24/7.  I'd happily bring up another one to do Postgres builds
> and testing, if someone will send me the bucks to pay for the additional
> power and cooling.  (that is a real offer)
Well, the issue from our point of view is that a lot of what we care about
testing is extremely low-level hardware behavior, like whether spinlocks
work as expected across processors.  It's not clear that a simulator would
provide a sufficiently accurate emulation.
OTOH, the really nasty issues like cache coherency rules don't arise in
single-processor systems.  So unless you have a multiprocessor VAX
available to spin up, a simulator may tell us as much as we'd learn
anyway.
(If you have got one, maybe some cash could be found --- we do have
project funds available, and I think they'd be well spent on testing
purposes.  I don't make those decisions though.)
        regards, tom lane
			
On June 29, 2014 9:01:27 PM CEST, Dave McGuire <mcguire@neurotica.com> wrote:
> On 06/29/2014 02:58 PM, Patrick Finnegan wrote:
>> Last I checked, NetBSD doesn't support any sort of multiprocessor VAX.
>> Multiprocessor VAXes exist, but you're stuck with either Ultrix or VMS
>> on them.
>
> Hi Pat, it's good to see your name in my inbox.
>
> NetBSD ran on multiprocessor BI-bus VAXen many, many years ago. Is
> that support broken?

The new atomics code refers to a VAX SMP define... So somebody seems to
have thought about it not too long ago.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Dave McGuire <mcguire@neurotica.com> writes:
> On 06/29/2014 10:24 AM, Tom Lane wrote:
>> Hey, right up the river from here!

> Come on up and hack! There's always something neat going on around
> here. Ever run a PDP-11? B-)

There were so many PDP-11s around CMU when I was an undergrad that
I remember seeing spare ones being used as doorstops ;-). I even
got paid to help hack on this:
http://en.wikipedia.org/wiki/C.mmp

So nah, don't want to do it any more. Been there done that.

        regards, tom lane
On 06/29/2014 03:10 PM, Patrick Finnegan wrote:
> And it also runs on the 11/780 which can have multiple CPUs... but I've
> never seen support for using more than one CPU (and the NetBSD page
> still says "NetBSD/vax can only make use of one CPU on multi-CPU
> machines").  If that has changed, I'd love to hear about it.  Support
> for my VAX 6000 would also be nice...
 It changed well over a decade ago, if memory serves.  The specific
work was done on a VAX-8300 or -8350.  I'm pretty sure the 11/780's
specific flavor of SMP is not supported.  (though I do have a pair of
11/785s here...wanna come hack? ;))
              -Dave
-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
			
		On 06/29/2014 10:24 AM, Tom Lane wrote:
>>> Is there anyone in the NetBSD/VAX community who would be willing to
>>> host a PG buildfarm member?
> 
>>   I could put together a simh-based machine (i.e., fast) on a vm, if
>> nobody else has stepped up for this.
> 
> No other volunteers have emerged, so if you'd do that it'd be great.
 Ok, I am certainly willing to do it.  Though I haven't used PostgreSQL
in quite awhile, I ran it A LOT back when its query language was
PostQUEL, and later when it was known as Postgres95.  It'd give me a
serious "warm fuzzy" to be able to support the project in some way.
>> Dave McGuire, AK4HZ/3
>> New Kensington, PA
> 
> Hey, right up the river from here!
 Come on up and hack!  There's always something neat going on around
here.  Ever run a PDP-11?  B-)
                -Dave
-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
			
On Sun, Jun 29, 2014 at 3:01 PM, Dave McGuire <mcguire@neurotica.com> wrote:
> On 06/29/2014 02:58 PM, Patrick Finnegan wrote:
>> Last I checked, NetBSD doesn't support any sort of multiprocessor VAX.
>> Multiprocessor VAXes exist, but you're stuck with either Ultrix or VMS
>> on them.
>
> Hi Pat, it's good to see your name in my inbox.

Hi! :)

> NetBSD ran on multiprocessor BI-bus VAXen many, many years ago. Is
> that support broken?

And it also runs on the 11/780 which can have multiple CPUs... but I've
never seen support for using more than one CPU (and the NetBSD page
still says "NetBSD/vax can only make use of one CPU on multi-CPU
machines"). If that has changed, I'd love to hear about it. Support
for my VAX 6000 would also be nice...

Pat
On Sun, Jun 29, 2014 at 3:12 PM, Dave McGuire <mcguire@neurotica.com> wrote:
> On 06/29/2014 03:10 PM, Patrick Finnegan wrote:
>> And it also runs on the 11/780 which can have multiple CPUs... but I've
>> never seen support for using more than one CPU (and the NetBSD page
>> still says "NetBSD/vax can only make use of one CPU on multi-CPU
>> machines"). If that has changed, I'd love to hear about it. Support
>> for my VAX 6000 would also be nice...
>
> It changed well over a decade ago, if memory serves. The specific
> work was done on a VAX-8300 or -8350. I'm pretty sure the 11/780's
> specific flavor of SMP is not supported. (though I do have a pair of
> 11/785s here...wanna come hack? ;))

Which flavor of 11/78x MP? The official DEC kind (which is really just
two computers with a block of shared memory) or the Purdue kind (which
isn't quite SMP, but actually shares the system bus)?

If it works, someone should update the documentation. :)

Pat
Dave McGuire wrote 2014-06-29 21:01:
> On 06/29/2014 02:58 PM, Patrick Finnegan wrote:
>> Last I checked, NetBSD doesn't support any sort of multiprocessor VAX.
>> Multiprocessor VAXes exist, but you're stuck with either Ultrix or VMS
>> on them.
> Hi Pat, it's good to see your name in my inbox.
>
> NetBSD ran on multiprocessor BI-bus VAXen many, many years ago. Is
> that support broken?
>
I made it run on 8300 once, in the early days of NetBSD MP. I planned to
make it run on both 8800 and 6300, but due to lack of docs neither of
those came true.

So, unless someone has an 8300 to test on (just over 1 VUPS per CPU, not
much), current state is unknown. "It worked last time I tested" :-)

-- Ragge
Last I checked, NetBSD doesn't support any sort of multiprocessor VAX.  Multiprocessor VAXes exist, but you're stuck with either Ultrix or VMS on them.
Pat
On Sun, Jun 29, 2014 at 2:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Maybe I'm just not playful enough, but keeping a platform alive so we
>> can run postgres in simulator seems a bit, well, pointless.
> On the "in a simulator" matter: It's important to keep in mind that
> there are more VAXen out there than just simulated ones. I'm offering
> up a simulated one here because I can spin it up in a dedicated VM on a
> VMware host that's already running and I already have power budget for.
> I could just as easily run it on real hardware...there are, at last
> count, close to forty real-iron VAXen here, but only a few of those are
> running 24/7. I'd happily bring up another one to do Postgres builds
> and testing, if someone will send me the bucks to pay for the additional
> power and cooling. (that is a real offer)
Well, the issue from our point of view is that a lot of what we care about
testing is extremely low-level hardware behavior, like whether spinlocks
work as expected across processors. It's not clear that a simulator would
provide a sufficiently accurate emulation.
OTOH, the really nasty issues like cache coherency rules don't arise in
single-processor systems. So unless you have a multiprocessor VAX
available to spin up, a simulator may tell us as much as we'd learn
anyway.
(If you have got one, maybe some cash could be found --- we do have
project funds available, and I think they'd be well spent on testing
purposes. I don't make those decisions though.)
regards, tom lane
> Well, the issue from our point of view is that a lot of what we care about
> testing is extremely low-level hardware behavior, like whether spinlocks
> work as expected across processors. It's not clear that a simulator would
> provide a sufficiently accurate emulation.
>
> OTOH, the really nasty issues like cache coherency rules don't arise in
> single-processor systems. So unless you have a multiprocessor VAX
> available to spin up, a simulator may tell us as much as we'd learn
> anyway.
>
> (If you have got one, maybe some cash could be found --- we do have
> project funds available, and I think they'd be well spent on testing
> purposes. I don't make those decisions though.)

Depending on how often you'd like the system to try to run a compile, I'd
be happy to run it on a VAXstation 4000/60. It runs bulk package builds
for pkgsrc, but we could do a compile every week or so (every day would
really eat into cycles for other packages).

John
On 6/29/14, 12:20 PM, Tom Lane wrote:
> I'm tempted to just rip out all the useless code rather than fix the
> logic bug as such. OTOH, that might complicate updating to more recent
> versions of the original Autoconf macro. On the third hand, we've not
> bothered to do that in ten years either.

Our version of the tests is already hacked up enough that we probably
won't ever upgrade. I say keep hacking away.
On 06/29/2014 02:58 PM, Patrick Finnegan wrote:
> Last I checked, NetBSD doesn't support any sort of multiprocessor VAX.
>  Multiprocessor VAXes exist, but you're stuck with either Ultrix or VMS
> on them.
 Hi Pat, it's good to see your name in my inbox.
 NetBSD ran on multiprocessor BI-bus VAXen many, many years ago.  Is
that support broken?
               -Dave
-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
			
		On 06/29/2014 10:54 AM, Andres Freund wrote:
>>>> Is there anyone in the NetBSD/VAX community who would be willing to
>>>> host a PG buildfarm member?
>>
>>>   I could put together a simh-based machine (i.e., fast) on a vm, if
>>> nobody else has stepped up for this.
>>
>> No other volunteers have emerged, so if you'd do that it'd be great.
> 
> Maybe I'm just not playful enough, but keeping a platform alive so we
> can run postgres in simulator seems a bit, well, pointless.
 There are a couple of points.
 First and foremost is portability.  Using as many architectures as
possible as test platforms "keeps us honest" and can be a highly
valuable tool for early discovery of portability issues or "iffy" code
constructs.  The VAX in particular is an "extreme example" of many
aspects of processor architecture, and as such, is an excellent tool for
that sort of thing.
 Next, some people actually want to *run* it on a VAX.  Maybe for hobby
reasons, maybe for "approved platform" reasons, whatever...We don't know
(and don't care) why.
 On the "in a simulator" matter: It's important to keep in mind that
there are more VAXen out there than just simulated ones.  I'm offering
up a simulated one here because I can spin it up in a dedicated VM on a
VMware host that's already running and I already have power budget for.
 I could just as easily run it on real hardware...there are, at last
count, close to forty real-iron VAXen here, but only a few of those are
running 24/7.  I'd happily bring up another one to do Postgres builds
and testing, if someone will send me the bucks to pay for the additional
power and cooling.  (that is a real offer)
                -Dave
-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
			
On 06/29/2014 03:35 PM, Tom Lane wrote:
>>> Hey, right up the river from here!
>
>> Come on up and hack! There's always something neat going on around
>> here. Ever run a PDP-11? B-)
>
> There were so many PDP-11s around CMU when I was an undergrad that
> I remember seeing spare ones being used as doorstops ;-). I even
> got paid to help hack on this:
> http://en.wikipedia.org/wiki/C.mmp
>
> So nah, don't want to do it any more. Been there done that.

 Ah well, more for me then. I cut my teeth on PDP-11s 30yrs ago, and I
always find it therapeutic to fire one up today. I have a biggish
commercial building and am putting together a computer museum up here.
Right now it's only "privately open", and we have hack workshops and
stuff to get things running. There are about sixty racks in here.

 It's awesome that you worked on C.mmp. I saw that machine in person a
few weeks ago. I'd love to discuss that with you a bit off-list if you
can spare the time.

              -Dave

-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
On Sun, Jun 29, 2014 at 10:24:22AM -0400, Tom Lane wrote:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2014-06-29%2012%3A33%3A12
> so I'm a bit confused as to what we need to change for VAX.

Dave did use NetBSD 6.1 (IIRC), which uses an ancient gcc version.
I would suggest going for NetBSD-current (if someone sets it up before the
netbsd-7 branch, or 7 post-branch), which uses gcc 4.8.3 on vax.

All other architectures already used a more modern gcc on 6.

However, there is still a bit of fallout to fix in -current.

Martin
On 2014-06-29 12:12, Dave McGuire wrote:
> On 06/29/2014 03:10 PM, Patrick Finnegan wrote:
>> And it also runs on the 11/780 which can have multiple CPUs... but I've
>> never seen support for using more than one CPU (and the NetBSD page
>> still says "NetBSD/vax can only make use of one CPU on multi-CPU
>> machines"). If that has changed, I'd love to hear about it. Support
>> for my VAX 6000 would also be nice...
>
> It changed well over a decade ago, if memory serves. The specific
> work was done on a VAX-8300 or -8350. I'm pretty sure the 11/780's
> specific flavor of SMP is not supported. (though I do have a pair of
> 11/785s here...wanna come hack? ;))

Well, VAX-11/78x do not support SMP, they have (had) ASMP only.

Johnny
On 06/25/2014 01:57 PM, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Jun 25, 2014 at 1:05 PM, John Klos <john@ziaspace.com> wrote:
>>> While I wouldn't be surprised if you remove the VAX code because not many
>>> people are going to be running PostgreSQL, I'd disagree with the assessment
>>> that this port is broken. It compiles, it initializes databases, it runs, et
>>> cetera, albeit not with the default postgresql.conf.
>
>> Well, the fact that initdb didn't produce a working configuration and
>> that make installcheck failed to work properly are bad. But, yeah,
>> it's not totally broken.
>
>>> I'm actually rather impressed at how well PostgreSQL can be adjusted to
>>> lower memory systems. I deploy a lot of embedded systems with 128 megs (a
>>> lot for an embedded system, but nothing compared with what everyone else
>>> assumes), so I'll be checking out PostgreSQL for other uses.
>
>> I agree, that's cool.
>
> I think we'd be happy to keep the VAX port of PG going as long as we
> get regular feedback on it, ie closed-loop maintenance not open-loop ;-)
>
> Is there anyone in the NetBSD/VAX community who would be willing to
> host a PG buildfarm member?
> http://buildfarm.postgresql.org/index.html
>
> The requirements for this beyond what it takes to build from source
> are basically just working git and Perl (ccache helps a lot too),
> and enough cycles to build the code at least once a day or so.
> Once you've got the thing set up it seldom needs human attention.
>
> If we had a buildfarm member to tell us when we break things, it
> would be a lot easier to promise continued support.

 I could put together a simh-based machine (i.e., fast) on a vm, if
nobody else has stepped up for this.

              -Dave

-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
On 06/29/2014 02:06 PM, Tom Lane wrote:
> Well, the issue from our point of view is that a lot of what we care about
> testing is extremely low-level hardware behavior, like whether spinlocks
> work as expected across processors.  It's not clear that a simulator would
> provide a sufficiently accurate emulation.
 Oh ok, I understand.  Thank you for the clarification.
> OTOH, the really nasty issues like cache coherency rules don't arise in
> single-processor systems.  So unless you have a multiprocessor VAX
> available to spin up, a simulator may tell us as much as we'd learn
> anyway.
> 
> (If you have got one, maybe some cash could be found --- we do have
> project funds available, and I think they'd be well spent on testing
> purposes.  I don't make those decisions though.)
 I have several multiprocessor VAXen, but only one of them is capable
of running NetBSD, and I only (currently) have a single processor in
that machine.  I can (and want to) fix that, but not right away.
             -Dave
-- 
Dave McGuire, AK4HZ/3
New Kensington, PA
			
On Wed, Jul 16, 2014 at 11:45 PM, Thor Lancelot Simon <tls@panix.com> wrote:
> Well, I have to ask this question: why should there be any "vax-specific
> code"? What facilities beyond what POSIX with the threading extensions
> offers on a modern system do you really need? Why?

We have a spinlock implementation. When spinlocks are not available,
we have to fall back to using semaphores, which is much slower.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Jul 17, 2014 at 4:45 AM, Thor Lancelot Simon <tls@panix.com> wrote:
> Except, of course, for IEEE floating point, because the VAX's floating point
> unit simply does not provide that

Actually I think that's relevant. We usually get focused on the
concurrency because that's an area where architectures vary a lot but
it sounds like VAX barely supports multiple CPUs and generally older
architectures had fairly mundane concurrency semantics since they were
designed to work with existing toolchains. From memory it wasn't until
later Sparc chips and Alpha that people started to experiment with
looser concurrency models and expecting the toolchains to satisfy
complex constraints to make them work.

But imho the interesting thing about supporting some older
architectures is for things like smoking out assumptions that math is
IEEE floating point or whatever caused initdb to generate an initial
config that couldn't start due to requiring too much memory.

There could also be interesting(ish) performance losses if we're using
lots of floating point math on a machine where floating point is
emulated or perhaps using lots of 64-bit integers on a machine where
it's implemented by the compiler using 32-bit operations. I don't
think we're too concerned about performance on older architectures but
if it's easy enough to avoid we might want to. Or at least we might
want to know what architectures can't reasonably run a database due to
these kinds of issues.

-- 
greg
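To make the 64-bit point above concrete, here is a rough Python sketch of what a compiler has to arrange when a 64-bit addition is built from 32-bit operations. It is illustrative only and does not reflect any particular compiler's output; `MASK32` and `add64_via_32` are hypothetical names.

```python
MASK32 = 0xFFFFFFFF


def add64_via_32(a, b):
    """Add two unsigned 64-bit values using only 32-bit-wide pieces."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Add the low halves and extract the carry out of bit 31.
    lo = a_lo + b_lo
    carry = lo >> 32
    lo &= MASK32

    # Add the high halves plus the carry; the result wraps modulo 2**64.
    hi = (a_hi + b_hi + carry) & MASK32
    return (hi << 32) | lo
```

One logical addition becomes two additions plus carry handling, which is where the extra cost on a 32-bit machine comes from; multiplication and division expand considerably more.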
On Thu, Jul 17, 2014 at 4:04 PM, Johnny Billquist <bqt@update.uu.se> wrote:
> Also, VAX did not use CAS as the general paradigm for atomic writes and so
> on, but has other explicit instructions that are guaranteed to be atomic.
> NetBSD/vax doesn't use the VAX specific instructions, but emulates CAS in the
> kernel instead. But I don't remember how that extends to userland. It's
> obviously easiest if userland programs use the pthread library functions,
> which are guaranteed to work right even in multiprocessor environment.

pthread functions may work by accident in shared memory but there's no
way to be sure they won't depend on some pthread threading data
structures. In short, if you don't use pthreads you can't really count
on pthread functions to work.

We did experiment a while back with using futexes on Linux instead of
our spinlocks but the experiments didn't seem to work out.

-- 
greg
On Wed, Jun 25, 2014 at 10:50:47AM -0700, Greg Stark wrote:
> On Wed, Jun 25, 2014 at 10:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Well, the fact that initdb didn't produce a working configuration and
> > that make installcheck failed to work properly are bad. But, yeah,
> > it's not totally broken.
>
> Yeah it seems to me that these kinds of autoconf and initdb tests
> failing are different from a platform where the spinlock code doesn't
> work. It's actually valuable to have a platform where people routinely
> trigger those configuration values. If they're broken there's not much
> value in carrying them.

Well, I have to ask this question: why should there be any "vax-specific
code"? What facilities beyond what POSIX with the threading extensions
offers on a modern system do you really need? Why?

It seems to me that NetBSD/vax is a very good platform for testing one's
assumptions about whether one's code is truly portable -- because it is
a moderately weird architecture, with some resource constraints, but with
a modern kernel and runtime offering everything you'd get from a software
point of view on any other platform.

Except, of course, for IEEE floating point, because the VAX's floating point
unit simply does not provide that. But if other tests fail on the VAX or
one's source tree is littered with any other kind of VAX-specific code or
special cases for VAX, I would submit that this suggests one's code has
fairly serious architectural or implementation discipline issues.

Thor
On Thu, Jul 17, 2014 at 07:47:28AM -0400, Robert Haas wrote:
> On Wed, Jul 16, 2014 at 11:45 PM, Thor Lancelot Simon <tls@panix.com> wrote:
> > Well, I have to ask this question: why should there be any "vax-specific
> > code"? What facilities beyond what POSIX with the threading extensions
> > offers on a modern system do you really need? Why?
>
> We have a spinlock implementation. When spinlocks are not available,
> we have to fall back to using semaphores, which is much slower.

Neither pthread_mutex nor pthread_rwlock suffices?

Is the spinlock implementation in terms of the primitives provided by
atomic.h? Could it be? If so there should really be nothing unusual about
the VAX platform except the FPU.

Thor
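For context on what "a spinlock in terms of atomic primitives" can look like, here is a minimal test-and-set sketch using C11 `<stdatomic.h>`. It is not PostgreSQL's spinlock code and not NetBSD's `atomic.h`; the `demo_spin_*` names are hypothetical, and whether a given VAX toolchain provides C11 atomics is an open assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type and names: an illustration only. */
typedef struct {
    atomic_flag held;
} demo_spinlock;

static inline void demo_spin_init(demo_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* Try once; returns true if the lock was acquired. */
static inline bool demo_spin_trylock(demo_spinlock *l)
{
    /* test_and_set returns the previous state: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
static inline void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_trylock(l))
        ;
}

static inline void demo_spin_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The lock state is a single flag with no hidden per-thread bookkeeping, which is why code like this can live in memory shared between processes; a production implementation would also add back-off in the wait loop.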
On 2014-07-17 16:53, Greg Stark wrote:
> On Thu, Jul 17, 2014 at 4:45 AM, Thor Lancelot Simon <tls@panix.com> wrote:
>> Except, of course, for IEEE floating point, because the VAX's floating point
>> unit simply does not provide that
>
> Actually I think that's relevant. We usually get focused on the
> concurrency because that's an area where architectures vary a lot but
> it sounds like VAX barely supports multiple CPUs and generally older
> architectures had fairly mundane concurrency semantics since they were
> designed to work with existing toolchains. From memory it wasn't until
> later Sparc chips and Alpha that people started to experiment with
> looser concurrency models and expecting the toolchains to satisfy
> complex constraints to make them work.

Well, VAXen support multiple CPUs just fine. However, NetBSD/vax barely
have support for it. That could of course change with time, as there are
plenty of multiple CPU machines around. We just need to add support for
them in NetBSD...

Also, VAX did not use CAS as the general paradigm for atomic writes and
so on, but have other explicit instructions that are guaranteed to be
atomic. NetBSD/vax don't use the VAX specific instructions, but emulates
CAS in the kernel instead. But I don't remember how that extends to
userland. It's obviously easiest if userland programs use the pthread
library functions, which are guaranteed to work right even in
multiprocessor environment.

Implementing your own spinlocks is of course possible, but a horrible
way to use machine resources in userland.

	Johnny

-- 
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
On Wed, Jun 25, 2014 at 6:05 PM, John Klos <john@ziaspace.com> wrote:
> While I wouldn't be surprised if you remove the VAX code because not many
> people are going to be running PostgreSQL, I'd disagree with the assessment
> that this port is broken. It compiles, it initializes databases, it runs, et
> cetera, albeit not with the default postgresql.conf.

So I've been playing with this a bit. I have simh running on my home
server as a Vax 3900 with NetBSD 6.1.5. My home server was mainly
intended to be a SAN and its cpu is woefully underpowered so the
resulting VAX is actually very very slow. So slow I wonder if there's
a bug in the emulator but anyways.

I'm coming to the conclusion that the port doesn't really work
practically speaking but the failures are more interesting than I
expected. They come in a few varieties:

1) Vax does not have IEEE fp. This manifests in a few ways, some of
which may be our own bugs or missing expected outputs. The numeric
data type maths often produce numbers rounded off differently, the
floating point tests print numbers one digit shorter than our expected
results expect and sometimes in scientific notation where we don't
expect. The overflow tests generate floating point exceptions rather
than overflows. Infinity and NaN don't work. The Json code in
particular generates large numbers where +/- Infinity literals are
supplied.

There are some planner tests that fail with floating point exceptions
-- that's probably a bug on our part. And I've seen at least one
server crash (maybe two) apparently caused by one as well which I
don't believe is expected.

2) The initdb problem is actually not our fault. It looks like a
NetBSD kernel bug when allocating large shared memory blocks on a
machine without lots of memory. There's not much initdb can do with a
kernel panic...

BSD still has the problem of kern.maxfiles defaulting
to a value low enough that even two connections causes the regression
tests to run out of file descriptors. That's documented and it would
be a right pain for initdb to detect that case.

3) The tests take so long to run that autovacuum kicks in and the
tests start producing rows in inconsistent orderings. I assume that's
a problem we've run into on the CLOBBER_CACHE animals as well?

4) One of the tablesample tests seems to freeze indefinitely. I
haven't looked into why yet. That might indeed indicate that the
spinlock code isn't working?

So my conclusion tentatively is that while the port doesn't actually
work practically speaking it is nevertheless uncovering some
interesting bugs.

-- 
greg
Greg Stark <stark@mit.edu> writes:
> So I've been playing with this a bit. I have simh running on my home
> server as a Vax  3900 with NetBSD 6.1.5. My home server was mainly
> intended to be a SAN and its cpu is woefully underpowered so the
> resulting VAX is actually very very slow. So slow I wonder if there's
> a bug in the emulator but anyways.
Fun fun!
> There are some planner tests that fail with floating point exceptions
> -- that's probably a bug on our part. And I've seen at least one
> server crash (maybe two) apparently caused by one as well which I
> don't believe is expected.
That seems worth poking into.
> 3) The tests take so long to run that autovacuum kicks in and the
> tests start producing rows in inconsistent orderings. I assume that's
> a problem we've run into on the CLOBBER_CACHE animals as well?
I'd tentatively bet that it's more like planner behavioral differences
due to different floating-point roundoff.
> 4) One of the tablesample tests seems to freeze indefinitely. I
> haven't looked into why yet. That might indeed indicate that the
> spinlock code isn't working?
The tablesample tests seem like a not-very-likely first place for such a
thing to manifest.  What I'm thinking is that there are places in there
where we loop till we get an expected result.  Offhand I thought they were
all integer math; but if one was float and the VAX code wasn't doing what
was expected, maybe we could blame this on float discrepancies as well.
        regards, tom lane
			
On Thu, Aug 20, 2015 at 4:13 PM, David Brownlee <abs@absd.org> wrote:
>> 2) The initdb problem is actually not our fault. It looks like a
>> NetBSD kernel bug when allocating large shared memory blocks on a
>> machine without lots of memory. There's not much initdb can do with a
>> kernel panic...
>
> That should definitely be fixed...

cf
http://mail-index.netbsd.org/port-vax/2015/08/19/msg002524.html
http://comments.gmane.org/gmane.os.netbsd.ports.vax/5773

It's possible it's a simh bug, but it smells more like a simple overflow to me.

>> BSD still has the problem of kern.maxfiles defaulting
>> to a value low enough that even two connections causes the regression
>> tests to run out of file descriptors. That's documented and it would
>> be a right pain for initdb to detect that case.
>
> Is initdb calling ulimit() to check/set open files? It's probably worth
> it as a sanity check if nothing else.

Yup, we do that.

> I think the VAX default open_max is 128. The 'bigger' ports have a
> default of 1024, and I think they should probably all be updated to
> that, though that is orthogonal to a ulimit() check.

That's the problem. initdb tests how many connections can start up
when writing the default config. But we assume that each process can
use up to the rlimit file descriptors without running into a
system-wide limit.

Raising it to 1024 lets me get two processes running which is how I'm
running them currently.

Also I forgot to mention, I also have to raise the stack limit with
ulimit. The default is so small that Postgres calculates that the
maximum safe value for its max_stack_depth config is 0.

-- 
greg
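The two per-process limits mentioned here can be inspected with POSIX `getrlimit()`. The sketch below only illustrates that call and is not initdb's actual probing logic; `demo_soft_limit` is a hypothetical helper. It reads the soft limits for open files and stack size, and it says nothing about the system-wide `kern.maxfiles` ceiling, which is exactly the gap described above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: fetch the soft (current) limit for a resource such
 * as RLIMIT_NOFILE or RLIMIT_STACK. Returns 0 on success, -1 on failure.
 * An unlimited resource is reported as RLIM_INFINITY converted to
 * unsigned long long.
 */
static int demo_soft_limit(int resource, unsigned long long *out)
{
    struct rlimit rl;

    assert(out != NULL);
    if (getrlimit(resource, &rl) != 0)
        return -1;
    *out = (unsigned long long) rl.rlim_cur;
    return 0;
}
```

A per-process value of, say, 128 descriptors tells a program how many files one backend may open, but several backends together can still hit the kernel-wide cap first.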
On Thu, Aug 20, 2015 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> 4) One of the tablesample tests seems to freeze indefinitely. I
>> haven't looked into why yet. That might indeed indicate that the
>> spinlock code isn't working?
>
> The tablesample tests seem like a not-very-likely first place for such a
> thing to manifest. What I'm thinking is that there are places in there
> where we loop till we get an expected result. Offhand I thought they were
> all integer math; but if one was float and the VAX code wasn't doing what
> was expected, maybe we could blame this on float discrepancies as well.

Ah, I was wrong. It's not the tablesample test -- I think that was the
last one to complete. Annoyingly we don't seem to print test names
until they finish.

It was groupingsets. And it's stuck again on the same query:

regression=# select pid,now()-query_start,now()-state_change,waiting,state,query from pg_stat_activity where pid <> pg_backend_pid();
+------+-----------------+-----------------+---------+--------+------------------------------------------------------+
| pid  |    ?column?     |    ?column?     | waiting | state  |                        query                         |
+------+-----------------+-----------------+---------+--------+------------------------------------------------------+
| 9185 | 00:53:38.571552 | 00:53:38.571552 | f       | active | select a, b, grouping(a,b), sum(v), count(*), max(v)#|
|      |                 |                 |         |        | from gstest1 group by rollup (a,b);                  |
+------+-----------------+-----------------+---------+--------+------------------------------------------------------+

It's only been stuck an hour so it's possible it's still running but
this morning it was the same query that was running for 7 hours so I'm
guessing not.

Unfortunately I appear to have built without debugging symbols so
it'll be a couple days before I can rebuild with symbols to get a back
trace. (I vaguely remember when builds took hours but I don't recall
ever having to wait 48 hours for a build even back then)

-- 
greg
On 2015-08-20 16:42:21 +0100, Greg Stark wrote:
> Ah, I was wrong. It's not the tablesample test -- I think that was the
> last one to complete. Annoyingly we don't seem to print test names
> until they finish.
>
> It was groupingsets. And it's stuck again on the same query:
>
> regression=# select
> pid,now()-query_start,now()-state_change,waiting,state,query from
> pg_stat_activity where pid <> pg_backend_pid();
> +------+-----------------+-----------------+---------+--------+------------------------------------------------------+
> | pid | ?column? | ?column? | waiting | state |
> query |
> +------+-----------------+-----------------+---------+--------+------------------------------------------------------+
> | 9185 | 00:53:38.571552 | 00:53:38.571552 | f | active | select
> a, b, grouping(a,b), sum(v), count(*), max(v)#|
> | | | | | | from
> gstest1 group by rollup (a,b); |
> +------+-----------------+-----------------+---------+--------+------------------------------------------------------+
>
> It's only been stuck an hour so it's possible it's still running but
> this morning it was the same query that was running for 7 hours so I'm
> guessing not.

Interesting.

> Unfortunately I appear to have built without debugging symbols so
> it'll be a couple days before I can rebuild with symbols to get a back
> trace. (I vaguely remember when builds took hours but I don't recall
> ever having to wait 48 hours for a build even back then)

Without any further clues I'd guess it's stuck somewhere in
bipartite_match.c. That's the only place where floating point problems
would possibly result in getting stuck.

I'm all for making sure these issues are indeed caused by platform
specific float oddities, and not a more fundamental problem. But to me
the state of this port, as evidenced in this thread, seems to be too bad
to be worthwhile keeping alive. Especially since there's really no
imaginable use case except for playing around.

Greetings,

Andres Freund
On 20 August 2015 at 14:54, Greg Stark <stark@mit.edu> wrote:
> On Wed, Jun 25, 2014 at 6:05 PM, John Klos <john@ziaspace.com> wrote:
>> While I wouldn't be surprised if you remove the VAX code because not many
>> people are going to be running PostgreSQL, I'd disagree with the assessment
>> that this port is broken. It compiles, it initializes databases, it runs, et
>> cetera, albeit not with the default postgresql.conf.
>
> So I've been playing with this a bit. I have simh running on my home
> server as a Vax 3900 with NetBSD 6.1.5. My home server was mainly
> intended to be a SAN and its cpu is woefully underpowered so the
> resulting VAX is actually very very slow. So slow I wonder if there's
> a bug in the emulator but anyways.

I've run NetBSD/vax in simh on a laptop with a 2.5GHz i5-2520M and it
feels "reasonably fast for a VAX" (make of that what you will :)

> I'm coming to the conclusion that the port doesn't really work
> practically speaking but the failures are more interesting than I
> expected. They come in a few varieties:

Mmm. Edge cases and failing (probably reasonable :) assumptions.

> 1) Vax does not have IEEE fp. This manifests in a few ways, some of
> which may be our own bugs or missing expected outputs. The numeric
> data type maths often produce numbers rounded off differently, the
> floating point tests print numbers one digit shorter than our expected
> results expect and sometimes in scientific notation where we don't
> expect. The overflow tests generate floating point exceptions rather
> than overflows. Infinity and NaN don't work. The Json code in
> particular generates large numbers where +/- Infinity literals are
> supplied.
>
> There are some planner tests that fail with floating point exceptions
> -- that's probably a bug on our part. And I've seen at least one
> server crash (maybe two) apparently caused by one as well which I
> don't believe is expected.

Sounds like some useful test cases there.

> 2) The initdb problem is actually not our fault. It looks like a
> NetBSD kernel bug when allocating large shared memory blocks on a
> machine without lots of memory. There's not much initdb can do with a
> kernel panic...

That should definitely be fixed...

> BSD still has the problem of kern.maxfiles defaulting
> to a value low enough that even two connections causes the regression
> tests to run out of file descriptors. That's documented and it would
> be a right pain for initdb to detect that case.

Is initdb calling ulimit() to check/set open files? It's probably worth
it as a sanity check if nothing else.

I think the VAX default open_max is 128. The 'bigger' ports have a
default of 1024, and I think they should probably all be updated to
that, though that is orthogonal to a ulimit() check.

> 3) The tests take so long to run that autovacuum kicks in and the
> tests start producing rows in inconsistent orderings. I assume that's
> a problem we've run into on the CLOBBER_CACHE animals as well?

Can the autovacuum daemon settings be bumped/disabled while running the
tests?

> 4) One of the tablesample tests seems to freeze indefinitely. I
> haven't looked into why yet. That might indeed indicate that the
> spinlock code isn't working?
>
> So my conclusion tentatively is that while the port doesn't actually
> work practically speaking it is nevertheless uncovering some
> interesting bugs.

Good to hear. Looks like bugs in both the OS and software side, so fun
for all.

Thanks for taking the time to do this!

David
[- the vax lists since they cause majordomo confirmation emails for
anyone responding]

On Thu, Aug 20, 2015 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> There are some planner tests that fail with floating point exceptions
>> -- that's probably a bug on our part. And I've seen at least one
>> server crash (maybe two) apparently caused by one as well which I
>> don't believe is expected.
>
> That seems worth poking into.

Mea culpa. Not a planner crash but rather an overflow from exp(). It
turns out exp() and other math library functions on Vax do not signal
FPE but rather have a curious api that lets us catch the overflow by
defining a function "infnan()" to call when it overflows. If we don't
define that function then it executes an illegal instruction which
generates SIGILL with errno set to EDOM (iirc). For the moment I've
just attached our FPE handler to SIGILL and that's letting me run the
tests without crashes. It's probably just silly make-work but it would
be pretty easy to add a simple function to call our FPE handler
directly to avoid having to have a SIGILL handler which seems like a
bad idea in general.

>> 4) One of the tablesample tests seems to freeze indefinitely. I
>> haven't looked into why yet. That might indeed indicate that the
>> spinlock code isn't working?
>
> The tablesample tests seem like a not-very-likely first place for such a
> thing to manifest. What I'm thinking is that there are places in there
> where we loop till we get an expected result. Offhand I thought they were
> all integer math; but if one was float and the VAX code wasn't doing what
> was expected, maybe we could blame this on float discrepancies as well.

The hang is actually in the groupingset tests in
bipartite_match.c:hk_breadth_search().

Looking at that function it's not surprising that it doesn't work
without IEEE floats given that the first line is

  distance[0] = get_float4_infinity();

And the return value of the function is

  !isinf(distance[0]);

The other place where non-IEEE floats are causing problems internal to
postgres appears to be inside spgist -- even when planning queries
using spgist:

  EXPLAIN (COSTS OFF)
  SELECT count(*) FROM radix_text_tbl WHERE t < 'Aztec Ct ';
! ERROR: floating-point exception
! DETAIL: An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.

Other than these two places I think all the other failures are
user-visible arithmetic producing different results or getting SIGFPE
instead of displaying Inf/-Inf/NaN values. Some of the results look
rather suspect but I assume there's some numerically sensitive
arithmetic that's producing them.

-- 
greg
Greg Stark <stark@mit.edu> writes:
> On Thu, Aug 20, 2015 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That seems worth poking into.
> Mea culpa. Not a planner crash but rather an overflow from exp(). It
> turns out exp() and other math library functions on Vax do not signal
> FPE but rather have a curious api that lets us catch the overflow by
> defining a function "infnan()" to call when it overflows. If we don't
> define that function then it executes an illegal instruction which
> generates SIGILL with errno set to EDOM (iirc). For the moment I've
> just attached our FPE handler to SIGILL and that's letting me run the
> tests without crashes. It's probably just silly make-work but it would
> be pretty easy to add a simple function to call our FPE handler
> directly to avoid having to have a SIGILL handler which seems like a
> bad idea in general.
Why not define infnan() and make it do the same thing as 
FloatExceptionHandler?  Or was that what you meant?
> The hang is actually in the groupingset tests in
> bipartite_match.c:hk_breadth_search().
> Looking at that function it's not surprising that it doesn't work
> without IEEE floats given that the first line is
>   distance[0] = get_float4_infinity();
> And the return value of the function is
>   !isinf(distance[0]);
Is it that function itself that's hanging, or is the problem that its
caller expects it to ultimately return true, and it never does?
I don't think we're really insisting on a true infinity here, only that
get_float4_infinity() produce a large value that isinf() will recognize.
I'm surprised that any of the hacks in src/port/isinf.c compile on Vax
at all --- did you invent a new one?
Also, I'd have thought that both get_floatX_infinity and get_floatX_nan
would be liable to produce SIGFPE on non-IEEE machines.  Did you mess
with those?
> The other place where non-IEEE floats are causing problems internal to
> postgres appears to be inside spgist -- even when planning queries
> using spgist:
That's pretty odd --- it does not look like spgcostestimate does anything
very exciting.  Can you get a stack trace showing where that FPE happens?
        regards, tom lane
			
On 22 Aug 2015 18:02, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>
> Why not define infnan() and make it do the same thing as
> FloatExceptionHandler? Or was that what you meant?

That's exactly what I meant, instead of my quick hack to add a signal
handler for sigill.

> > The hang is actually in the groupingset tests in
> > bipartite_match.c:hk_breadth_search().
...
> Is it that function itself that's hanging, or is the problem that its
> caller expects it to ultimately return true, and it never does?

I think it never exits that function but I'll try it again.

> I don't think we're really insisting on a true infinity here, only that
> get_float4_infinity() produce a large value that isinf() will recognize.
>
> I'm surprised that any of the hacks in src/port/isinf.c compile on Vax
> at all --- did you invent a new one?
>
> Also, I'd have thought that both get_floatX_infinity and get_floatX_nan
> would be liable to produce SIGFPE on non-IEEE machines. Did you mess
> with those?

I didn't do anything. There's no isinf.o in that directory so I don't
think anything got compiled. There are other files in src/port but not
that.

> > The other place where non-IEEE floats are causing problems internal to
> > postgres appears to be inside spgist -- even when planning queries
> > using spgist:
>
> That's pretty odd --- it does not look like spgcostestimate does anything
> very exciting. Can you get a stack trace showing where that FPE happens?

Hmm. The backtrace is here but I think it's lying about the specific line.

#0  convert_one_string_to_scalar (value=0x7f20e9a3 "  ",
    rangelo=32, rangehi=122, 2132863395, 32, 122)
    at selfuncs.c:3873
#1  0x00435880 in convert_string_to_scalar (
    value=0x7f20e990 "Aztec\n", ' ' <repeats 11 times>, "Ct  ",
    scaledvalue=0x7fffdb44,
    lobound=0x7f225bf4 "Audrey", ' ' <repeats 24 times>, "Dr  ",
    scaledlobound=0x7fffdb34,
    hibound=0x7f225c40 "Balmoral", ' ' <repeats 22 times>, "Dr  ",
    scaledhibound=0x7fffdb3c, 2132863376, 2147474244, 2132958196,
    2147474228, 2132958272, 2147474236) at selfuncs.c:3847

Stepping through the code it looks like it actually happens on line
3882 when denom overflows.

(gdb) n
3882            denom *= base;
3: denom = 1.666427615935998e+37
2: num = 0.37361810145459621
1: slen = 0
(gdb) n

Program received signal SIGFPE, Arithmetic exception.
convert_one_string_to_scalar (value=0x7f20e9a4 " ",
    rangelo=32, rangehi=122, 2132863396, 32, 122)
    at selfuncs.c:3873
Greg Stark <stark@mit.edu> writes:
> Hmm. The backtrace is here but I think it's lying about the specific line.
> #0  convert_one_string_to_scalar (value=0x7f20e9a3 "  ",
>     rangelo=32, rangehi=122, 2132863395, 32, 122)
>     at selfuncs.c:3873
> #1  0x00435880 in convert_string_to_scalar (
>     value=0x7f20e990 "Aztec\n", ' ' <repeats 11 times>, "Ct  ",
> scaledvalue=0x7fffdb44,
>     lobound=0x7f225bf4 "Audrey", ' ' <repeats 24 times>, "Dr  ",
> scaledlobound=0x7fffdb34,
>     hibound=0x7f225c40 "Balmoral", ' ' <repeats 22 times>, "Dr  ",
>     scaledhibound=0x7fffdb3c, 2132863376, 2147474244, 2132958196,
> 2147474228, 2132958272, 2147474236) at selfuncs.c:3847
> Stepping through the code it looks like it actually happens on line
> 3882 when denom overflows.
Oh, interesting.  The largest possible value of "base" is 256, and the
code limits the amount of string it'll scan to 20 bytes, which means
"denom" could reach at most 256^21 = 3.7e50.  Perfectly fine with
IEEE-math doubles, but not so much with other arithmetics.
We could hold the worst-case value to within the VAX range if we
considered only about 14 bytes instead of 20.  Probably that wouldn't
lose much in terms of estimation accuracy, but given the lack of
complaints to date, I'm not sure we should change it ...
        regards, tom lane
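The "about 14 bytes" figure can be checked with a few lines of arithmetic. The sketch below models only the growth of `denom` as described in this message (it starts at `base` and is multiplied by `base` once per byte); it is not the selfuncs.c code, `demo_max_safe_bytes` is a hypothetical name, and the 1.7e38 ceiling used in the usage note is the commonly cited approximate maximum of the VAX F/D floating formats, taken here as an assumption.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many string bytes can be consumed before
 * denom = base^(n+1) would exceed `limit`?
 */
static int demo_max_safe_bytes(double base, double limit)
{
    int n = 0;
    double denom = base;            /* value before the first byte */

    assert(base > 1.0 && limit >= base);
    /* Compare against limit / base so the multiply itself never overflows. */
    while (denom <= limit / base) {
        denom *= base;
        n++;
    }
    return n;
}
```

With `base = 256` and `limit = 1.7e38` this returns 14, since 256^15 is about 1.3e36 while 256^16 is about 3.4e38. The 20-byte scan would need 256^21, roughly 3.7e50, which is harmless for IEEE doubles but far outside that range.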
			
		Greg Stark <stark@mit.edu> writes:
> On 22 Aug 2015 18:02, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>> The hang is actually in the groupingset tests in
>>> bipartite_match.c:hk_breadth_search().
>> Is it that function itself that's hanging, or is the problem that its
>> caller expects it to ultimately return true, and it never does?
> I think it never exits that function but I'll try it again.
I looked at that some more, and so far as I can see, its dependence on
Infinity, or really its use of float arithmetic at all, is a dumb-ass
idea.  The distances are integers, and not very large ones either.
Which is fortunate, because if they did get large, you'd be having
problems with lost precision (ie, x+1 == x) somewhere around 2^24, long
before you got anywhere near exceeding the range of float or even int.
I think we should replace the whole mess with, say, uint32 for float and
UINT_MAX for infinity.  That will be more portable, probably faster, and
it will work correctly up to substantially *larger* peak distances than
the existing code.
>> I'm surprised that any of the hacks in src/port/isinf.c compile on Vax
>> at all --- did you invent a new one?
>> 
>> Also, I'd have thought that both get_floatX_infinity and get_floatX_nan
>> would be liable to produce SIGFPE on non-IEEE machines.  Did you mess
>> with those?
> I didn't do anything. There's no isinf.o in that directory so I don't
> think anything got compiled. There are other files in src/port but not
> that.
Some googling produced NetBSD man pages saying that isinf() and isnan()
are "not supported" on VAX.  Given that your build is evidently finding
system-provided versions of them, it's a good bet that they are hard-wired
to produce 0.  That would mean hk_breadth_search() necessarily returns
true, which would put its sole caller into an infinite loop.  So quite
aside from VAX, this coding is utterly dependent on the assumption that
get_float4_infinity() produces something that isinf() will return true
for.  I do not believe we have a hard dependency on that anywhere else.
        regards, tom lane
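The lost-precision remark (x+1 == x near 2^24) is easy to reproduce. This is a small sketch assuming a float format with a 24-bit significand, as in IEEE single precision; the function names are hypothetical and the code is unrelated to bipartite_match.c itself.

```c
#include <assert.h>
#include <stdbool.h>

/* True if adding 1 to x in float arithmetic leaves the value unchanged. */
static bool demo_increment_is_lost(float x)
{
    volatile float y = x + 1.0f;    /* volatile forces rounding to float */

    return y == x;
}

/* Smallest power of two at which a float counter stops advancing. */
static float demo_first_stuck_power_of_two(void)
{
    float x = 1.0f;

    while (!demo_increment_is_lost(x)) {
        assert(x < 1e30f);          /* guard against a runaway loop */
        x *= 2.0f;
    }
    return x;
}
```

On an IEEE machine `demo_first_stuck_power_of_two()` yields 16777216 (2^24): a float "distance" that reached that value could no longer be incremented, whereas a 32-bit integer keeps counting exactly up to its maximum.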
			
		I wrote:
> I think we should replace the whole mess with, say, uint32 for float and
> UINT_MAX for infinity.  That will be more portable, probably faster, and
> it will work correctly up to substantially *larger* peak distances than
> the existing code.
After studying the logic a bit more, I realized that the "finite"
distances computed by the algorithm can actually never exceed u_size,
which we're already constraining to be less than SHRT_MAX so that the
adjacency arrays can be "short".  So I made it use "short" storage for
distances too, with SHRT_MAX as the infinity value.  If we ever find a
need to work with graphs exceeding 32K nodes, it will be trivial to
s/short/int/g in this code.
        regards, tom lane
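To illustrate the idea of an integer sentinel standing in for float infinity, here is a generic breadth-first search that stores hop counts in `short` and uses `SHRT_MAX` for "unreached". It is a simplified stand-in, not the Hopcroft-Karp code in bipartite_match.c; `DEMO_NODES`, `DEMO_INF`, and `demo_bfs` are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define DEMO_NODES 5
#define DEMO_INF   SHRT_MAX     /* integer sentinel instead of float infinity */

/*
 * Breadth-first search from `start`. Fills dist[] with hop counts and
 * leaves DEMO_INF for unreachable nodes. adj[i][j] means an edge i -> j.
 * Returns true if `target` was reached.
 */
static bool demo_bfs(bool adj[DEMO_NODES][DEMO_NODES],
                     int start, int target, short dist[DEMO_NODES])
{
    int queue[DEMO_NODES];      /* each node is enqueued at most once */
    int head = 0, tail = 0;

    assert(start >= 0 && start < DEMO_NODES);
    assert(target >= 0 && target < DEMO_NODES);

    for (int i = 0; i < DEMO_NODES; i++)
        dist[i] = DEMO_INF;
    dist[start] = 0;
    queue[tail++] = start;

    while (head < tail) {
        int u = queue[head++];

        for (int v = 0; v < DEMO_NODES; v++) {
            if (adj[u][v] && dist[v] == DEMO_INF) {
                dist[v] = (short) (dist[u] + 1);
                queue[tail++] = v;
            }
        }
    }
    return dist[target] != DEMO_INF;    /* exact integer comparison */
}
```

Because the "was it reached?" test is an integer comparison, it behaves the same on any platform, with no reliance on `isinf()` or on the float format providing an infinity.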
			
		I wrote:
> Oh, interesting.  The largest possible value of "base" is 256, and the
> code limits the amount of string it'll scan to 20 bytes, which means
> "denom" could reach at most 256^21 = 3.7e50.  Perfectly fine with
> IEEE-math doubles, but not so much with other arithmetics.
> We could hold the worst-case value to within the VAX range if we
> considered only about 14 bytes instead of 20.  Probably that wouldn't
> lose much in terms of estimation accuracy, but given the lack of
> complaints to date, I'm not sure we should change it ...
On further reflection, there seems little reason not to change it: it's
pretty silly to imagine that selectivity estimates produced via this
technique would have anything like 14 decimal places of precision anyhow.
We've already stripped off any common prefix of the strings we're
comparing, so the strings are certain to differ at the first byte.
Reducing the maximum number of bytes considered will save cycles and I
really doubt that it would cost anything in estimation accuracy.
        regards, tom lane
			
		I wrote:
> On further reflection, there seems little reason not to change it: it's
> pretty silly to imagine that selectivity estimates produced via this
> technique would have anything like 14 decimal places of precision anyhow.
I've done something about both this and the bipartite_match issue in HEAD.
I'd be curious to see all the remaining regression differences on VAX.
        regards, tom lane
			
On Sun, Aug 23, 2015 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I've done something about both this and the bipartite_match issue in HEAD.
> I'd be curious to see all the remaining regression differences on VAX.

I'll run a git pull overnight :)

-- 
greg
On Thu, Aug 20, 2015 at 04:32:19PM +0100, Greg Stark wrote:
>
> That's the problem. initdb tests how many connections can start up
> when writing the default config. But we assume that each process can
> use up to the rlimit file descriptors without running into a
> system-wide limit.

That sounds like a fairly bogus assumption -- unless the system-wide
limit is to be meaningless.

The default NetBSD limits on the VAX are probably still too low, however.

Thor
Attached is the pg_regress diff. I believe they are all user-visible
effects of non-IEEE fp math though I would have expected the rounding
to work right and I'm not clear how gist ends up returning rows in a
different order.
There are still two local changes: the SIGILL handler is set to the
FPE handler function, and initdb is forced to allocate a smaller
shared memory segment and a smaller number of connections to avoid the
kernel panic. I'm running the regression tests with MAX_CONNECTIONS=2.
The regression tests take 7h20m to run. (The git pull actually
only took 7m and rebuilding took three hours.)
test tablespace               ... ok
parallel group (20 tests, in groups of 2):  char boolean name varchar
int2 text int4 int8 oid float4 float8 bit txid numeric uuid enum money
rangetypes regproc pg_lsn
     boolean                  ... ok
     char                     ... ok
     name                     ... ok
     varchar                  ... ok
     text                     ... ok
     int2                     ... FAILED
     int4                     ... FAILED
     int8                     ... FAILED
     oid                      ... ok
     float4                   ... FAILED
     float8                   ... FAILED
     bit                      ... ok
     numeric                  ... FAILED
     txid                     ... ok
     uuid                     ... ok
     enum                     ... ok
     money                    ... ok
     rangetypes               ... ok
     pg_lsn                   ... ok
     regproc                  ... ok
test strings                  ... ok
test numerology               ... FAILED
parallel group (20 tests, in groups of 2):  lseg point box line path
polygon circle date time timetz timestamp timestamptz abstime interval
reltime tinterval macaddr inet comments tstypes
     point                    ... FAILED
     lseg                     ... ok
     line                     ... FAILED
     box                      ... ok
     path                     ... ok
     polygon                  ... FAILED
     circle                   ... FAILED
     date                     ... ok
     time                     ... ok
     timetz                   ... ok
     timestamp                ... ok
     timestamptz              ... ok
     interval                 ... FAILED
     abstime                  ... ok
     reltime                  ... ok
     tinterval                ... ok
     inet                     ... ok
     macaddr                  ... ok
     tstypes                  ... ok
     comments                 ... ok
parallel group (6 tests, in groups of 2):  geometry horology regex
oidjoins type_sanity opr_sanity
     geometry                 ... FAILED
     horology                 ... ok
     regex                    ... ok
     oidjoins                 ... ok
     type_sanity              ... ok
     opr_sanity               ... ok
test insert                   ... ok
test insert_conflict          ... ok
test create_function_1        ... ok
test create_type              ... ok
test create_table             ... ok
test create_function_2        ... ok
parallel group (2 tests):  copyselect copy
     copy                     ... ok
     copyselect               ... ok
parallel group (2 tests):  create_operator create_misc
     create_misc              ... ok
     create_operator          ... ok
parallel group (2 tests):  create_view create_index
     create_index             ... ok
     create_view              ... ok
parallel group (13 tests, in groups of 2):  create_aggregate
create_function_3 create_cast constraints triggers inherit typed_table
create_table_like drop_if_exists vacuum rolenames updatable_views
roleattributes
     create_aggregate         ... ok
     create_function_3        ... ok
     create_cast              ... ok
     constraints              ... ok
     triggers                 ... ok
     inherit                  ... ok
     create_table_like        ... ok
     typed_table              ... ok
     vacuum                   ... ok
     drop_if_exists           ... ok
     updatable_views          ... FAILED
     rolenames                ... ok
     roleattributes           ... ok
test sanity_check             ... ok
test errors                   ... ok
test select                   ... ok
parallel group (20 tests, in groups of 2):  select_distinct
select_into select_distinct_on select_implicit select_having subselect
case union aggregates join random transactions portals arrays
hash_index btree_index namespace update delete prepared_xacts
     select_into              ... ok
     select_distinct          ... ok
     select_distinct_on       ... ok
     select_implicit          ... ok
     select_having            ... ok
     subselect                ... ok
     union                    ... FAILED
     case                     ... ok
     join                     ... ok
     aggregates               ... FAILED
     transactions             ... ok
     random                   ... ok
     portals                  ... ok
     arrays                   ... FAILED
     btree_index              ... ok
     hash_index               ... ok
     update                   ... ok
     namespace                ... ok
     prepared_xacts           ... ok
     delete                   ... ok
parallel group (14 tests, in groups of 2):  gin brin spgist gist
security_label privileges collate matview lock replica_identity
object_address rowsecurity groupingsets tablesample
     brin                     ... ok
     gin                      ... ok
     gist                     ... FAILED
     spgist                   ... ok
     privileges               ... ok
     security_label           ... ok
     collate                  ... ok
     matview                  ... ok
     lock                     ... ok
     replica_identity         ... ok
     rowsecurity              ... ok
     object_address           ... ok
     tablesample              ... ok
     groupingsets             ... ok
parallel group (5 tests, in groups of 2):  alter_operator
alter_generic psql misc async
     alter_generic            ... ok
     alter_operator           ... ok
     misc                     ... ok
     psql                     ... ok
     async                    ... ok
test rules                    ... ok
parallel group (19 tests, in groups of 2):  portals_p2 select_views
cluster foreign_key guc dependency combocid bitmapops tsdicts tsearch
window foreign_data xmlmap functional_deps advisory_lock json jsonb
indirect_toast equivclass
     select_views             ... ok
     portals_p2               ... ok
     foreign_key              ... ok
     cluster                  ... ok
     dependency               ... ok
     guc                      ... ok
     bitmapops                ... ok
     combocid                 ... ok
     tsearch                  ... ok
     tsdicts                  ... ok
     foreign_data             ... ok
     window                   ... FAILED
     xmlmap                   ... ok
     functional_deps          ... ok
     advisory_lock            ... ok
     json                     ... FAILED
     jsonb                    ... ok
     indirect_toast           ... ok
     equivclass               ... ok
parallel group (19 tests, in groups of 2):  limit plancache copy2
plpgsql temp domain prepare rangefuncs conversion without_oid truncate
alter_table sequence polymorphism returning rowtypes with largeobject
xml
     plancache                ... ok
     limit                    ... ok
     plpgsql                  ... ok
     copy2                    ... ok
     temp                     ... ok
     domain                   ... ok
     rangefuncs               ... ok
     prepare                  ... ok
     without_oid              ... ok
     conversion               ... ok
     truncate                 ... ok
     alter_table              ... ok
     sequence                 ... ok
     polymorphism             ... ok
     rowtypes                 ... ok
     returning                ... ok
     largeobject              ... ok
     with                     ... ok
     xml                      ... ok
test event_trigger            ... ok
test stats                    ... ok
============== shutting down postmaster               ==============
			
Greg Stark <stark@mit.edu> writes:
> Attached is the pg_regress diff. I believe they are all user-visible
> effects of non-IEEE FP math though I would have expected the rounding
> to work right and I'm not clear how gist ends up returning rows in a
> different order.
I concur that these are generally unsurprising given what we know about
VAX arithmetic.  The tests that give different integer rounding results
are specifically checking whether the platform does round-to-nearest-even
as specified by IEEE.  It's not surprising that pre-IEEE platforms might
not have chosen that behavior.  The other stuff is due to different
range and precision of FP math, get_floatX_infinity() returning HUGE_VAL
rather than a true infinity, get_floatX_nan() throwing a SIGFPE, etc.
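To make the rounding point concrete, here is a minimal Python sketch contrasting the two tie-breaking rules. It is only an illustration, not PostgreSQL code: the helper names are made up, and Python's decimal module stands in for whatever the hardware or C library does on a given platform. Ties away from zero is shown as one possible non-IEEE choice, not as a statement of what the VAX actually does.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_half_even(x: float) -> int:
    """Round to nearest; ties go to the even neighbour (the IEEE 754 default)."""
    return int(Decimal(x).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))

def round_half_away(x: float) -> int:
    """Round to nearest; ties go away from zero (one possible non-IEEE choice)."""
    return int(Decimal(x).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

# Exact halves are representable in binary, so these inputs are true ties.
ties = [0.5, 1.5, 2.5, -0.5, -1.5, -2.5]
half_even = [round_half_even(t) for t in ties]  # [0, 2, 2, 0, -2, -2]
half_away = [round_half_away(t) for t in ties]  # [1, 2, 3, -1, -2, -3]
```

A test that casts 0.5 or 2.5 to an integer and expects 0 or 2 will report a mismatch on any platform that uses the second rule, even though nothing is broken.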
The gist tests in question appear to me to be underdetermined by design
--- for example, the first one is
select p from gist_tbl where p <@ box(point(0,0), point(0.5, 0.5))
order by p <-> point(0.2, 0.2);
and so there is nothing wrong with ordering (0.15,0.15) and (0.25,0.25)
differently, because they're exactly the same distance from (0.2,0.2).
I'm not sure why we've not seen more platform-specific failures on that
test.  Given that it's only existed since Nov 2014, maybe we shouldn't
assume that it's been through the wars yet.  I'm tempted to change the
reference point to (0.201,0.201) or so, so that the correct sort order
is unambiguous.  Heikki, did you make it like that intentionally?
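The tie can be checked with exact rational arithmetic. The sketch below is an illustration in Python, not anything taken from the test suite; the helper names are hypothetical. It shows that the two points are equidistant from (0.2,0.2) and that moving the reference point to (0.201,0.201) breaks the tie.

```python
import math
from fractions import Fraction

def pt(x: str, y: str):
    """Build a point from decimal strings so the coordinates are exact."""
    return (Fraction(x), Fraction(y))

def sq_dist(p, q):
    """Squared Euclidean distance; enough for comparing orderings."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

a = pt("0.15", "0.15")
b = pt("0.25", "0.25")

# With the original reference point the two distances are exactly equal,
# so either output order is a correct answer to the ORDER BY.
tie = sq_dist(a, pt("0.2", "0.2")) == sq_dist(b, pt("0.2", "0.2"))

# With the shifted reference point, (0.25,0.25) is strictly closer.
shifted = pt("0.201", "0.201")
b_closer_exact = sq_dist(b, shifted) < sq_dist(a, shifted)

# The gap is far larger than float rounding error, so ordinary doubles agree.
b_closer_float = (math.dist((0.25, 0.25), (0.201, 0.201))
                  < math.dist((0.15, 0.15), (0.201, 0.201)))
```

With the original reference point, the order that actually comes out depends on how each platform rounds the intermediate float results, which is why the expected output is not portable.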
We could eliminate the unexpected FPEs on use of "NaN" if we configured
get_floatX_nan() to throw a "platform does not support NaN" error rather
than intentionally executing an undefined operation.  However, I'm not
sure why we'd bother unless we're going to make VAX a supported platform,
and personally I don't want to change the other tests that are failing
here.
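As a rough sketch of that idea, in Python rather than C and with a made-up function name and flag, the NaN getter would fail cleanly instead of performing an operation whose result the platform does not define:

```python
import math

def get_float8_nan(platform_has_nan: bool = True) -> float:
    """Hypothetical stand-in for a NaN getter.

    platform_has_nan is an assumed build-time fact, passed as an argument
    here only so that both paths can be exercised.
    """
    if not platform_has_nan:
        # Report a clear error rather than provoking a floating-point trap.
        raise NotImplementedError("platform does not support NaN")
    return math.nan
```

On a platform without NaN, input such as 'NaN'::float8 would then produce an ordinary error message rather than a SIGFPE.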
        regards, tom lane
			
For completeness, here's the regression tests from the contrib modules. I haven't looked into why earthdistance is coming up with such odd results but I suspect it all comes from the same arithmetic source. I don't see any surprising internal dependencies on IEEE floating point.
For what it's worth there are a number of mentions in the docs of platforms that have non-IEEE semantics behaving differently so I wouldn't say we don't support such platforms. If we could avoid the test failures without weakening the tests for other platforms that would be nice. But I don't see any obvious way to do that.
Greg Stark <stark@mit.edu> writes:
> For completeness, here's the regression tests from the contrib
> modules. I haven't looked into why earthdistance is coming up with
> such odd results but I suspect it all comes from the same arithmetic
> source. I don't see any surprising internal dependencies on IEEE
> floating point.
I think the tests that are giving unexpected results are simply
doing things that are numerically unstable.  For instance, in the
first test that's giving a problem:
  SELECT longitude(ll_to_earth(90,0))::numeric(20,10);
!   longitude   
! --------------
!  0.0000000000
  (1 row)
  
  SELECT longitude(ll_to_earth(-45,0))::numeric(20,10);
--- 365,373 ----
  (1 row)
  
  SELECT longitude(ll_to_earth(90,0))::numeric(20,10);
!    longitude    
! ----------------
!  180.0000000000
  (1 row)
the very first thing that happens inside ll_to_earth is
"cos(radians(90))".  The exact answer to that of course should be zero,
but it never will be zero because pi/2 isn't exactly representable in
anybody's float arithmetic.  On my Intel machine it gives
6.12323399573677e-17, and it would be far from astonishing if the VAX's
arithmetic instead gives some very small negative value.  Such a sign
change would result in the observed flip.
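The flip is easy to reproduce on an IEEE machine. The Python sketch below assumes that longitude is derived as degrees(atan2(y, x)) from the Cartesian coordinates; that formula and the helper name are assumptions made for illustration, not a quotation of the earthdistance source. The negative case is simulated by negating the value, since no VAX result is available here.

```python
import math

def longitude_deg(x: float, y: float) -> float:
    """Hypothetical stand-in: longitude as degrees(atan2(y, x))."""
    return math.degrees(math.atan2(y, x))

# cos(pi/2) is tiny but not zero, because pi/2 is not exactly representable.
c = math.cos(math.radians(90))
tiny = abs(c)

# At the pole y is 0, so the sign of the tiny x value alone picks the answer.
lon_if_positive = longitude_deg(tiny, 0.0)    # 0.0
lon_if_negative = longitude_deg(-tiny, 0.0)   # 180.0
```

Both answers describe the same physical point, since longitude is meaningless at the pole, but the regression test compares the printed text.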
> For what it's worth there are a number of mentions in the docs of
> platforms that have non-IEEE semantics behaving differently so I
> wouldn't say we don't support such platforms. If we could avoid the
> test failures without weakening the tests for other platforms that
> would be nice. But I don't see any obvious way to do that.
Yeah.  The NaN and Infinity behavioral changes seem like a big problem.
And even if we wanted to carry alternative expected-files, how would
we maintain them?  Can't ask people to spin up a VAX emulator to submit
a patch.
        regards, tom lane
			
		I wrote:
> Greg Stark <stark@mit.edu> writes:
>> For what it's worth there are a number of mentions in the docs of
>> platforms that have non-IEEE semantics behaving differently so I
>> wouldn't say we don't support such platforms. If we could avoid the
>> test failures without weakening the tests for other platforms that
>> would be nice. But I don't see any obvious way to do that.
> Yeah.  The NaN and Infinity behavioral changes seem like a big problem.
... having said that, it is fair to wonder why the pgstatindex() tests
involve NaNs at all.  It's one thing to cope with NaNs if they're present
in user input or the user does a computation that will produce them,
but defining database services that produce NaNs for no especially good
reason is something else again.
A look in the git logs suggests that this particular behavior was
probably completely accidental.  We memorialized it as official in commits
af7d18129 & bd165757f, but I would bet a good lunch that zero thought went
into it before that.
I don't care enough about this point to push it forward myself, but
I would support a change to make pgstatindex() return zero rather than
NaN for these statistics on empty indexes.
        regards, tom lane
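A minimal sketch of the suggested behavior, assuming the statistic is a percentage computed from a ratio whose denominator is zero for an empty index. The function and parameter names are hypothetical and the formula is only illustrative. Note that in C on IEEE hardware 0.0/0.0 quietly yields NaN, whereas Python raises an exception on division by zero, so this models only the guard:

```python
def avg_leaf_density(free_space: int, max_avail: int) -> float:
    """Hypothetical density statistic, in percent.

    Returns 0.0 for an empty index (max_avail == 0) instead of
    evaluating 0/0, which is where a NaN would otherwise come from.
    """
    if max_avail == 0:
        return 0.0
    return 100.0 - (free_space / max_avail) * 100.0

empty_index = avg_leaf_density(0, 0)       # 0.0
quarter_free = avg_leaf_density(25, 100)   # 75.0
```

With a guard of this kind, the output for an empty index would not depend on whether the platform has a NaN at all.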