Thread: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Umair Shahid
Date:
On Fri, Jun 24, 2016 at 2:14 AM, Umair Shahid <umair.shahid@2ndquadrant.com> wrote:

---------- Forwarded message ----------
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, Jun 23, 2016 at 9:32 PM
Subject: Re: [pgsql-packagers] PG 9.6beta2 tarballs are ready
To: Magnus Hagander <magnus@hagander.net>
Cc: Umair Shahid <umair.shahid@2ndquadrant.com>, Dave Page <dpage@postgresql.org>, PostgreSQL Packagers <pgsql-packagers@postgresql.org>


Magnus Hagander <magnus@hagander.net> writes:
> That makes more sense as the joinrel stuff *has* been changed between the
> two betas. I'm sure someone who's touched that code (Tom?) can comment on
> that part..

It still makes little sense to me, as the previous reports say that the
problem happened during bootstrap, and the planner does not run
during bootstrap.

Could we get a look at debug_query_string in the coredump, to possibly
narrow down where the crash is really happening?

Moving thread to -hackers ... 

debug_query_string is


"INSERT INTO pg_description  SELECT t.objoid, c.oid, t.objsubid, t.description   FROM tmp_pg_description t, pg_class c     WHERE c.relname = t.classname;"

Happening in "setup_description"
 

> It's still strange that it doesn't affect woodlouse.

Or any of the other Windows critters...

                        regards, tom lane



--
Umair Shahid
2ndQuadrant - The PostgreSQL Support Company

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:
On 24 June 2016 at 05:17, Umair Shahid <umair.shahid@gmail.com> wrote:
On Fri, Jun 24, 2016 at 2:14 AM, Umair Shahid <umair.shahid@2ndquadrant.com> wrote:

---------- Forwarded message ----------
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, Jun 23, 2016 at 9:32 PM
Subject: Re: [pgsql-packagers] PG 9.6beta2 tarballs are ready
To: Magnus Hagander <magnus@hagander.net>
Cc: Umair Shahid <umair.shahid@2ndquadrant.com>, Dave Page <dpage@postgresql.org>, PostgreSQL Packagers <pgsql-packagers@postgresql.org>


Magnus Hagander <magnus@hagander.net> writes:
> That makes more sense as the joinrel stuff *has* been changed between the
> two betas. I'm sure someone who's touched that code (Tom?) can comment on
> that part..

It still makes little sense to me, as the previous reports say that the
problem happened during bootstrap, and the planner does not run
during bootstrap.

Could we get a look at debug_query_string in the coredump, to possibly
narrow down where the crash is really happening?

Moving thread to -hackers ... 

debug_query_string is


"INSERT INTO pg_description  SELECT t.objoid, c.oid, t.objsubid, t.description   FROM tmp_pg_description t, pg_class c     WHERE c.relname = t.classname;"

Happening in "setup_description"
 

 

I was helping Haroon with this last night. I don't have access to the original thread and he's not around so I don't know how much he said. I'll repeat our findings here.

During debugging I found that:

* A VS 2013 build (performed by Haroon and copied to the test host) crashes consistently with the reported symptoms - "performing post-bootstrap initialization ... child process was terminated by exception 0xC0000005"

* The issue doesn't happen in a VS 2015 build done on the test host

* I couldn't use just-in-time debugging because the restricted execution token setup isolated the process. For the same reason, breakpoints stop working in initdb.c after line 3557.

* To get a backtrace, I had to:

  * Launch a VS x86 command prompt
  * devenv /debugexe bin\initdb.exe -D test
  * Set a breakpoint in initdb.c:3557 and initdb.c:3307
  * Run
  * When it traps at get_restricted_token(), manually move the execution pointer over the setup of the restricted execution token by dragging & dropping the yellow instruction pointer arrow. Yes, really. Or, y'know, comment it out and rebuild, but I was working with a supplied binary.
  * Continue until next breakpoint
  * Launch process explorer and find the pid of the postgres child process
  * Debug->attach to process, attach to the child postgres. This doesn't detach the parent, VS does multiprocess debugging.
  * Continue execution
  * vs will trap on the child when it crashes

* It is an access violation (segfault) in postgres.exe when attempting to read memory at 0xFFFFFFFFFFFFFFFF in calc_joinrel_size_estimate() at costsize.c:3940

fkselec = get_foreign_key_join_selectivity(root,
  outer_rel->relids,
  inner_rel->relids,
  sjinfo,
  &restrictlist);

with debug_query_string:

0x0000000009bf6140 "INSERT INTO pg_description  SELECT t.objoid, c.oid, t.objsubid, t.description   FROM tmp_pg_description t, pg_class c     WHERE c.relname = t.classname;\n"



Backtrace:

        Exception thrown at 0x00000001401A5A81 in postgres.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.

> postgres.exe!calc_joinrel_size_estimate(PlannerInfo * root, RelOptInfo * outer_rel, RelOptInfo * inner_rel, double outer_rows, double inner_rows, SpecialJoinInfo * sjinfo, List * restrictlist) Line 3944 C
  postgres.exe!set_joinrel_size_estimates(PlannerInfo * root, RelOptInfo * rel, RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo * sjinfo, List * restrictlist) Line 3852 C
  postgres.exe!build_join_rel(PlannerInfo * root, Bitmapset * joinrelids, RelOptInfo * outer_rel, RelOptInfo * inner_rel, SpecialJoinInfo * sjinfo, List * * restrictlist_ptr) Line 521 C
  postgres.exe!make_join_rel(PlannerInfo * root, RelOptInfo * rel1, RelOptInfo * rel2) Line 721 C
  postgres.exe!make_rels_by_clause_joins(PlannerInfo * root, RelOptInfo * old_rel, ListCell * other_rels) Line 266 C
  postgres.exe!join_search_one_level(PlannerInfo * root, int level) Line 69 C
  postgres.exe!standard_join_search(PlannerInfo * root, int levels_needed, List * initial_rels) Line 2172 C
  postgres.exe!query_planner(PlannerInfo * root, List * tlist, void(*)(PlannerInfo *, void *) qp_callback, void * qp_extra) Line 255 C
  postgres.exe!grouping_planner(PlannerInfo * root, char inheritance_update, double tuple_fraction) Line 1695 C
  postgres.exe!subquery_planner(PlannerGlobal * glob, Query * parse, PlannerInfo * parent_root, char hasRecursion, double tuple_fraction) Line 775 C
  postgres.exe!standard_planner(Query * parse, int cursorOptions, ParamListInfoData * boundParams) Line 312 C
  postgres.exe!pg_plan_query(Query * querytree, int cursorOptions, ParamListInfoData * boundParams) Line 800 C
  postgres.exe!exec_simple_query(const char * query_string) Line 1023 C
  postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4076 C
  postgres.exe!main(int argc, char * * argv) Line 227 C


Local vars:

+ inner_rel 0x0000000009dfd170 {type=T_EquivalenceClass (537) reloptkind=RELOPT_BASEREL (0) relids=0x0000000009d6d718 {...} ...} RelOptInfo *
inner_rows 270.00000000000000 double
+ outer_rel 0x00000001401ded48 {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel, RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
outer_rows 2.653352065130e-314#DEN double
+ restrictlist 0x0000000009d6f7f8 {type=T_List (656) length=1 head=0x0000000009d6f7d8 {data={ptr_value=0x0000000009d6e980 ...} ...} ...} List *
+ root 0x0000000009dfd800 {type=1 parse=0x000000000067d220 {type=T_AllocSetContext (601) commandType=CMD_UNKNOWN (0) ...} ...} PlannerInfo *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543) min_lefthand=0x0000000009dfcfd8 {nwords=1 words=0x0000000009dfcfdc {...} } ...} SpecialJoinInfo *
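
The `outer_rows` value in the locals above can be sanity-checked by looking at its raw bit pattern. What follows is a minimal Python sketch, not something from the thread; the helper name `double_bits` is made up for illustration. It reinterprets the printed double as a 64-bit integer and compares its upper bits with two addresses quoted in the report.

```python
import struct


def double_bits(value: float) -> int:
    """Return the IEEE 754 bit pattern of a double as an unsigned 64-bit int."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


# Value printed by the debugger for outer_rows (flagged #DEN, i.e. denormal).
outer_rows = 2.653352065130e-314
bits = double_bits(outer_rows)

# Addresses taken from the report: the faulting instruction and outer_rel.
crash_ip = 0x00000001401A5A81
outer_rel = 0x00000001401DED48

print(f"outer_rows bits: {bits:#018x}")
print(f"same upper bits as crash_ip:  {bits >> 24 == crash_ip >> 24}")
print(f"same upper bits as outer_rel: {bits >> 24 == outer_rel >> 24}")
```

If the comparisons hold, the "row count" is most likely a stray pointer into postgres.exe rather than a real estimate. That would point at either a confused debugger or a clobbered stack slot, but it is only one possible reading of the dump.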






--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>   * Launch a VS x86 command prompt
>   * devenv /debugexe bin\initdb.exe -D test
>   * Set a breakpoint in initdb.c:3557 and initdb.c:3307
>   * Run
>   * When it traps at get_restricted_token(), manually move the execution
> pointer over the setup of the restricted execution token by dragging &
> dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
> comment it out and rebuild, but I was working with a supplied binary.
>   * Continue until next breakpoint
>   * Launch process explorer and find the pid of the postgres child process
>   * Debug->attach to process, attach to the child postgres. This doesn't
> detach the parent, VS does multiprocess debugging.
>   * Continue execution
>   * vs will trap on the child when it crashes

Do you think a crash dump could have been created by creating
crashdumps/ in PGDATA as part of initdb before this query is run?
-- 
Michael



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:

On 24 June 2016 at 10:21, Craig Ringer <craig@2ndquadrant.com> wrote:
 
* To get a backtrace, I had to:

  * Launch a VS x86 command prompt
  * devenv /debugexe bin\initdb.exe -D test
  * Set a breakpoint in initdb.c:3557 and initdb.c:3307
  * Run
  * When it traps at get_restricted_token(), manually move the execution pointer over the setup of the restricted execution token by dragging & dropping the yellow instruction pointer arrow. Yes, really. Or, y'know, comment it out and rebuild, but I was working with a supplied binary.
  * Continue until next breakpoint
  * Launch process explorer and find the pid of the postgres child process
  * Debug->attach to process, attach to the child postgres. This doesn't detach the parent, VS does multiprocess debugging.
  * Continue execution
  * vs will trap on the child when it crashes


Also, to save anyone else this hassle, I have saved a process dump (windows core file) and the debug symbols to gdrive. You can get them at:

Note that you will need a Visual Studio version installed. VS Community 2015 works fine. You only need to install the C++ devenv and C++ headers, you don't need MFC or any of the rest. The default install is fine if you don't mind a bigger download.  Once installed, open postgres.dmp, then go to debug->options, symbols. There, enable the Microsoft Symbol Server, and also add a new entry for the absolute path to the symbols directory for the archive you unpacked. You should enable the symbol cache directory too, make a directory in your user dir and put it there.

If Haroon shared some gdrive links earlier on the thread I don't have access to, this is the same data just efficiently compressed (32MB instead of 180MB) and packaged up in a single convenient archive with the matching sources and a full working install. You'll need 7zip to unpack it, but that should be on your "install as soon as you install Windows" list anyway.



--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:


On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>   * Launch a VS x86 command prompt
>   * devenv /debugexe bin\initdb.exe -D test
>   * Set a breakpoint in initdb.c:3557 and initdb.c:3307
>   * Run
>   * When it traps at get_restricted_token(), manually move the execution
> pointer over the setup of the restricted execution token by dragging &
> dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
> comment it out and rebuild, but I was working with a supplied binary.
>   * Continue until next breakpoint
>   * Launch process explorer and find the pid of the postgres child process
>   * Debug->attach to process, attach to the child postgres. This doesn't
> detach the parent, VS does multiprocess debugging.
>   * Continue execution
>   * vs will trap on the child when it crashes

Do you think a crash dump could have been created by creating
crashdumps/ in PGDATA as part of initdb before this query is run?

I see what you did there ;)

Yes, quite possibly, actually. I should've just got Haroon to build me a new initdb without the priv setting and with creation of crashdumps/ .

It might be worth testing that out and adding an initdb startup flag to create the directory, since initdb is such a PITA to debug.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> Yes, quite possibly, actually. I should've just got Haroon to build me a new
> initdb without the priv setting and with creation of crashdumps/ .
>
> It might be worth testing that out and adding an initdb startup flag to
> create the directory, since initdb is such a PITA to debug.

I was more thinking about putting that under -DDEBUG for example.
-- 
Michael



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
"Tsunakawa, Takayuki"
Date:
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
> Sent: Friday, June 24, 2016 11:37 AM
> On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com>
> wrote:
> It might be worth testing that out and adding an initdb startup flag
> > to create the directory, since initdb is such a PITA to debug.
> 
> I was more thinking about putting that under -DDEBUG for example.
> 

I think just the existing option -d (--debug) and/or -n (--no-clean) would be OK.

Regards
Takayuki Tsunakawa



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 11:51 AM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
>> From: pgsql-hackers-owner@postgresql.org
>> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
>> Sent: Friday, June 24, 2016 11:37 AM
>> On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com>
>> wrote:
>> It might be worth testing that out and adding an initdb startup flag
>> > to create the directory, since initdb is such a PITA to debug.
>>
>> I was more thinking about putting that under -DDEBUG for example.
>>
>
> I think just the existing option -d (--debug) and/or -n (--no-clean) would be OK.

If the majority thinks that an option switch is more suitable, I won't
fight it strongly. Just please let's not mess with the behavior of
the existing options.
-- 
Michael



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:


On 24 June 2016 at 05:17, Umair Shahid <umair.shahid@gmail.com> wrote:
 

> It's still strange that it doesn't affect woodlouse.

Or any of the other Windows critters...

Given that it's only been seen in VS 2013, it's particularly odd that it's not biting woodlouse. 

I'd like more details from those whose installs are crashing. What exact vcvars env did you run under, with which exact cl.exe version?



--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 1:28 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> Given that it's only been seen in VS 2013, it's particularly odd that it's
> not biting woodlouse.
>
> I'd like more details from those whose installs are crashing. What exact
> vcvars env did you run under, with which exact cl.exe version?

Which OS did you use for the compilation? I don't think that this
matters much but woodlouse is using Win7.
-- 
Michael



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:


On 24 June 2016 at 12:31, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jun 24, 2016 at 1:28 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> Given that it's only been seen in VS 2013, it's particularly odd that it's
> not biting woodlouse.
>
> I'd like more details from those whose installs are crashing. What exact
> vcvars env did you run under, with which exact cl.exe version?

Which OS did you use for the compilation? I don't think that this
matters much but woodlouse is using Win7.



I'll have to wait for Haroon for that info for the crashing builds he did, but I've now reproduced it with:

Windows server 2012 R2, VS 2013 Community Update 5, cross compile tools for x86 to amd64.  cl 18.00.40629 for x64, env:

  %comspec% /k  ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" x86_amd64"

"where cl" reports

  C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\cl.exe

Note that cross compilation is a typical configuration on Windows, where you routinely use 32bit x86 compilers to build 64bit code, except in the newest SDKs.

I see the same symptoms, with the segfault.

This host is a clean install, an AWS instance created for the purpose.



It looks like woodlouse probably runs an older VS2013 and uses the native x64 toolchain; its env includes:

  C:\\Program Files (x86)\\Microsoft Visual Studio 12.0\\VC\\BIN\\amd64

and does not have x86_amd64 in it. 
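
The difference between the failing and working setups comes down to which `cl.exe` directory is first on the path. As a rough illustration, the Python sketch below classifies a `where cl` result by its parent folder. The folder-to-toolchain mapping is an assumption based only on the VS 2013 paths quoted in this thread, and `classify_toolchain` is a hypothetical helper, not part of any PostgreSQL or Visual Studio tooling.

```python
from pathlib import PureWindowsPath

# Assumption: directory names follow the VS 2013 layout quoted in the thread
# (VC\bin = x86 native, VC\bin\amd64 = x64 native, VC\bin\x86_amd64 = cross).
TOOLCHAINS = {
    "bin": "x86 native",
    "amd64": "x64 native",
    "x86_amd64": "x86-hosted cross compiler targeting x64",
}


def classify_toolchain(cl_path: str) -> str:
    """Guess the toolchain flavour from the folder that contains cl.exe."""
    folder = PureWindowsPath(cl_path).parent.name.lower()
    return TOOLCHAINS.get(folder, "unknown")


reported = r"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\cl.exe"
print(classify_toolchain(reported))
```

Collecting this one string from every reporter would make it quick to see whether all crashing builds used the cross compiler.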




BTW, I suggested to Haroon that he clone beta2 from git, then do a git-bisect between beta1 (works) and beta2 (fails) to see if he can identify the commit that causes things to start failing. I don't know how far he got with that yesterday.
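
For readers unfamiliar with bisection, the idea can be sketched in a few lines of Python. This is a simplified model over a linear list of commits with made-up commit names and a stand-in `is_bad` test; the real `git bisect` works on the actual commit graph, and in this case the test would be "build, run initdb, see whether it crashes".

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit by binary search.

    commits must be ordered oldest to newest, with commits[0] known good
    and commits[-1] known bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # failure is at mid or earlier
        else:
            good = mid     # failure was introduced after mid
    return commits[bad]


# Hypothetical history: c0 plays the role of beta1, c9 the role of beta2.
history = [f"c{i}" for i in range(10)]
culprit = first_bad(history, lambda c: int(c[1:]) >= 6)
print(culprit)
```

Each step halves the remaining range, so even a few hundred commits between two betas need fewer than ten build-and-test rounds.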


By comparison, I had no problems on the same host with VS Community 2015, cl 19.00.23918, env "VS2015 x64 Native Tools Command Prompt":

   %comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat"" amd64



On a side note, I'm unable to build with the VS 2013 Community Update 5 native tools for some reason: link errors, unresolved external symbol _ischartype_l. cl 18.00.42629 for x64, env:

   %comspec% /k  ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" amd64"

"where cl" reports:

   C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64\cl.exe








--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:
On 24 June 2016 at 13:00, Craig Ringer <craig@2ndquadrant.com> wrote:
 
 
 I've now reproduced it with:


I can also confirm that it _doesn't_ crash with the same SDK using a 32-bit build (running under WoW on x64). cl 18.00.40629 for x86, env:

  %comspec% /k  ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat" x86" 


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:


On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>   * Launch a VS x86 command prompt
>   * devenv /debugexe bin\initdb.exe -D test
>   * Set a breakpoint in initdb.c:3557 and initdb.c:3307
>   * Run
>   * When it traps at get_restricted_token(), manually move the execution
> pointer over the setup of the restricted execution token by dragging &
> dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
> comment it out and rebuild, but I was working with a supplied binary.
>   * Continue until next breakpoint
>   * Launch process explorer and find the pid of the postgres child process
>   * Debug->attach to process, attach to the child postgres. This doesn't
> detach the parent, VS does multiprocess debugging.
>   * Continue execution
>   * vs will trap on the child when it crashes

Do you think a crash dump could have been created by creating
crashdumps/ in PGDATA as part of initdb before this query is run?


The answer is "yes" btw. Add "crashdumps" to the static array of directories created by initdb and it works great.

Sigh. It'd be less annoying if I hadn't written most of the original patch.

For convenience I also commented out the check_root call in src/backend/main.c and the get_restricted_token(progname) call in initdb.c, so I could run it easily under an admin account where I can also install tools etc without hassle. Not recommended on a non-throwaway machine of course.

The generated crashdump shows the same crash in the same location.

I have absolutely no idea why it's trying to access memory at what looks like   (uint64)(-1) though.  Nothing in the auto vars list:

+ &restrictlist 0x000000000043f7b0 {0x0000000009e32600 {type=T_List (656) length=1 head=0x0000000009e325e0 {data={ptr_value=...} ...} ...}} List * *
+ inner_rel 0x0000000009e7ad68 {type=T_EquivalenceClass (537) reloptkind=RELOPT_BASEREL (0) relids=0x0000000009e30520 {...} ...} RelOptInfo *
+ inner_rel->relids 0x0000000009e30520 {nwords=658 words=0x0000000009e30524 {...} } Bitmapset *
+ outer_rel 0x00000001401dec98 {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel, RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
+ outer_rel->relids 0xe808498b48d78b48 {nwords=??? words=0xe808498b48d78b4c {...} } Bitmapset *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543) min_lefthand=0x0000000009e7abd0 {nwords=1 words=0x0000000009e7abd4 {...} } ...} SpecialJoinInfo *

or locals:

+ inner_rel 0x0000000009e7ad68 {type=T_EquivalenceClass (537) reloptkind=RELOPT_BASEREL (0) relids=0x0000000009e30520 {...} ...} RelOptInfo *
inner_rows 270.00000000000000 double
+ outer_rel 0x00000001401dec98 {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel, RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
outer_rows 2.653351978175e-314#DEN double
+ restrictlist 0x0000000009e32600 {type=T_List (656) length=1 head=0x0000000009e325e0 {data={ptr_value=0x0000000009e31788 ...} ...} ...} List *
+ root 0x0000000009e7b3f8 {type=1 parse=0x0000000000504ad0 {type=T_AllocSetContext (601) commandType=CMD_UNKNOWN (0) ...} ...} PlannerInfo *
+ sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543) min_lefthand=0x0000000009e7abd0 {nwords=1 words=0x0000000009e7abd4 {...} } ...} SpecialJoinInfo *

seems to fit. Though outer_rel->relids is a pretty weird address - 0xe808498b48d78b48? Really?
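
The odd `outer_rel->relids` value can be read another way. The Python sketch below (with a hypothetical helper name) prints the eight bytes in the order they would sit in little-endian memory. Read that way, they look like the start of x86-64 instruction encodings: `48 8b d7` is `mov rdx, rdi`, `48 8b 49 08` is `mov rcx, [rcx+8]`, and `e8` begins a `call`. That would fit the debugger resolving `outer_rel` to an address inside `build_joinrel_tlist`. This is an interpretation of the dump, not a conclusion stated in the thread.

```python
def little_endian_bytes(value: int, size: int = 8) -> str:
    """Format an integer as the bytes it occupies in little-endian memory."""
    return value.to_bytes(size, "little").hex(" ")


relids = 0xE808498B48D78B48  # value the debugger showed for outer_rel->relids
print(little_endian_bytes(relids))  # prints "48 8b d7 48 8b 49 08 e8"
```

If `outer_rel` really pointed into code at the time of the crash, the "pointer" read from it would be instruction bytes, which is why it looks nothing like the heap addresses around it.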

I'd point DrMemory at it, but unfortunately it only supports 32-bit applications so far. I don't have access to any of the commercial tools like Purify. Maybe someone at EDB can help out with that, if you guys do?

Register states are:

RAX = 000000000043F7B0 RBX = 0000000009E32218 RCX = 0000000009E78510 RDX = 0000000009E7ABD0 RSI = 0000000009E78510 RDI = 0000000009E32218 R8  = 0000000009E7B3F8 R9  = 0000000009E7B1E8 R10 = 0000000009E7A9C0 R11 = 0000000000000001 R12 = 0000000009E32200 R13 = 0000000000000000 R14 = 0000000009E7B1E8 R15 = 0000000000000000 RIP = 00000001401A59D1 RSP = 000000000043F6E0 RBP = 0000000009E7A9C0 EFL = 00010202 

and the exact crash site is

fkselec = get_foreign_key_join_selectivity(root,
  outer_rel->relids,
  inner_rel->relids,
  sjinfo,
  &restrictlist);
00000001401A59AB  mov         r8,qword ptr [r8+8]  
00000001401A59AF  mov         rdx,qword ptr [rdx+8]  
00000001401A59B3  movaps      xmmword ptr [rax-28h],xmm6  
00000001401A59B7  movaps      xmmword ptr [rax-38h],xmm7  
00000001401A59BB  movaps      xmmword ptr [rax-48h],xmm8  
00000001401A59C0  movaps      xmmword ptr [rax-58h],xmm9  
00000001401A59C5  lea         rax,[rax+38h]  
00000001401A59C9  movaps      xmm7,xmm3  
00000001401A59CC  mov         qword ptr [rsp+20h],rax  
00000001401A59D1  movaps      xmmword ptr [rax-68h],xmm10     <---- here
00000001401A59D6  mov         qword ptr [rax-48h],r14  
00000001401A59DA  mov         r14,qword ptr [sjinfo]  
00000001401A59E2  mov         ebp,dword ptr [r14+28h]  
00000001401A59E6  mov         qword ptr [rax-50h],r15  
00000001401A59EA  mov         r9,r14  
00000001401A59ED  mov         r15,rcx  
00000001401A59F0  call        get_foreign_key_join_selectivity (01401A5C30h)  

with

XMM3 000000000000000040A5720000000000
RAX 000000000043F7B0
XMM7 000000000000000040A5720000000000
RSP 000000000043F6E0
XMM10 00000000000000000000000000000000


I'm about 100% ignorant of x64 asm, but hopefully someone can interpret this usefully. I can tell it's doing a sse "Move Aligned Packed Single-Precision Floating-Point Values" (from memory into a sse register?) but that's about it.

rax-68h is 0x000000000043F748. The memory at that location is

00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0 bf 00 00 00 00 00 00 00 00 c0 a9 e7 09 00 00 00 00 f8 b3 e7 09 00 00
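
One thing that can be checked from the numbers above is alignment. `movaps` requires a 16-byte-aligned memory operand, and here it is storing `xmm10` to `[rax-68h]`. A misaligned operand raises a general-protection fault, which Windows commonly reports as an access violation at 0xFFFFFFFFFFFFFFFF. The following Python sketch, using only the register and offset values quoted above, checks each `movaps` stack slot. The helper name is made up, and the conclusion is an interpretation rather than something established in the thread.

```python
def is_aligned(address: int, alignment: int = 16) -> bool:
    """True if address is a multiple of alignment."""
    return address % alignment == 0


RAX_AT_CRASH = 0x000000000043F7B0      # from the register dump
RAX_BEFORE_LEA = RAX_AT_CRASH - 0x38   # undo `lea rax,[rax+38h]`

# The four movaps stores issued before the lea, as offsets from the old rax.
EARLIER_OFFSETS = (0x28, 0x38, 0x48, 0x58)
for offset in EARLIER_OFFSETS:
    slot = RAX_BEFORE_LEA - offset
    print(f"[rax-{offset:#x}] -> {slot:#x} aligned={is_aligned(slot)}")

# The store marked "here", relative to rax after the lea.
faulting_slot = RAX_AT_CRASH - 0x68
print(f"[rax-0x68] -> {faulting_slot:#x} aligned={is_aligned(faulting_slot)}")
```

If the earlier slots come out aligned and only the faulting one does not, that would be consistent with the compiler emitting a wrong offset after adjusting `rax`, rather than with a bad pointer in the planner's data structures.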


So there you go, a whole bunch of data and I, at least, am still none the wiser.




--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Michael Paquier
Date:
On Fri, Jun 24, 2016 at 3:22 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
>
>
> On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
>>
>> On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com>
>> wrote:
>> >   * Launch a VS x86 command prompt
>> >   * devenv /debugexe bin\initdb.exe -D test
>> >   * Set a breakpoint in initdb.c:3557 and initdb.c:3307
>> >   * Run
>> >   * When it traps at get_restricted_token(), manually move the execution
>> > pointer over the setup of the restricted execution token by dragging &
>> > dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
>> > comment it out and rebuild, but I was working with a supplied binary.
>> >   * Continue until next breakpoint
>> >   * Launch process explorer and find the pid of the postgres child
>> > process
>> >   * Debug->attach to process, attach to the child postgres. This doesn't
>> > detach the parent, VS does multiprocess debugging.
>> >   * Continue execution
>> >   * vs will trap on the child when it crashes
>>
>> Do you think a crash dump could have been created by creating
>> crashdumps/ in PGDATA as part of initdb before this query is run?
>
>
>
> The answer is "yes" btw. Add "crashdumps" to the static array of directories
> created by initdb and it works great.

As simple as attached..

> Sigh. It'd be less annoying if I hadn't written most of the original patch.

You mean the patch that created the crashdumps/ trick? It saved me
a couple of months back when I had to analyze a problem, TBH.
--
Michael

Attachments
On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

>> I was helping Haroon with this last night. I don't have access to the
>> original thread and he's not around so I don't know how much he said. I'll
>> repeat our findings here.

Craig, I am around now looking into this. I'll update the list as I get more info.

- Haroon

On 24 June 2016 at 11:27, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jun 24, 2016 at 3:22 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
>
>
> On 24 June 2016 at 10:28, Michael Paquier <michael.paquier@gmail.com> wrote:
>>
>> On Fri, Jun 24, 2016 at 11:21 AM, Craig Ringer <craig@2ndquadrant.com>
>> wrote:
>> >   * Launch a VS x86 command prompt
>> >   * devenv /debugexe bin\initdb.exe -D test
>> >   * Set a breakpoint in initdb.c:3557 and initdb.c:3307
>> >   * Run
>> >   * When it traps at get_restricted_token(), manually move the execution
>> > pointer over the setup of the restricted execution token by dragging &
>> > dropping the yellow instruction pointer arrow. Yes, really. Or, y'know,
>> > comment it out and rebuild, but I was working with a supplied binary.
>> >   * Continue until next breakpoint
>> >   * Launch process explorer and find the pid of the postgres child
>> > process
>> >   * Debug->attach to process, attach to the child postgres. This doesn't
>> > detach the parent, VS does multiprocess debugging.
>> >   * Continue execution
>> >   * vs will trap on the child when it crashes
>>
>> Do you think a crash dump could have been created by creating
>> crashdumps/ in PGDATA as part of initdb before this query is run?
>
>
>
> The answer is "yes" btw. Add "crashdumps" to the static array of directories
> created by initdb and it works great.

As simple as attached..

> Sigh. It'd be less annoying if I hadn't written most of the original patch.

You mean the patch that created the crashdumps/ trick? It saved me
a couple of months back when I had to analyze a problem, TBH.
--
Michael



--
Haroon                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes:
> I have absolutely no idea why it's trying to access memory at what looks
> like   (uint64)(-1) though.  Nothing in the auto vars list:

> + &restrictlist 0x000000000043f7b0 {0x0000000009e32600 {type=T_List (656)
> length=1 head=0x0000000009e325e0 {data={ptr_value=...} ...} ...}} List * *
> + inner_rel 0x0000000009e7ad68 {type=T_EquivalenceClass (537)
> reloptkind=RELOPT_BASEREL (0) relids=0x0000000009e30520 {...} ...} RelOptInfo
> *
> + inner_rel->relids 0x0000000009e30520 {nwords=658 words=0x0000000009e30524
> {...} } Bitmapset *
> + outer_rel 0x00000001401dec98
> {postgres.exe!build_joinrel_tlist(PlannerInfo * root, RelOptInfo * joinrel,
> RelOptInfo * input_rel), Line 646} {...} RelOptInfo *
> + outer_rel->relids 0xe808498b48d78b48 {nwords=??? words=0xe808498b48d78b4c
> {...} } Bitmapset *
> + sjinfo 0x000000000043f870 {type=T_SpecialJoinInfo (543)
> min_lefthand=0x0000000009e7abd0 {nwords=1 words=0x0000000009e7abd4 {...} }
> ...} SpecialJoinInfo *


inner_rel seems to be pointing at garbage, or at least why is the
referenced object tag T_EquivalenceClass not T_RelOptInfo?  And
why aren't we being given anything for outer_rel?  The value for
outer_rel->relids isn't inspiring any confidence either, and
for that matter inner_rel->relids couldn't possibly have more than
nwords==1 given how simple the query is.  In short, either the
debugger is totally confused or the code is, because most of these
pointers aren't pointing at anything sane.
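
The point about `nwords` can be made concrete. The Python sketch below assumes 32-bit bitmap words, which matches the 4-byte gap between each `relids` pointer and its `words` pointer in the dump but is still an assumption here, and computes how many words a set needs to hold a given highest member. The function name is hypothetical.

```python
BITS_PER_BITMAPWORD = 32  # assumption: 32-bit bitmap words


def words_needed(highest_member: int) -> int:
    """Number of bitmap words required to represent highest_member."""
    return highest_member // BITS_PER_BITMAPWORD + 1


# A two-table join only involves very small relation indexes.
print(words_needed(2))

# Conversely, nwords=658 would imply room for members up to this value.
print(658 * BITS_PER_BITMAPWORD - 1)
```

A set of relation indexes for a two-table query fits in a single word, so a reported `nwords=658` is far more likely to be garbage than a real Bitmapset header.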

TBH, this looks more like a compiler bug than anything else.  I wonder
whether it's getting confused by taking the address of a parameter
(although surely we do that elsewhere).

It would be worth recompiling at -O0, or whatever the local equivalent
of that is, to see if (1) the crash goes away or (2) the debugger's
printouts get any more reliable.
        regards, tom lane




> On Fri, Jun 24, 2016 at 1:28 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> > I'd like more details from those whose installs are crashing. What exact
> > vcvars env did you run under, with which exact cl.exe version?

This is a Windows server 2012 R2 Standard. 
Devenv: Microsoft Visual Studio 2013 Community Version 12.0.31101.0.

     Env:

            %comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat"" x86_amd64

     'where cl.exe'

            C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\cl.exe
            C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\cl.exe


I have also been able to reproduce it on Windows 7 Professional (Service Pack 1) with Microsoft Visual Studio 2013 Community Version 12.0.40629.0.
      
       Env:
             %comspec% /k ""C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\vcvarsall.bat"" x86_amd64
  
       'Where cl.exe'
              C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\cl.exe
              C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\cl.exe



I bisected between beta1 (good) and beta2 (bad), given that beta1 works fine. The crash first appears at the following commit.

commit 100340e2dcd05d6505082a8fe343fb2ef2fa5b2a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sat Jun 18 15:22:34 2016 -0400

    Restore foreign-key-aware estimation of join relation sizes.

    This patch provides a new implementation of the logic added by commit
    137805f89 and later removed by 77ba61080.  It differs from the original
    primarily in expending much less effort per joinrel in large queries,
    which it accomplishes by doing most of the matching work once per query not
    once per joinrel.  Hopefully, it's also less buggy and better commented.
    The never-documented enable_fkey_estimates GUC remains gone.
 
    There remains work to be done to make the selectivity estimates account
    for nulls in FK referencing columns; but that was true of the original
    patch as well.  We may be able to address this point later in beta.
    In the meantime, any error should be in the direction of overestimating
    rather than underestimating joinrel sizes, which seems like the direction
    we want to err in.

    Tomas Vondra and Tom Lane
    Discussion: <31041.1465069446@sss.pgh.pa.us>

This appears consistent with the planner crash suggested by the crash dump Craig shared.

Tom, any ideas on what could be going wrong here?

Given that it fails in 'setup_description', I tried bypassing that step by commenting it out; it then crashes again in 'setup_privileges' and 'setup_schema'.

debug_query_string for setup_privileges:

INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class WHERE relname = 'pg_class'),        0,        relacl,        'i'    FROM        pg_class    WHERE        relacl IS NOT NULL        AND relkind IN ('r', 'v', 'm', 'S');INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        pg_class.oid,        (SELECT oid FROM pg_class WHERE relname = 'pg_class'),        pg_attribute.attnum,        pg_attribute.attacl,        'i'    FROM        pg_class        JOIN pg_attribute ON (pg_class.oid = pg_attribute.attrelid)    WHERE        pg_attribute.attacl IS NOT NULL        AND pg_class.relkind IN ('r', 'v', 'm', 'S');INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class WHERE relname = 'pg_proc'),        0,        proacl,        'i'    FROM        pg_proc    WHERE        proacl IS NOT NULL;INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class WHERE relname = 'pg_type'),        0,        typacl,        'i'    FROM        pg_type    WHERE        typacl IS NOT NULL;INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class WHERE relname = 'pg_language'),        0,        lanacl,        'i'    FROM        pg_language    WHERE        lanacl IS NOT NULL;INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class WHERE  relname = 'pg_largeobject_metadata'),        0,        lomacl,        'i'    FROM        pg_largeobject_metadata    WHERE        lomacl IS NOT NULL;INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class WHERE relname = 'pg_namespace'),        0,        nspacl,        'i'    FROM 
       pg_namespace    WHERE        nspacl IS NOT NULL;INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class WHERE relname = 'pg_database'),        0,        datacl,        'i'    FROM        pg_database    WHERE        datacl IS NOT NULL;INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class WHERE relname = 'pg_tablespace'),        0,        spcacl,        'i'    FROM        pg_tablespace    WHERE        spcacl IS NOT NULL;INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class WHERE  relname = 'pg_foreign_data_wrapper'),        0,        fdwacl,        'i'    FROM        pg_foreign_data_wrapper    WHERE        fdwacl IS NOT NULL;INSERT INTO pg_init_privs   (objoid, classoid, objsubid, initprivs, privtype)    SELECT        oid,        (SELECT oid FROM pg_class  WHERE relname = 'pg_foreign_server'),        0,        srvacl,        'i'    FROM        pg_foreign_server    WHERE        srvacl IS NOT NULL;/*
 * SQL Information Schema
 * as defined in ISO/IEC 9075-11:2011
 *
 * Copyright (c) 2003-2016, PostgreSQL Global Development Group
 *
 * src/backend/catalog/information_schema.sql
 *
 * Note: this file is read in single-user -j mode, which means that the
 * command terminator is semicolon-newline-newline; whenever the backend
 * sees that, it stops and executes what it's got.  If you write a lot of
 * statements without empty lines between, they'll all get quoted to you
 * in any error message about one of them, so don't do that.  Also, you
 * cannot write a semicolon immediately followed by an empty line in a
 * string literal (including a function body!) or a multiline comment.
 */

/*
 * Note: Generally, the definitions in this file should be ordered
 * according to the clause numbers in the SQL standard, which is also the
 * alphabetical order.  In some cases it is convenient or necessary to
 * define one information schema view by using another one; in that case,
 * put the referencing view at the very end and leave a note where it
 * should have been put.
 */


/*
 * 5.1
 * INFORMATION_SCHEMA schema
 */

CREATE SCHEMA information_schema;
GRANT USAGE ON SCHEMA information_schema TO PUBLIC;
SET search_path TO information_schema;


debug_query_string for setup_schema:

INSERT INTO sql_implementation_info VALUES ('10003', 'CATALOG NAME', NULL, 'Y', NULL);
INSERT INTO sql_implementation_info VALUES ('10004', 'COLLATING SEQUENCE', NULL, (SELECT default_collate_name FROM character_sets), NULL);
INSERT INTO sql_implementation_info VALUES ('23',    'CURSOR COMMIT BEHAVIOR', 1, NULL, 'close cursors and retain prepared statements');
INSERT INTO sql_implementation_info VALUES ('2',     'DATA SOURCE NAME', NULL, '', NULL);
INSERT INTO sql_implementation_info VALUES ('17',    'DBMS NAME', NULL, (select trim(trailing ' ' from substring(version() from '^[^0-9]*'))), NULL);
INSERT INTO sql_implementation_info VALUES ('18',    'DBMS VERSION', NULL, '???', NULL); -- filled by initdb
INSERT INTO sql_implementation_info VALUES ('26',    'DEFAULT TRANSACTION ISOLATION', 2, NULL, 'READ COMMITTED; user-settable');
INSERT INTO sql_implementation_info VALUES ('28',    'IDENTIFIER CASE', 3, NULL, 'stored in mixed case - case sensitive');
INSERT INTO sql_implementation_info VALUES ('85',    'NULL COLLATION', 0, NULL, 'nulls higher than non-nulls');
INSERT INTO sql_implementation_info VALUES ('13',    'SERVER NAME', NULL, '', NULL);
INSERT INTO sql_implementation_info VALUES ('94',    'SPECIAL CHARACTERS', NULL, '', 'all non-ASCII characters allowed');
INSERT INTO sql_implementation_info VALUES ('46',    'TRANSACTION CAPABLE', 2, NULL, 'both DML and DDL');


And if I comment all of these out, i.e. 'setup_description', 'setup_privileges' and 'setup_schema', initdb seems to proceed without any errors or crashes.


Regards,
Haroon

-- 
Haroon                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:


On 24 June 2016 at 21:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
 

> TBH, this looks more like a compiler bug than anything else.

I tend to agree. Especially since valgrind has no complaints on x64 linux, and neither does DrMemory for 32-bit builds with the same toolchain on the same Windows and same SDK.

I don't see any particular reason we can't proceed with 9.6beta2 and build x64 Pg with MS VS 2015. There's no evidence turning up of a Pg bug here, and compiling with a different toolchain gets us working binaries for the target platform in question.
 
> It would be worth recompiling at -O0, or whatever the local equivalent
> of that is, to see if (1) the crash goes away or (2) the debugger's
> printouts get any more reliable

Yeah, it probably is. I'll see if I can find time this w/e.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes:
> On 24 June 2016 at 21:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> TBH, this looks more like a compiler bug than anything else.

> I tend to agree. Especially since valgrind has no complaints on x64 linux,
> and neither does DrMemory for 32-bit builds with the same toolchain on the
> same Windows and same SDK.

If that is the explanation, I'm suspicious that it's got something to do
with the interaction of a static inline-able (single-call-site) function
and taking the address of a formal parameter.  We certainly have multiple
other instances of each thing, but maybe not both at the same place.
This leads to a couple of suggestions for dodging the problem:

1. Make get_foreign_key_join_selectivity non-static so that it doesn't
get inlined, along the lines of
                                   List *restrictlist);
-static Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
+extern Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
                                  Relids outer_relids,
... */
-static Selectivity
+Selectivity
 get_foreign_key_join_selectivity(PlannerInfo *root,

2. Don't pass the original formal parameter to
get_foreign_key_join_selectivity, ie do something like
 static double
 calc_joinrel_size_estimate(PlannerInfo *root,
                            RelOptInfo *outer_rel,
                            RelOptInfo *inner_rel,
                            double outer_rows,
                            double inner_rows,
                            SpecialJoinInfo *sjinfo,
-                           List *restrictlist)
+                           List *orig_restrictlist)
 {
     JoinType    jointype = sjinfo->jointype;
+    List       *restrictlist = orig_restrictlist;
     Selectivity fkselec;
     Selectivity jselec;
     Selectivity pselec;

Obviously, if either of those things do make the problem go away, it's
a compiler bug.  If not, we'll need to dig deeper.
        regards, tom lane
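
The second suggestion can be sketched outside PostgreSQL as follows. This is an illustration only, under stated assumptions: Clause, remove_fk_clauses, count_clauses and remaining_after_fk are hypothetical names, and the helper merely imitates the shape of a single-call-site static function that receives the address of its caller's list pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for a List of restriction clauses. */
typedef struct Clause
{
    int             is_fk_match;    /* nonzero if a foreign key covers this clause */
    struct Clause  *next;
} Clause;

/*
 * Single-call-site static helper that edits the caller's list through a
 * pointer to the list head.  Returns the number of clauses unlinked.
 */
static int
remove_fk_clauses(Clause **list)
{
    int         removed = 0;

    while (*list != NULL)
    {
        if ((*list)->is_fk_match)
        {
            *list = (*list)->next;  /* unlink; the caller's pointer is updated */
            removed++;
        }
        else
            list = &(*list)->next;
    }
    return removed;
}

/* Count the clauses left in a list. */
static int
count_clauses(const Clause *list)
{
    int         n = 0;

    for (; list != NULL; list = list->next)
        n++;
    return n;
}

/*
 * Workaround shape: copy the formal parameter into a local variable and
 * take the address of the local, never of the parameter itself.
 */
static int
remaining_after_fk(Clause *orig_list)
{
    Clause     *list = orig_list;

    (void) remove_fk_clauses(&list);
    return count_clauses(list);
}
```

In standard C the copy is semantically redundant, which is why a behavior change from adding it points at the compiler rather than at the source.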



"Haroon ." <contact.mharoon@gmail.com> writes:
> And if I comment these out i.e. setup_description, setup_privileges and
> 'setup_schema' it seem to progress well without any errors/crashes.

Presumably, what you've done there is remove every single join query
from the post-bootstrap scripts.  That isn't particularly useful in
itself, but it does suggest that you would be able to fire up a
normal session afterwards in which you could use a more conventional
debugging approach.  The problem can evidently be categorized as
"planning of any join query whatsoever crashes", so a test case
ought to be easy enough to come by.
        regards, tom lane



On Sat, Jun 25, 2016 at 6:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> If that is the explanation, I'm suspicious that it's got something to do
> with the interaction of a static inline-able (single-call-site) function
> and taking the address of a formal parameter.  We certainly have multiple
> other instances of each thing, but maybe not both at the same place.
> This leads to a couple of suggestions for dodging the problem:
>
> 2. Don't pass the original formal parameter to
> get_foreign_key_join_selectivity, ie do something like
>
>  static double
>  calc_joinrel_size_estimate(PlannerInfo *root,
>                             RelOptInfo *outer_rel,
>                             RelOptInfo *inner_rel,
>                             double outer_rows,
>                             double inner_rows,
>                             SpecialJoinInfo *sjinfo,
> -                           List *restrictlist)
> +                           List *orig_restrictlist)
>  {
>      JoinType    jointype = sjinfo->jointype;
> +    List       *restrictlist = orig_restrictlist;
>      Selectivity fkselec;
>      Selectivity jselec;
>      Selectivity pselec;


The problem appears to be related to 'taking the address of a formal parameter'. NOT passing the original formal parameter to get_foreign_key_join_selectivity fixes it (dodges the problem) on VS2013. The resulting binaries seem to work fine, as initdb no longer experiences a child process crash. 'vcregress check' does not report any failures either.



Anyway, we have decided to use the VS2015 toolchain for the 9.6beta2 release.

Thanks everyone for the valuable input and help. Appreciate it!

Regards,
Haroon

-- 
Haroon                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
"Haroon ." <contact.mharoon@gmail.com> writes:
> On Sat, Jun 25, 2016 at 6:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This leads to a couple of suggestions for dodging the problem:
>> 
>> 2. Don't pass the original formal parameter to
>> get_foreign_key_join_selectivity, ie do something like
>> 
>> static double
>> calc_joinrel_size_estimate(PlannerInfo *root,
>> RelOptInfo *outer_rel,
>> RelOptInfo *inner_rel,
>> double outer_rows,
>> double inner_rows,
>> SpecialJoinInfo *sjinfo,
>> -                                                  List *restrictlist)
>> +                                                  List *orig_restrictlist)
>> {
>> JoinType        jointype = sjinfo->jointype;
>> +       List       *restrictlist = orig_restrictlist;
>> Selectivity fkselec;
>> Selectivity jselec;
>> Selectivity pselec;
>> 
>> 
> The problem appears to be related to 'taking the address of a formal
> parameter'. NOT passing the original formal parameter to
> get_foreign_key_join_selectivity fixes it (dodges the problem) on VS2013.

Thanks for investigating!  I'll go commit that change.  I wish someone
would put up a buildfarm critter using VS2013, though.
        regards, tom lane



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Alvaro Herrera
Date:
Tom Lane wrote:
> "Haroon ." <contact.mharoon@gmail.com> writes:

> > The problem appears to be related to 'taking the address of a formal
> > parameter'. NOT passing the original formal parameter to
> > get_foreign_key_join_selectivity fixes it (dodges the problem) on VS2013.
> 
> Thanks for investigating!  I'll go commit that change.  I wish someone
> would put up a buildfarm critter using VS2013, though.

Uh, isn't that what woodlouse is using?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Alvaro Herrera
Date:
Michael Paquier wrote:
> On Fri, Jun 24, 2016 at 11:51 AM, Tsunakawa, Takayuki
> <tsunakawa.takay@jp.fujitsu.com> wrote:
> >> From: pgsql-hackers-owner@postgresql.org
> >> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Michael Paquier
> >> Sent: Friday, June 24, 2016 11:37 AM
> >> On Fri, Jun 24, 2016 at 11:33 AM, Craig Ringer <craig@2ndquadrant.com>
> >> wrote:
> >> > It might be worth testing that out and adding an initdb startup
> >> > flag to create the directory, since initdb is such a PITA to
> >> > debug.
> >>
> >> I was more thinking about putting that under -DDEBUG for example.
> >
> > I think just the existing option -d (--debug) and/or -n (--no-clean)
> > would be OK.
> 
> If the majority thinks that an option switch is more adapted, I won't
> fight it strongly. Just please let's not mess up with the behavior of
> the existing options.

I think creating crashdumps/ when both -d and -n are specified is a bit
odd but reasonable.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Thanks for investigating!  I'll go commit that change.  I wish someone
>> would put up a buildfarm critter using VS2013, though.

> Uh, isn't that what woodlouse is using?

Well, it wasn't reporting this crash, so there's *something* different.
        regards, tom lane



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:
On 30 June 2016 at 07:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >> Thanks for investigating!  I'll go commit that change.  I wish someone
> >> would put up a buildfarm critter using VS2013, though.
>
> > Uh, isn't that what woodlouse is using?
>
> Well, it wasn't reporting this crash, so there's *something* different.


It may only affect the i386 to x86_64 cross compiler. If Woodlouse is using native x86_64 compilers perhaps that's why?

We've confirmed it on two different versions of VS 2013, so it's not specific to one minor compiler point release.

It'd be handy if the buildfarm captured the output of:

* cl   (no arguments, first line only)
* msbuild /nologo /version

and the env vars:

* VS*COMNTOOLS  (* being any 3 digits)
* PROCESSOR_ARCHITECTURE
* PROCESSOR_IDENTIFIER
* PROCESSOR_ARCHITEW6432

since right now it's hard to be totally sure exactly what a VS animal is building with unless there's a log attached due to a failure.
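
As a rough illustration of the environment-variable half of that list (the cl and msbuild output is not covered), here is a minimal sketch. The function names report_var and report_build_env are hypothetical; the wildcard is handled by probing every three-digit value, which is an assumption about what "any 3 digits" should mean in practice.

```c
#include <stdio.h>
#include <stdlib.h>

/* Print NAME=value if the variable is set; return 1 if it was, else 0. */
static int
report_var(FILE *out, const char *name)
{
    const char *value = getenv(name);

    if (value == NULL)
        return 0;
    fprintf(out, "%s=%s\n", name, value);
    return 1;
}

/* Report the toolchain-related variables; return how many were set. */
static int
report_build_env(FILE *out)
{
    static const char *const fixed[] = {
        "PROCESSOR_ARCHITECTURE",
        "PROCESSOR_IDENTIFIER",
        "PROCESSOR_ARCHITEW6432"
    };
    char        name[32];
    int         found = 0;
    int         ver;
    size_t      i;

    /* VS*COMNTOOLS with * being any three digits, e.g. VS120COMNTOOLS. */
    for (ver = 0; ver < 1000; ver++)
    {
        /* sprintf is safe here: the name is always 14 characters long. */
        sprintf(name, "VS%03dCOMNTOOLS", ver);
        found += report_var(out, name);
    }

    for (i = 0; i < sizeof(fixed) / sizeof(fixed[0]); i++)
        found += report_var(out, fixed[i]);

    return found;
}
```

Calling report_build_env(stdout) at the start of a build would place these values in the log whether or not the run fails.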

That said, TBH I doubt we can or should cover every VS release in every VS configuration. Especially since there are so many ways you can excitingly break and mangle VS, particularly when installing multiple VS versions on one host. It's a great IDE with a truly awful set of installation and management tools.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Alvaro Herrera
Date:
Craig Ringer wrote:
> On 30 June 2016 at 07:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > > Tom Lane wrote:
> > >> Thanks for investigating!  I'll go commit that change.  I wish someone
> > >> would put up a buildfarm critter using VS2013, though.
> >
> > > Uh, isn't that what woodlouse is using?
> >
> > Well, it wasn't reporting this crash, so there's *something* different.

> It may only affect the i386 to x86_64 cross compiler. If Woodlouse is using
> native x86_64 compilers perhaps that's why?

Hmm, so what about a pure 32bit build, if such a thing still exists?  If
so and it causes the same crash, perhaps we should have one member for
each VS version running on 32bit x86.

(I note that the coverage of MSVC versions has greatly improved in
recent months.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:


On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Craig Ringer wrote:
> > On 30 June 2016 at 07:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > > > Tom Lane wrote:
> > > >> Thanks for investigating!  I'll go commit that change.  I wish someone
> > > >> would put up a buildfarm critter using VS2013, though.
> > >
> > > > Uh, isn't that what woodlouse is using?
> > >
> > > Well, it wasn't reporting this crash, so there's *something* different.
>
> > It may only affect the i386 to x86_64 cross compiler. If Woodlouse is using
> > native x86_64 compilers perhaps that's why?
>
> Hmm, so what about a pure 32bit build, if such a thing still exists?  If
> so and it causes the same crash, perhaps we should have one member for
> each VS version running on 32bit x86.

It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I tested that.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Alvaro Herrera
Date:
Craig Ringer wrote:
> On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> 
> > Hmm, so what about a pure 32bit build, if such a thing still exists?  If
> > so and it causes the same crash, perhaps we should have one member for
> > each VS version running on 32bit x86.
> 
> It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
> tested that.

Ah, okay.  I doubt it's worth setting up buildfarm members testing all
cross-compiles just to try and catch possible compiler bugs that way, so
unless somebody wants to invest more effort in this area, it seems we're
done here.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Michael Paquier
Date:
On Fri, Jul 1, 2016 at 9:57 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Craig Ringer wrote:
>> On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>
>> > Hmm, so what about a pure 32bit build, if such a thing still exists?  If
>> > so and it causes the same crash, perhaps we should have one member for
>> > each VS version running on 32bit x86.
>>
>> It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
>> tested that.
>
> Ah, okay.  I doubt it's worth setting up buildfarm members testing all
> cross-compiles just to try and catch possible compiler bugs that way, so
> unless somebody wants to invest more effort in this area, it seems we're
> done here.

Sure. To be honest just using the latest version of MSVC available for
the builds is fine I think. Windows is very careful regarding
backward-compatibility of its compiled stuff usually, even if by using
VS2015 you make the builds of Postgres incompatible with XP. But
software is a world that keeps moving on, and XP is already out of
support by Redmond.
-- 
Michael



Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)

From
Craig Ringer
Date:


On 1 July 2016 at 09:02, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Fri, Jul 1, 2016 at 9:57 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Craig Ringer wrote:
> >> On 30 June 2016 at 20:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >>
> >> > Hmm, so what about a pure 32bit build, if such a thing still exists?  If
> >> > so and it causes the same crash, perhaps we should have one member for
> >> > each VS version running on 32bit x86.
> >>
> >> It's fine for a pure 32-bit build, i.e. 32-bit tools and 32-bit target. I
> >> tested that.
> >
> > Ah, okay.  I doubt it's worth setting up buildfarm members testing all
> > cross-compiles just to try and catch possible compiler bugs that way, so
> > unless somebody wants to invest more effort in this area, it seems we're
> > done here.
>
> Sure. To be honest just using the latest version of MSVC available for
> the builds is fine I think. Windows is very careful regarding
> backward-compatibility of its compiled stuff usually, even if by using
> VS2015 you make the builds of Postgres incompatible with XP. But
> software is a world that keeps moving on, and XP is already out of
> support by Redmond.

I agree. I'm happier now that we've got evidence it's a compiler bug, though.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services