Обсуждение: Crash with old Windows on new CPU

Поиск
Список
Период
Сортировка

Crash with old Windows on new CPU

От
Christian Ullrich
Дата:
Hello,

I just found a compatibility issue when I was migrating an elderly VM to 
a new host. The VM is running Windows Server 2008 SP2, and it has the 
EDB build of PostgreSQL 9.4.5 on it. (9.4.6 behaves the same.) It is 
also not dependent on running in a VM; it would fail on the hardware as 
well.

Backends (and possibly other processes) crash at the slightest 
provocation, such as "SELECT * FROM pg_stat_activity;" or VACUUM. The 
log says either "exception 0xC0000005" (segfault) or "exception 
0xC000001D" (illegal instruction).

The interesting reason: The old host had a Core-generation CPU, which 
does not support the AVX2 instruction set. The new one has a 
Haswell-generation one, and this one does. The EDB distribution of 9.4 
was built with the Visual Studio 2013 compiler, whose CRT (aka libc) has 
a bug where it uses AVX2 instructions if the *CPU* supports them, but 
does not care whether the *OS* does, and 2008 doesn't. That support was 
added in SP1 for 7/2008R2.

There is a workaround, see 

<https://connect.microsoft.com/VisualStudio/feedback/details/811093/visual-studio-2013-rtm-c-x64-code-generation-bug-for-avx2-instructions>.

It consists of a single function call, required only if the OS does not 
support AVX2.

I just tried it, and it appears to work. If there is any interest in 
fixing this, I'll be happy to prepare a patch. (Where would be the best 
place to put a function call from <math.h> that has to be done during 
startup of each server process, on Windows only?)

Otherwise, it may be time to update the manual (15.6 Supported 
Platforms) where it says PostgreSQL "can be expected to work on these 
operating systems: [...] Windows (Win2000 SP4 and later), [...]". 
Perhaps we could add "except Windows before 7 SP1/2008R2 SP1 when 
running in x64 mode on Intel CPUs introduced after May 2013 (Haswell and 
later)"?

-- 
Christian




Re: Crash with old Windows on new CPU

От
Christian Ullrich
Дата:
* Christian Ullrich wrote:

> Backends (and possibly other processes) crash at the slightest
> provocation, such as "SELECT * FROM pg_stat_activity;" or VACUUM. The
> log says either "exception 0xC0000005" (segfault) or "exception
> 0xC000001D" (illegal instruction).
>
> The interesting reason: The old host had a Core-generation CPU, which
> does not support the AVX2 instruction set. The new one has a
> Haswell-generation one, and this one does. The EDB distribution of 9.4
> was built with the Visual Studio 2013 compiler, whose CRT (aka libc) has
> a bug where it uses AVX2 instructions if the *CPU* supports them, but
> does not care whether the *OS* does, and 2008 doesn't. That support was
> added in SP1 for 7/2008R2.

> I just tried it, and it appears to work. If there is any interest in
> fixing this, I'll be happy to prepare a patch. (Where would be the best
> place to put a function call from <math.h> that has to be done during
> startup of each server process, on Windows only?)

startup_hacks(), I think. Proposed patch attached.

-- 
Christian

Re: Crash with old Windows on new CPU

От
Robert Haas
Дата:
On Fri, Feb 12, 2016 at 7:26 PM, Christian Ullrich <chris@chrullrich.net> wrote:
> * Christian Ullrich wrote:
>> Backends (and possibly other processes) crash at the slightest
>> provocation, such as "SELECT * FROM pg_stat_activity;" or VACUUM. The
>> log says either "exception 0xC0000005" (segfault) or "exception
>> 0xC000001D" (illegal instruction).
>>
>> The interesting reason: The old host had a Core-generation CPU, which
>> does not support the AVX2 instruction set. The new one has a
>> Haswell-generation one, and this one does. The EDB distribution of 9.4
>> was built with the Visual Studio 2013 compiler, whose CRT (aka libc) has
>> a bug where it uses AVX2 instructions if the *CPU* supports them, but
>> does not care whether the *OS* does, and 2008 doesn't. That support was
>> added in SP1 for 7/2008R2.
>
>> I just tried it, and it appears to work. If there is any interest in
>> fixing this, I'll be happy to prepare a patch. (Where would be the best
>> place to put a function call from <math.h> that has to be done during
>> startup of each server process, on Windows only?)
>
> startup_hacks(), I think. Proposed patch attached.

Thanks for the report and patch.  Regrettably I haven't the Windows
knowledge to have any idea whether it's right or wrong, but hopefully
someone who knows Windows will jump in here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Crash with old Windows on new CPU

От
Christian Ullrich
Дата:
* Robert Haas wrote:

> On Fri, Feb 12, 2016 at 7:26 PM, Christian Ullrich <chris@chrullrich.net> wrote:

>> startup_hacks(), I think. Proposed patch attached.
>
> Thanks for the report and patch.  Regrettably I haven't the Windows
> knowledge to have any idea whether it's right or wrong, but hopefully
> someone who knows Windows will jump in here.

In commitfest now.

-- 
Christian





Re: Crash with old Windows on new CPU

От
Tom Lane
Дата:
Christian Ullrich <chris@chrullrich.net> writes:
> * Robert Haas wrote:
>> Thanks for the report and patch.  Regrettably I haven't the Windows
>> knowledge to have any idea whether it's right or wrong, but hopefully
>> someone who knows Windows will jump in here.

> In commitfest now.

FWIW, I'm a tad suspicious of the notion that it's our job to make this
case work.  How practical is it really to run a Windows release on
unsupported-by-Microsoft hardware --- aren't dozens of other programs
going to have the same issue?

I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
the code compiles on *exactly one* MSVC version.  Maybe that's actually
what's needed, but it sure looks fishy.  And what connection does the
build toolchain version have to the runtime environment anyway?

Likewise, how can we know that !IsWindows7SP1OrGreater() is the exactly
right runtime test?

Lastly, I'd like to see some discussion of what side effects
"_set_FMA3_enable(0);" has ... I rather doubt that it's really
a magic-elixir-against-crashes-with-no-downsides.  That would
give us some context to estimate the risks of this code executing
when it's not really needed.  Without that, I'd be worried that
this cure is worse than the disease because it breaks cases that
weren't broken before.
        regards, tom lane



Re: Crash with old Windows on new CPU

От
Christian Ullrich
Дата:
On February 13, 2016 4:10:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Christian Ullrich <chris@chrullrich.net> writes:
>> * Robert Haas wrote:
>>> Thanks for the report and patch.  Regrettably I haven't the Windows
>>> knowledge to have any idea whether it's right or wrong, but hopefully
>>> someone who knows Windows will jump in here.
>
>> In commitfest now.
>
> FWIW, I'm a tad suspicious of the notion that it's our job to make this
> case work.  How practical is it really to run a Windows release on
> unsupported-by-Microsoft hardware --- aren't dozens of other programs
> going to have the same issue?

Why would the hardware be unsupported? The problem occurs on new CPUs, not old ones, and even the OS (2008) is still in
extendedsupport until next year, IIRC. 

> I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
> the code compiles on *exactly one* MSVC version.

The bug exists in only that compiler version's CRT, also, that is not the complete version number. There may be
differentbuilds somewhere, but they all start with 18.0.  

After all, MS is in the business of selling new compilers, not maintaining the old ones.

> Maybe that's actually
> what's needed, but it sure looks fishy.  And what connection does the
> build toolchain version have to the runtime environment anyway?

The CRT version is tied to the compiler version. It has mainly to do with matching allocators.

> Likewise, how can we know that !IsWindows7SP1OrGreater() is the exactly
> right runtime test?

Because all sources, including Microsoft, say that AVX2 support was added in 7SP1.

> Lastly, I'd like to see some discussion of what side effects
> "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> a magic-elixir-against-crashes-with-no-downsides.

It tells the math library (in the CRT, no separate libm on Windows) not to use the AVX2-based implementations of log()
andpossibly other functions. AIUI, FMA means "fused multiply-add" and is apparently something that increases
performanceand accuracy in transcendental functions. 

I can check the CRT source later today and figure out exactly what it does.

Also, if you look at the link I sent, you will find that a member of the Visual C++ Libraries team at MS is the source
forthe workaround. They probably know what they are doing, present circumstances excepted.  

> That would
> give us some context to estimate the risks of this code executing
> when it's not really needed.

Hence all the conditions. The problem is *certain* to occur under these specific conditions (x64 code on Windows before
7SP1on a CPU with AVX2 when built with VS2013), and under no others, and these conditions flip the switch exactly then.


> Without that, I'd be worried that
> this cure is worse than the disease because it breaks cases that
> weren't broken before.

Isn't that what the buildfarm is (among other things) for?

--
Christian Ullrich



Re: Crash with old Windows on new CPU

От
Yury Zhuravlev
Дата:
Tom Lane wrote:
> Lastly, I'd like to see some discussion of what side effects
> "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> a magic-elixir-against-crashes-with-no-downsides.  That would
> give us some context to estimate the risks of this code executing
> when it's not really needed.  Without that, I'd be worried that
> this cure is worse than the disease because it breaks cases that
> weren't broken befor

I think I have already proposed to drop MSVC support? :)
Too many problems due to of one compiler...

PS Cry from the heart. I work too much with him. Never mind.

--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Crash with old Windows on new CPU

От
Christian Ullrich
Дата:
* From: Christian Ullrich

> On February 13, 2016 4:10:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > Christian Ullrich <chris@chrullrich.net> writes:

> > Lastly, I'd like to see some discussion of what side effects
> > "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> > a magic-elixir-against-crashes-with-no-downsides.
> 
> It tells the math library (in the CRT, no separate libm on Windows)
> not to use the AVX2-based implementations of log() and possibly
> other functions. AIUI, FMA means "fused multiply-add" and is
> apparently something that increases performance and accuracy in
> transcendental functions.
> 
> I can check the CRT source later today and figure out exactly what
> it does.

OK, it turns out that the CRT source MS ships is not quite as complete as I thought it was (up until 2013, at least),
soI had a look at the disassembly. When the library initializes, it checks whether the CPU supports the FMA
instructionsby looking at a certain bit in the CPUID result. If that is set, it sets a flag to use the FMA
instructions.Later, in exp(), log(), pow() and the trigonometrical functions, it first checks whether that flag is set,
andif so, uses the AVX-based implementation. If the flag is not set, it falls back to an SSE2-based one. So, yes, that
functiononly and specifically disables the use of instructions that do not work in the problematic case.
 

The bug appears to be that it uses all manner of AVX and AVX2 instructions based only on the FMA support flag in CPUID,
eventhough AVX2 has its own bit there.
 

To reiterate: The problem occurs because the library only asks the CPU whether it is *able* to perform the AVX
instructions,but not whether it is *willing* to do so. In this particular situation, the former applies but not the
latter,because the CPU needs OS support (saving the XMM/YMM registers across context switches), and the OS has not
declaredits support for that.
 

The downside to disabling the AVX implementations is a performance loss compared to using it. I ran a microbenchmark
(avg(log(x)from generate_series(1,1e8))), and the result was that with FMA enabled, it is ~5.5% faster than without.
 

-- 
Christian


Re: Crash with old Windows on new CPU

От
Craig Ringer
Дата:
On 13 February 2016 at 23:45, Christian Ullrich <chris@chrullrich.net> wrote:
 

> Maybe that's actually
> what's needed, but it sure looks fishy.  And what connection does the
> build toolchain version have to the runtime environment anyway?

The CRT version is tied to the compiler version. It has mainly to do with matching allocators.

In UNIX terms, the compiler and libc are tightly coupled on Windows and each executable or shared library links to the C library provided by the compiler that built the DLL or EXE. It's normal to link a program with multiple C runtime libraries if that program is composed of one or more DLLs that were each compiled by a different toolchain and therefore linked to a different C runtime ... which is common in the Windows binary distribution model.

"msvcrt.dll" is the best-known C runtime, the one bundled in Windows. It's ancient and buggy and retained only for backward compatibility. mingw has to pile hacks upon hacks around it to use it.

Microsoft has been bundling the C runtime libraries with the compiler since at least Visual Studio 7. Application installers are expected to silently run an installer for the C runtime library before installing the application. They do not use the C runtime library from the operating system.

You might have seen "Visual Studio Redistributable" packages installed by various app installers. That's what they are - they're making sure the specific C runtime they are built against is installed.

It's a bit like, on UNIX, each shared library expecting to have a different libc. One expects libc-2.20.so, one expects libc-2.19.so, etc. Each with linkage private to the shared library that loaded it. (We have this to a lesser degree with libgcc, but everyone uses one version of gcc for building everything on a given system so in practice it makes no difference.)

This is why applications on Windows must be very careful not to malloc() memory in one DLL and free() it in another, etc. They're effectively leaking it and freeing unallocated memory at the same time because the two C libraries have separate memory management structures.

It lets applications be composed of binary-distributed components from different vendors, using different toolchains. Want to build new parts of an app with VS 2017 while compiling your legacy components with VS 7, linking with a libabc built using mingw32 by a vendor who won't show you the source, and linking a libxyz built with an ancient version of Borland C from sources you lost 5 years ago? You can do that.

I don't pretend it's a good idea, but it does have advantages sometimes.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Crash with old Windows on new CPU

От
Peter Eisentraut
Дата:
On 2/12/16 11:24 AM, Christian Ullrich wrote:
> Otherwise, it may be time to update the manual (15.6 Supported
> Platforms) where it says PostgreSQL "can be expected to work on these
> operating systems: [...] Windows (Win2000 SP4 and later), [...]".
> Perhaps we could add "except Windows before 7 SP1/2008R2 SP1 when
> running in x64 mode on Intel CPUs introduced after May 2013 (Haswell and
> later)"?

Wouldn't the fix be for users to upgrade their service packs?



Re: Crash with old Windows on new CPU

От
Christian Ullrich
Дата:
* Peter Eisentraut wrote:

> On 2/12/16 11:24 AM, Christian Ullrich wrote:

>> Otherwise, it may be time to update the manual (15.6 Supported
>> Platforms) where it says PostgreSQL "can be expected to work on these
>> operating systems: [...] Windows (Win2000 SP4 and later), [...]".
>> Perhaps we could add "except Windows before 7 SP1/2008R2 SP1 when
>> running in x64 mode on Intel CPUs introduced after May 2013 (Haswell and
>> later)"?

> Wouldn't the fix be for users to upgrade their service packs?

Windows 2008 and 2008R2 are entirely different things: 2008 is the 
server sibling of Vista (internal version 6.0), 2008R2 is that of 
Windows 7 (6.1). There is no version of 2008 that supports AVX2.

-- 
Christian




Re: Crash with old Windows on new CPU

От
Magnus Hagander
Дата:


On Sat, Feb 13, 2016 at 4:45 PM, Christian Ullrich <chris@chrullrich.net> wrote:
On February 13, 2016 4:10:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Christian Ullrich <chris@chrullrich.net> writes:
>> * Robert Haas wrote:
>>> Thanks for the report and patch.  Regrettably I haven't the Windows
>>> knowledge to have any idea whether it's right or wrong, but hopefully
>>> someone who knows Windows will jump in here.
>
>> In commitfest now.
>
> FWIW, I'm a tad suspicious of the notion that it's our job to make this
> case work.  How practical is it really to run a Windows release on
> unsupported-by-Microsoft hardware --- aren't dozens of other programs
> going to have the same issue?

Why would the hardware be unsupported? The problem occurs on new CPUs, not old ones, and even the OS (2008) is still in extended support until next year, IIRC.

> I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
> the code compiles on *exactly one* MSVC version.

The bug exists in only that compiler version's CRT, also, that is not the complete version number. There may be different builds somewhere, but they all start with 18.0.


IIRC, the CRT itself does explicit checks against _MSC_VER == 1800. As in, they don't actually bump that number in different build numbers.


How does this work wrt mingw, though? Do we have the same problem there? AIUI this code can never run on mingw, correct?


> Lastly, I'd like to see some discussion of what side effects
> "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> a magic-elixir-against-crashes-with-no-downsides.

It tells the math library (in the CRT, no separate libm on Windows) not to use the AVX2-based implementations of log() and possibly other functions. AIUI, FMA means "fused multiply-add" and is apparently something that increases performance and accuracy in transcendental functions.

I can check the CRT source later today and figure out exactly what it does.

Also, if you look at the link I sent, you will find that a member of the Visual C++ Libraries team at MS is the source for the workaround. They probably know what they are doing, present circumstances excepted.

I notice the code checks IsWindows7SP1OrGreater() but the comment refers to W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually be a separate check for server-windows?


 
> That would
> give us some context to estimate the risks of this code executing
> when it's not really needed.

Hence all the conditions. The problem is *certain* to occur under these specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2 when built with VS2013), and under no others, and these conditions flip the switch exactly then.


Well, it doesn't flip it based on the specifics "running on a CPU with AVX2". But presumably turning of AVX2 if the CPU doesn't support it is a no-op.



> Without that, I'd be worried that
> this cure is worse than the disease because it breaks cases that
> weren't broken before.

Isn't that what the buildfarm is (among other things) for?

The buildfarm doesn't really have a big spread of Windows animals, unfortunately.
 
--

Re: Crash with old Windows on new CPU

От
Christian Ullrich
Дата:
* Magnus Hagander wrote:

> On Sat, Feb 13, 2016 at 4:45 PM, Christian Ullrich <chris@chrullrich.net>
> wrote:

>> On February 13, 2016 4:10:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

>>> I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
>>> the code compiles on *exactly one* MSVC version.
>>
>> The bug exists in only that compiler version's CRT, also, that is not the
>> complete version number. There may be different builds somewhere, but they
>> all start with 18.0.

> IIRC, the CRT itself does explicit checks against _MSC_VER == 1800. As in,
> they don't actually bump that number in different build numbers.
>
> How does this work wrt mingw, though? Do we have the same problem there?
> AIUI this code can never run on mingw, correct?

Not unless mingw defines _MSC_VER.

(If they do, I suggest we make Yury Zhuravlev cry and drop MinGW support 
instead. IMHO everyone should boycott them anyway until they come up 
with a working libc of their own instead of doing unspeakable things to 
a helpless msvcrt.dll that is not intended for use by non-system 
components at all. But I digress.)

> I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
> W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
> be a separate check for server-windows?

No, that is fine. I think it's just to keep the function name from 
getting too ridiculously long. The functions in <versionhelpers.h> are 
all named for the client versions only, and only check the version 
number, not the client/server capability flags. Or, rather, there is a 
separate function to determine that.

>>> That would
>>> give us some context to estimate the risks of this code executing
>>> when it's not really needed.
>>
>> Hence all the conditions. The problem is *certain* to occur under these
>> specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2
>> when built with VS2013), and under no others, and these conditions flip the
>> switch exactly then.

> Well, it doesn't flip it based on the specifics "running on a CPU with
> AVX2". But presumably turning of AVX2 if the CPU doesn't support it is a
> no-op.

Precisely.

>> Isn't that what the buildfarm is (among other things) for?
>
> The buildfarm doesn't really have a big spread of Windows animals,
> unfortunately.

And apparently not a single one with VS 2013. OK, I'll see what I can do 
about setting some up soonish, at least with (server) 2008 and (client) 
7. FWIW, I have a local build of 9.5.1 with this patch in daily use on 
2008 now, with no complaints.

-- 
Christian




Re: Crash with old Windows on new CPU

От
Magnus Hagander
Дата:


On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich <chris@chrullrich.net> wrote:
* Magnus Hagander wrote:

On Sat, Feb 13, 2016 at 4:45 PM, Christian Ullrich <chris@chrullrich.net>
wrote:

On February 13, 2016 4:10:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
the code compiles on *exactly one* MSVC version.

The bug exists in only that compiler version's CRT, also, that is not the
complete version number. There may be different builds somewhere, but they
all start with 18.0.

IIRC, the CRT itself does explicit checks against _MSC_VER == 1800. As in,
they don't actually bump that number in different build numbers.

How does this work wrt mingw, though? Do we have the same problem there?
AIUI this code can never run on mingw, correct?

Not unless mingw defines _MSC_VER.

The question is then - should it, under some conditions?

 
(If they do, I suggest we make Yury Zhuravlev cry and drop MinGW support instead. IMHO everyone should boycott them anyway until they come up with a working libc of their own instead of doing unspeakable things to a helpless msvcrt.dll that is not intended for use by non-system components at all. But I digress.)


Haha, well, at least they're not cygwin...

 
I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
be a separate check for server-windows?

No, that is fine. I think it's just to keep the function name from getting too ridiculously long. The functions in <versionhelpers.h> are all named for the client versions only, and only check the version number, not the client/server capability flags. Or, rather, there is a separate function to determine that.

Presumably the link is documented somewhere (the docs don't seem to say anything about it).
 
That would
give us some context to estimate the risks of this code executing
when it's not really needed.

Hence all the conditions. The problem is *certain* to occur under these
specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2
when built with VS2013), and under no others, and these conditions flip the
switch exactly then.

Well, it doesn't flip it based on the specifics "running on a CPU with
AVX2". But presumably turning of AVX2 if the CPU doesn't support it is a
no-op.

Precisely.

Isn't that what the buildfarm is (among other things) for?

The buildfarm doesn't really have a big spread of Windows animals,
unfortunately.

And apparently not a single one with VS 2013. OK, I'll see what I can do about setting some up soonish, at least with (server) 2008 and (client) 7. FWIW, I have a local build of 9.5.1 with this patch in daily use on 2008 now, with no complaints.

Having that added to the buildfarm would definitely be very useful!


--

Re: Crash with old Windows on new CPU

От
Alvaro Herrera
Дата:
Magnus Hagander wrote:
> On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich <chris@chrullrich.net>
> wrote:

> > And apparently not a single one with VS 2013. OK, I'll see what I can do
> > about setting some up soonish, at least with (server) 2008 and (client) 7.
> > FWIW, I have a local build of 9.5.1 with this patch in daily use on 2008
> > now, with no complaints.
> 
> Having that added to the buildfarm would definitely be very useful!

Yeah, ideally we would have one buildfarm member for each MSVC compiler
supported by each Windows version.  AFAICS we only have

Windows 8 8.1 Pro Visual Studio 2012 x86_64
Windows Server 2008 R2 (64bit) Visual C++ 2005 AMD64
Windows Server 2003 R2 5.2.3790 MSVC 2005 Express 8.0.50727.42 x86
Vista Ultimate 6.0.6000 MSVC 2005 Pro 8.0.50727.867 x86
Windows XP-PRO SP3 MSVC++ 2008 Express i686

which is pretty unfulfilling.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Crash with old Windows on new CPU

От
Christian Ullrich
Дата:
* Magnus Hagander wrote:

> On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich <chris@chrullrich.net>
> wrote:
>
>> * Magnus Hagander wrote:

>>> How does this work wrt mingw, though? Do we have the same problem there?
>>> AIUI this code can never run on mingw, correct?
>>
>> Not unless mingw defines _MSC_VER.
>
> The question is then - should it, under some conditions?

I have no clue about MinGW, so all I can say is that msvcrt.dll (even in 
Windows 10) does not use the problematic instructions, nor does it 
export anything with "FMA" in its name. If there is nothing to turn off, 
and nothing to turn it off with, then I suspect we are safe there. (This 
is based on a minimal test program, just to see where MinGW takes its 
log() from; it comes from msvcrt.)

Also, we have confirmation, in the link I sent when I started this 
thread, that the bug is only and specifically in the VS2013 CRT, not 
anywhere else before or since.

>>> I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
>>> W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
>>> be a separate check for server-windows?

>> No, that is fine. I think it's just to keep the function name from getting
>> too ridiculously long. The functions in <versionhelpers.h> are all named
>> for the client versions only, and only check the version number, not the
>> client/server capability flags. Or, rather, there is a separate function to
>> determine that.

> Presumably the link is documented somewhere (the docs don't seem to say
> anything about it).

<https://msdn.microsoft.com/en-us/library/windows/desktop/dn424960(v=vs.85).aspx> 
("IsWindows7SP1OrGreater function"):
> This function does not differentiate between client and server> releases. It will return true if the current OS
versionnumber is> equal to or higher than the version of the client named in the call.> For example, a call to
IsWindowsXPSP3OrGreaterwill return true on> Windows Server 2008. Applications that need to distinguish between> server
andclient versions of Windows should call IsWindowsServer.
 

The internal version numbers (since NT4) are:
 Version | Client | Server
---------+--------+--------     4.0 | NT4    | NT4     5.0 | 2000   | 2000     5.1 | XP     |     5.2 |        | 2003
 6.0 | Vista  | 2008     6.1 | 7      | 2008R2     6.2 | 8      | 2012     6.3 | 8.1    | 2012R2    10.0 | 10     |
[2016]

The relevant SDK header (<sdkddkver.h>), where constants for the version 
numbers are defined, also only has entries named for the client 
versions, with the sole exception of 2008, with the same value as Vista, 
and the special case of 2003 with no equivalent client version (unless 
you count the second attempt at an XP x64 ... but I digress again).

If Microsoft wanted the added complexity of separate symbols for what is 
basically the same code, I rather think they would have them.

-- 
Christian




Re: Crash with old Windows on new CPU

От
Magnus Hagander
Дата:


On Wed, Mar 9, 2016 at 5:48 AM, Christian Ullrich <chris@chrullrich.net> wrote:
* Peter Eisentraut wrote:

On 2/12/16 11:24 AM, Christian Ullrich wrote:

Otherwise, it may be time to update the manual (15.6 Supported
Platforms) where it says PostgreSQL "can be expected to work on these
operating systems: [...] Windows (Win2000 SP4 and later), [...]".
Perhaps we could add "except Windows before 7 SP1/2008R2 SP1 when
running in x64 mode on Intel CPUs introduced after May 2013 (Haswell and
later)"?

Wouldn't the fix be for users to upgrade their service packs?

Windows 2008 and 2008R2 are entirely different things: 2008 is the server sibling of Vista (internal version 6.0), 2008R2 is that of Windows 7 (6.1). There is no version of 2008 that supports AVX2.


Windows 2008 went out of mainstream support in January last year, but is on extended support until 2020. Extended support means both paid support and security support is still there (what you don't get is new hardware support and generic non-security related updates). That means we're going to see these versions around for a long time.  (And Vista is in extended support until 2017).

And exactly the type of upgrade scenario outlined in the OP is only going to be come more common as old hardware gets replaced. If we didn't patch this, the reasonable thing would be to say we don't support Visual Studio 2013, rather than a specific version of Windows, I think.

Given the small and localized change of this (and hey it even goes inside a function labeled hacks), I definitely think it's worth doing.

I've commited this one, including a reference to the MS bug report where they say they're not going to fix it in existing versions. I didn't actually test it on mingw, but as it doesn't define MSC_VER it shouldn't be affected (by the bug or by the patch).

I did notice the #ifdef's are actually different in the header and body section of the patch, which seems wrong. I used the one from the actual implementation (_M_AMD64) for the header includes as, and also merged the #ifdef's together to a single #if in each section. Please verify!

Thanks for a very good analysis and patch, and for good explanations of the details! :)

--

Re: Crash with old Windows on new CPU

От
Christian Ullrich
Дата:
* Magnus Hagander wrote:

> I did notice the #ifdef's are actually different in the header and body
> section of the patch, which seems wrong. I used the one from the actual
> implementation (_M_AMD64) for the header includes as, and also merged the
> #ifdef's together to a single #if in each section. Please verify!

Should be fine. I remember I had a reason for doing it this way, but it 
can't have been a very good one. Thanks for cleaning it up.

-- 
Christian




New Windows animal, mostly ecpg trouble (was: Crash with old Windows on new CPU)

От
Christian Ullrich
Дата:
* Alvaro Herrera wrote:

> Magnus Hagander wrote:

>> On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich <chris@chrullrich.net>
>> wrote:

>>> And apparently not a single one with VS 2013. OK, I'll see what I can do
>>> about setting some up soonish, at least with (server) 2008 and (client) 7.
>>> FWIW, I have a local build of 9.5.1 with this patch in daily use on 2008
>>> now, with no complaints.
>>
>> Having that added to the buildfarm would definitely be very useful!
>
> Yeah, ideally we would have one buildfarm member for each MSVC compiler
> supported by each Windows version.  AFAICS we only have
>
> Windows 8 8.1 Pro Visual Studio 2012 x86_64
> Windows Server 2008 R2 (64bit) Visual C++ 2005 AMD64
> Windows Server 2003 R2 5.2.3790 MSVC 2005 Express 8.0.50727.42 x86
> Vista Ultimate 6.0.6000 MSVC 2005 Pro 8.0.50727.867 x86
> Windows XP-PRO SP3 MSVC++ 2008 Express i686

Add to that Windows 7 with Visual Studio 2013 (x64), named woodlouse, 
and there will be another one sometime over the weekend. Setting up 
Windows animals is a major pain; no wonder so few people run any.

I spent some hours getting rid of random linker errors in ecpg 
(kernel32.lib not found, msvcrt.lib not found, or unresolved symbols 
that should be in one of the two); they finally went away when I stopped 
defining the build environment in build-farm.conf and ran vcvarsall.bat 
instead (this was probably pilot error; I think I had a mix of x86 and 
x64 paths there). At that point, I managed one successful run on HEAD. 
Things started failing in ecpg again after that, with crashes of the 
test programs for the thread-thread and thread-prep test cases, 
apparently somewhere in setlocale(). I had to --skip-steps=ecpg-check 
because the crashes cause blocking UI.

The other troublemaker is plperl; it produces a wall of compiler errors 
which I cleverly did not copy, most are about something not being in a 
formal parameter list. Some preprocessor trouble, I suspect. This is 
with both ActiveState and Strawberry Perl.

I did not try Tcl, and plpython3 apparently builds, but I cannot see any 
tests being run.

-- 
Christian