Re: Crash with old Windows on new CPU

From: Christian Ullrich
Subject: Re: Crash with old Windows on new CPU
Msg-id: AM2PR06MB0690415F667B2A8864CEE893D4AA0@AM2PR06MB0690.eurprd06.prod.outlook.com
In reply to: Re: Crash with old Windows on new CPU  (Christian Ullrich <chris@chrullrich.net>)
List: pgsql-hackers
* From: Christian Ullrich

> On February 13, 2016 4:10:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > Christian Ullrich <chris@chrullrich.net> writes:

> > Lastly, I'd like to see some discussion of what side effects
> > "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> > a magic-elixir-against-crashes-with-no-downsides.
> 
> It tells the math library (in the CRT, no separate libm on Windows)
> not to use the AVX2-based implementations of log() and possibly
> other functions. AIUI, FMA means "fused multiply-add" and is
> apparently something that increases performance and accuracy in
> transcendental functions.
> 
> I can check the CRT source later today and figure out exactly what
> it does.

OK, it turns out that the CRT source MS ships is not quite as complete as I thought it was (up until 2013, at least),
so I had a look at the disassembly. When the library initializes, it checks whether the CPU supports the FMA
instructions by looking at a certain bit in the CPUID result. If that is set, it sets a flag to use the FMA
instructions. Later, in exp(), log(), pow() and the trigonometrical functions, it first checks whether that flag is set,
and if so, uses the AVX-based implementation. If the flag is not set, it falls back to an SSE2-based one. So, yes, that
function only and specifically disables the use of instructions that do not work in the problematic case.
 

The bug appears to be that it uses all manner of AVX and AVX2 instructions based only on the FMA support flag in CPUID,
even though AVX2 has its own bit there.
 

To reiterate: The problem occurs because the library only asks the CPU whether it is *able* to perform the AVX
instructions, but not whether it is *willing* to do so. In this particular situation, the former applies but not the
latter, because the CPU needs OS support (saving the XMM/YMM registers across context switches), and the OS has not
declared its support for that.
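
For illustration, here is a rough sketch of the difference between the check the CRT appears to make and a complete one. This is not the CRT's code. The function and macro names are made up for this example, and the functions take the raw register values as arguments (CPUID leaf 1 ECX, CPUID leaf 7 subleaf 0 EBX, and XCR0 as returned by XGETBV) so that the logic can be followed without running on affected hardware. The bit positions are the ones documented by Intel and AMD.

```c
#include <stdint.h>

/* CPUID leaf 1, ECX: what the CPU is *able* to do */
#define CPUID1_ECX_FMA      (1u << 12)
#define CPUID1_ECX_OSXSAVE  (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX      (1u << 28)

/* CPUID leaf 7, subleaf 0, EBX: AVX2 has its own bit */
#define CPUID7_EBX_AVX2     (1u << 5)

/* XCR0: which register state the OS has agreed to save and restore */
#define XCR0_SSE_STATE      (1u << 1)   /* XMM registers */
#define XCR0_AVX_STATE      (1u << 2)   /* upper halves of YMM registers */

/* Hypothetical model of the check described above: FMA bit only. */
static int fma_flag_only_check(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_FMA) != 0;
}

/*
 * Whether the OS is *willing*: OSXSAVE must be set, and XCR0 must show
 * both XMM and YMM state enabled. If OSXSAVE is clear, XGETBV must not
 * be executed at all, so the caller passes xcr0 = 0 in that case.
 */
static int os_saves_ymm_state(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    const uint64_t need = XCR0_SSE_STATE | XCR0_AVX_STATE;

    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return 0;
    return (xcr0 & need) == need;
}

/* Complete check: CPU supports FMA, AVX and AVX2, and the OS supports AVX. */
static int fma_avx2_usable(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx,
                           uint64_t xcr0)
{
    const uint32_t need = CPUID1_ECX_FMA | CPUID1_ECX_AVX;

    if ((cpuid1_ecx & need) != need)
        return 0;
    if (!(cpuid7_ebx & CPUID7_EBX_AVX2))
        return 0;
    return os_saves_ymm_state(cpuid1_ecx, xcr0);
}
```

On a new CPU under an old Windows, leaf 1 ECX has the FMA and AVX bits set but OSXSAVE clear, so the first function says "go ahead" while the last one says "don't".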
 

The downside to disabling the AVX implementations is a performance loss compared to using them. I ran a microbenchmark
(avg(log(x) from generate_series(1,1e8))), and the result was that with FMA enabled, it is ~5.5% faster than without.
 

-- 
Christian

