[HACKERS] Non-deterministic behavior with floating point in parallel mode

From Ruben Buchatskiy
Subject [HACKERS] Non-deterministic behavior with floating point in parallel mode
Date
Msg-id CAFRJ5K0+ZZaUz0-ihX-aCj1h42H=s-CLWO+2Fb6nHCvXx19Diw@mail.gmail.com
Responses Re: [HACKERS] Non-deterministic behavior with floating point in parallel mode  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi hackers,

We have found that in parallel mode the results of queries are non-deterministic when the table attributes are of type double precision (floating point).

Our example is based on TPC-H, but the type of some NUMERIC columns was changed to DOUBLE PRECISION.

When running without parallelism:

tpch=# set max_parallel_workers_per_gather to 0;
SET
tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date '1998-12-01' - interval '116 days';
       sum        
------------------
 448157055361.319
(1 row)

the output is always the same.

But in parallel mode:

tpch=# set max_parallel_workers_per_gather to 1;
SET
tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date '1998-12-01' - interval '116 days';
       sum        
------------------
 448157055361.341
(1 row)

tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date '1998-12-01' - interval '116 days';
       sum       
-----------------
 448157055361.348
(1 row)

the result differs between runs.

That is because floating-point addition is not necessarily associative. That is, (a + b) + c is not necessarily equal to a + (b + c).
In parallel mode, the order in which the attribute values are added (summed) changes between runs, which leads to non-deterministic results.
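
The effect can be reproduced outside PostgreSQL. The following is only a minimal Python sketch, not the executor's actual code: the helper names (partial_sum, chunked_sum) and the four sample values are made up for illustration, and the "workers" are simulated by summing fixed groups of rows and then adding the partial sums.

```python
import math

# Floating-point addition is not associative: the grouping changes the result.
left = (0.1 + 0.2) + 0.3    # 0.6000000000000001
right = 0.1 + (0.2 + 0.3)   # 0.6

def partial_sum(chunk):
    """Plain left-to-right accumulation, like a single serial aggregate."""
    total = 0.0
    for x in chunk:
        total += x
    return total

def chunked_sum(values, chunks):
    """Simulate workers: sum each group of row indexes, then add the partials.

    `chunks` is a hypothetical assignment of row indexes to workers.
    """
    partials = [partial_sum([values[i] for i in idx]) for idx in chunks]
    return partial_sum(partials)

# Illustrative values, chosen so that rounding is easy to follow by hand.
values = [1e16, 1.0, -1e16, 1.0]

serial = partial_sum(values)                       # 1.0
split_a = chunked_sum(values, [[0, 1], [2, 3]])    # 0.0
split_b = chunked_sum(values, [[0, 2], [1, 3]])    # 2.0
exact = math.fsum(values)                          # 2.0, correctly rounded

print(left == right)                    # False
print(serial, split_a, split_b, exact)  # 1.0 0.0 2.0 2.0
```

The same four values give three different sums depending only on how the rows are grouped, which corresponds to what happens when the distribution of rows among parallel workers changes from run to run.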

Is this desirable behavior?

--
Best Regards,
Ruben. <ruben@ispras.ru>
ISP RAS.
