Обсуждение: [HACKERS] [PATCH] Improve geometric types

Поиск
Список
Период
Сортировка

[HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
This is my next attempt to bring more sanity to the geometric types.
After the previous one [1] went nowhere, I extracted the parts I can
fix without touching the EPSILON.  I still hope to improve the
problems around EPSILON after this.  I organised the changes as 4
patches for ease of review:


== geo-funcs-v1 ==

Refactor geometric functions and operators code

The geometric types were not using each other's functions.  I believe
the reason behind this is simpler types line point and line being
developed after more complicated ones.  This patch reduces duplicate
code and makes functions of different datatypes more compatible.  We can
do much better than that, but it would require touching many more lines.
The changes can be summarised as:

* Re-use more functions to implement others
* Unify *_construct functions to obtain pre-allocated memory
* Unify *_interpt_internal functions to obtain pre-allocated memory
* Remove private functions from geo_decls.h
* Switch using C11 hypot() as the comment suggested

 src/backend/utils/adt/geo_ops.c | 810 +++++++++++++-------------------
 src/include/utils/geo_decls.h   |  12 +-
 src/test/regress/regress.c      |  11 +-
 3 files changed, 345 insertions(+), 488 deletions(-)


== float-header-v04 ==

Provide header file for built-in float datatypes

Even though, some datatypes under adt/ have separate header files,
most of the simple ones do not.  Their public functions were on
the builtins.h.  We would need to make more functions of floats public
to let the geometric types built on top of them.  This is a good
opportunity to make a separate header file for floats.

1acf7572554515b99ef6e783750aaea8777524ec made _cmp functions public
to solve NaN problem locally for GiST indexes.  This patch reworks it
in favour of a more extensive API.  Kevin Grittner suggested to design
the API using inline functions.  They are easier to use compared
to macros, and avoid double-evaluation hazards.

 contrib/btree_gin/btree_gin.c                 |   3 +-
 contrib/btree_gist/btree_ts.c                 |   2 +-
 contrib/cube/cube.c                           |   2 +-
 contrib/postgres_fdw/postgres_fdw.c           |   2 +-
 src/backend/access/gist/gistget.c             |   2 +-
 src/backend/access/gist/gistproc.c            |  55 +-
 src/backend/access/gist/gistutil.c            |   2 +-
 src/backend/utils/adt/float.c                 | 593 ++++--------------
 src/backend/utils/adt/formatting.c            |   8 +-
 src/backend/utils/adt/geo_ops.c               |   6 +-
 src/backend/utils/adt/geo_spgist.c            |   2 +-
 src/backend/utils/adt/numeric.c               |   1 +
 src/backend/utils/adt/rangetypes_gist.c       |   3 +-
 src/backend/utils/adt/rangetypes_selfuncs.c   |   3 +-
 src/backend/utils/adt/rangetypes_typanalyze.c |   3 +-
 src/backend/utils/adt/timestamp.c             |   1 +
 src/backend/utils/misc/guc.c                  |   1 +
 src/include/utils/builtins.h                  |  14 -
 src/include/utils/float.h                     | 383 +++++++++++
 src/include/utils/geo_decls.h                 |   1 +
 20 files changed, 561 insertions(+), 526 deletions(-)


== geo-float-v1 ==

Use the built-in float datatype to implement geometric types

This will provide:

* Check for underflow and overflow
* Check for division by zero
* Handle NaNs consistently

The patch also replaces all occurrences of "double" as "float8".  They
are the same, but were randomly spread around on the same file.

 src/backend/access/gist/gistproc.c | 156 +++++----
 src/backend/utils/adt/geo_ops.c    | 546 +++++++++++++++--------------
 src/backend/utils/adt/geo_spgist.c |  36 +-
 src/include/utils/float.h          |  13 +
 src/include/utils/geo_decls.h      |  40 +--
 5 files changed, 412 insertions(+), 379 deletions(-)


== line-fixes-v1 ==

Fix obvious problems around the line datatype

I have noticed some line operators retuning wrong results, and Tom Lane
spotted similar problems on more places.  Source history reveals that
during 1990s, the internal format of the line datatype is changed, but
most functions haven't got the hint.  The fixes are:

* Make operators more symmetric
* Reject invalid specification A=B=0 on receive
* Avoid division by zero on perpendicular operator
* Fix intersection and distance operators when neither A nor B is 1
* Avoid point distance operator crashing on float precision loss
* Fix line segment distance by getting the minimum as the comment suggested

Previous discussion:

https://www.postgresql.org/message-id/flat/CAE2gYzw_-z%3DV2kh8QqFjenu%3D8MJXzOP44wRW%3DAzzeamrmTT1%3DQ%40mail.gmail.com

 src/backend/utils/adt/geo_ops.c | 115 +++++++++++++++++++++-----------
 1 file changed, 77 insertions(+), 38 deletions(-)


[1] https://www.postgresql.org/message-id/flat/CAE2gYzwwxPWbzxY3mtN4WL7W0DCkWo8gnB2ThUHU2XQ9XwgHMg%40mail.gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] [PATCH] Improve geometric types

От
Aleksander Alekseev
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            not tested

Hi Emre,

I'm afraid these patches conflict with current master branch.

The new status of this patch is: Waiting on Author

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> I'm afraid these patches conflict with current master branch.

Rebased.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] [PATCH] Improve geometric types

От
Aleksander Alekseev
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            not tested

PostgreSQL fails with SIGSEGV during `make check-world`.

Backtrace: http://afiskon.ru/s/d4/f3dc17838a_sigsegv.txt
regression.diffs (not very useful): http://afiskon.ru/s/ac/ac5294656c_regression.diffs.txt
regression.out: http://afiskon.ru/s/70/39d872e2b8_regression.out.txt
How to reproduce: https://github.com/afiskon/pgscripts/blob/master/full-build.sh

The environment is Arch Linux x64, gcc 7.1.1

The new status of this patch is: Waiting on Author

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> PostgreSQL fails with SIGSEGV during `make check-world`.

Fixed.  I am sorry for not running "check-world" before.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] [PATCH] Improve geometric types

От
Aleksander Alekseev
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

LGTM.

The new status of this patch is: Ready for Committer

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello, sorry to late for the party, but may I comment on this?

At Tue, 05 Sep 2017 13:18:12 +0000, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote in
<20170905131812.18925.13551.pgcf@coridan.postgresql.org>
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:       tested, passed
> Spec compliant:           tested, passed
> Documentation:            tested, passed
> 
> LGTM.
> 
> The new status of this patch is: Ready for Committer

The first patch reconstructs the operators in layers. These
functions are called very frequently when used. Some function are
already inlined in float.h but some static functions in float.h
also can be and are better be inlined. Some of *_internal,
point_construct, line_calculate_point and so on are the
candidates.

You removed some DirectFunctionCall to the functions within the
same file but other functions remain in the style,
ex. poly_center or on_sl. The function called from the former
seems large enough but the latter function calls a so small
function that it could be inlined. Would you like to make some
additional functions use C call (instead of DirectFunctionCall)
and inlining them?

This is not a fault of this patch, but some functions like on_pb
seems missing comment to describe what it is. Would you like to
add some?


In the second patch, the additional include fmgrprotos.h in
btree_gin.c seems needless. Some float[48] features were macros
so that they share the same expressions between float4 and
float8. They still seems sharing perfectly the same expressions
in float.h. Is there any reason for converting them into typed
inline functions?

In float.h, MAXDOUBLEWIDTH is redueced from 500 to 128, but the
exponent of double is up to 308 so it doesn't seem sufficient. On
the other hand we won't use non-scientific notation for extremely
large numbers and it requires (perhaps) up to 26 bytes in the
case. In the soruce code, most of them uses "%e" and one of them
uses '%g". %e always takes the format of
"-1.(17digits)e+308".. So it would be less than 26
characters. 

=# set extra_float_digits to 3;
=# select -1.221423424320453e308::float8;        ?column?          
----------------------------1.22142342432045302e+308

man printf: (linux)
> Style e is used if the exponent from its conversion is less than
> -4 or greater than or equal to the precision.

So we should be safe to have a buffer with 26 byte length and 500
bytes will apparently too large and even 128 will be too loose in
most cases. So how about something like the following?

#define MINDOUBLEWIDTH 32
...
float4out@float.c<modified>:
>    int      ndig = FLT_DIG + extra_float_digits;
> 
>    if (ndig < 1)
>       ndig = 1;
> 
>    len = snprintf(ascii, MINDOUBLEWIDTH + 1, "%+.*g", ndig, num);
>     if (len > MINDOUBLEWIDTH + 1)
>    {
>        ascii = (char *) repalloc(ascii, len);
>        if (snprintf(ascii, len, "%+.*e", ndig, num) > len)
>           error(ERROR, "something wrong happens...");
>     }

I don't think the if part can be used so there would be no
performance degradation, I believe.


----
I'd like to pause here.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> Hello, sorry to late for the party, but may I comment on this?

Thank you for picking this up again.

> The first patch reconstructs the operators in layers. These
> functions are called very frequently when used. Some function are
> already inlined in float.h but some static functions in float.h
> also can be and are better be inlined. Some of *_internal,
> point_construct, line_calculate_point and so on are the
> candidates.

They are static functions.  I though compiler can decide to inline
them.  Do you think adding "inline" to the function signatures are
necessary?

> You removed some DirectFunctionCall to the functions within the
> same file but other functions remain in the style,
> ex. poly_center or on_sl. The function called from the former
> seems large enough but the latter function calls a so small
> function that it could be inlined. Would you like to make some
> additional functions use C call (instead of DirectFunctionCall)
> and inlining them?

I tried to minimise my changes to make reviewing easier.  I can make
"_internal" functions for the remaining DirectFunctionCall()s, if you
find it necessary.

> This is not a fault of this patch, but some functions like on_pb
> seems missing comment to describe what it is. Would you like to
> add some?

I will add some on the next version.

> In the second patch, the additional include fmgrprotos.h in
> btree_gin.c seems needless.

It must be something I missed on rebase.  I will remove it.

> Some float[48] features were macros
> so that they share the same expressions between float4 and
> float8. They still seems sharing perfectly the same expressions
> in float.h. Is there any reason for converting them into typed
> inline functions?

Kevin Grittner suggested using inline functions instead of macros.
They are easier to use compared to macros, and avoid double-evaluation
hazards.

> In float.h, MAXDOUBLEWIDTH is redueced from 500 to 128, but the
> exponent of double is up to 308 so it doesn't seem sufficient. On
> the other hand we won't use non-scientific notation for extremely
> large numbers and it requires (perhaps) up to 26 bytes in the
> case. In the soruce code, most of them uses "%e" and one of them
> uses '%g". %e always takes the format of
> "-1.(17digits)e+308".. So it would be less than 26
> characters.
>
> =# set extra_float_digits to 3;
> =# select -1.221423424320453e308::float8;
>          ?column?
> ---------------------------
>  -1.22142342432045302e+308
>
> man printf: (linux)
>> Style e is used if the exponent from its conversion is less than
>> -4 or greater than or equal to the precision.
>
> So we should be safe to have a buffer with 26 byte length and 500
> bytes will apparently too large and even 128 will be too loose in
> most cases. So how about something like the following?
>
> #define MINDOUBLEWIDTH 32

Should it be same for float4 and float8?

> ...
> float4out@float.c<modified>:
>>    int      ndig = FLT_DIG + extra_float_digits;
>>
>>    if (ndig < 1)
>>       ndig = 1;
>>
>>    len = snprintf(ascii, MINDOUBLEWIDTH + 1, "%+.*g", ndig, num);
>>     if (len > MINDOUBLEWIDTH + 1)
>>    {
>>        ascii = (char *) repalloc(ascii, len);
>>        if (snprintf(ascii, len, "%+.*e", ndig, num) > len)
>>           error(ERROR, "something wrong happens...");
>>     }
>
> I don't think the if part can be used so there would be no
> performance degradation, I believe.

Wouldn't this change the output of the float datatypes?  That would be
a backwards incompatible change.

> I'd like to pause here.

I will submit new versions after your are done with your review.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Tue, 12 Sep 2017 19:30:44 +0200, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzxDRvNdhrWQa7ym423uvHLWJXkqx=BoJePYMdrKPR1Zhg@mail.gmail.com>
> > Hello, sorry to late for the party, but may I comment on this?
> 
> Thank you for picking this up again.
> 
> > The first patch reconstructs the operators in layers. These
> > functions are called very frequently when used. Some function are
> > already inlined in float.h but some static functions in float.h
> > also can be and are better be inlined. Some of *_internal,
> > point_construct, line_calculate_point and so on are the
> > candidates.
> 
> They are static functions.  I though compiler can decide to inline
> them.  Do you think adding "inline" to the function signatures are
> necessary?

C++ surely make just static functions inlined but I'm not sure C
compiler does that. "static" is a scope modifier and "inline" is
not (what kind of modifier is it?). In regard to gcc,

https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/Inline.html#Inline

> By declaring a function inline, you can direct GCC to make
> calls to that function faster. <snip> You can also direct GCC
> to try to integrate all “simple enough” functions into their
> callers with the option -finline-functions.

I didn't find that postgresql uses the options so I think we
shouldn't expect that they are inlined automatically.

> > You removed some DirectFunctionCall to the functions within the
> > same file but other functions remain in the style,
> > ex. poly_center or on_sl. The function called from the former
> > seems large enough but the latter function calls a so small
> > function that it could be inlined. Would you like to make some
> > additional functions use C call (instead of DirectFunctionCall)
> > and inlining them?
> 
> I tried to minimise my changes to make reviewing easier.  I can make
> "_internal" functions for the remaining DirectFunctionCall()s, if you
> find it necessary.

Ok, it would be another patch even if we need that.

> > This is not a fault of this patch, but some functions like on_pb
> > seems missing comment to describe what it is. Would you like to
> > add some?
> 
> I will add some on the next version.
> 
> > In the second patch, the additional include fmgrprotos.h in
> > btree_gin.c seems needless.
> 
> It must be something I missed on rebase.  I will remove it.
> 
> > Some float[48] features were macros
> > so that they share the same expressions between float4 and
> > float8. They still seems sharing perfectly the same expressions
> > in float.h. Is there any reason for converting them into typed
> > inline functions?
> 
> Kevin Grittner suggested using inline functions instead of macros.
> They are easier to use compared to macros, and avoid double-evaluation
> hazards.

I recall a bit about the double-evaluation hazards. I think the
functions needs a comment describing the reasons so that anyone
kind won't try to merge them into a macro again.

> > In float.h, MAXDOUBLEWIDTH is redueced from 500 to 128, but the
> > exponent of double is up to 308 so it doesn't seem sufficient. On
> > the other hand we won't use non-scientific notation for extremely
> > large numbers and it requires (perhaps) up to 26 bytes in the
> > case. In the soruce code, most of them uses "%e" and one of them
> > uses '%g". %e always takes the format of
> > "-1.(17digits)e+308".. So it would be less than 26
> > characters.
> >
> > =# set extra_float_digits to 3;
> > =# select -1.221423424320453e308::float8;
> >          ?column?
> > ---------------------------
> >  -1.22142342432045302e+308
> >
> > man printf: (linux)
> >> Style e is used if the exponent from its conversion is less than
> >> -4 or greater than or equal to the precision.
> >
> > So we should be safe to have a buffer with 26 byte length and 500
> > bytes will apparently too large and even 128 will be too loose in
> > most cases. So how about something like the following?
> >
> > #define MINDOUBLEWIDTH 32
> 
> Should it be same for float4 and float8?

Strictly they are different each other but float8 needs 26 bytes
(additional 1 byte is the sign for expnent, which I forgot.) and
float4 needs 15 bytes so it could be reduced to 32 and 16
respectively.

| select -1.11111111111111111111111111e-38::float4;
|  -1.11111113e-38
| select -1.11111111111111111111111111e-4::float4;
|  -0.000111111112
| select -1.11111111111111111111111111e-5::float4;
|  -1.11111112e-05

On the other hand float4 is anyway converted into double in
float4out and I'm not sure that the extension doesn't adds
spurious digits. Therefore I think it would be reasonable that
"%g" formatter requires up to 27 bytes (26 + terminating zero) so
it should use MINDOUBLEWIDTH regardless of the precision of the
value given to the formatter.

Addition to that if we are deliberately using %g in float4out, we
can use flot8out_internal instead of most of the stuff of
float4out.

| Datum
| float4out(PG_FUNCTION_ARGS)
| {
|     float8 num = PG_GETARG_FLOAT4(0);
| 
|     PG_RETURN_CSTRING(float8out_internal(num));
| }

> > ...
> > float4out@float.c<modified>:
> >>    int      ndig = FLT_DIG + extra_float_digits;
> >>
> >>    if (ndig < 1)
> >>       ndig = 1;
> >>
> >>    len = snprintf(ascii, MINDOUBLEWIDTH + 1, "%+.*g", ndig, num);
> >>     if (len > MINDOUBLEWIDTH + 1)
> >>    {
> >>        ascii = (char *) repalloc(ascii, len);
> >>        if (snprintf(ascii, len, "%+.*e", ndig, num) > len)
> >>           error(ERROR, "something wrong happens...");
> >>     }
> >
> > I don't think the if part can be used so there would be no
> > performance degradation, I believe.
> 
> Wouldn't this change the output of the float datatypes?  That would be
> a backwards incompatible change.

It just reduces the unused part after terminator \0, and we won't
get incompatible result.

> > I'd like to pause here.
> 
> I will submit new versions after your are done with your review.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Robert Haas
Дата:
On Thu, Sep 14, 2017 at 3:33 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I recall a bit about the double-evaluation hazards. I think the
> functions needs a comment describing the reasons so that anyone
> kind won't try to merge them into a macro again.

I think we can count on PostgreSQL developers to understand the
advantages of an inline function over a macro.  Even if they don't,
the solution can't be to put a comment in every place where an inline
function is used explaining it.  That would be very repetitive.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Thu, 14 Sep 2017 16:19:13 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmobinBA7uvQifYaYGdDUoF6VTo56dvoTT6nKSpJF-Zfv5A@mail.gmail.com>
> On Thu, Sep 14, 2017 at 3:33 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > I recall a bit about the double-evaluation hazards. I think the
> > functions needs a comment describing the reasons so that anyone
> > kind won't try to merge them into a macro again.
> 
> I think we can count on PostgreSQL developers to understand the
> advantages of an inline function over a macro.  Even if they don't,
> the solution can't be to put a comment in every place where an inline
> function is used explaining it.  That would be very repetitive.

Of course putting such a comment to all inline functions is
silly. The point here is that many pairs of two functions with
exactly the same shape but handle different types are defined
side by side. Such situation seems tempting to merge them into
single macros, as the previous author did there.

So a simple one like the following would be enough.

/* don't merge the following same functions with different types  into single macros so that double evaluation won't
happen*/
 

Is it still too verbose?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Fri, 15 Sep 2017 17:23:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20170915.172328.97446299.horiguchi.kyotaro@lab.ntt.co.jp>
> At Thu, 14 Sep 2017 16:19:13 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmobinBA7uvQifYaYGdDUoF6VTo56dvoTT6nKSpJF-Zfv5A@mail.gmail.com>
> > On Thu, Sep 14, 2017 at 3:33 AM, Kyotaro HORIGUCHI
> > <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > > I recall a bit about the double-evaluation hazards. I think the
> > > functions needs a comment describing the reasons so that anyone
> > > kind won't try to merge them into a macro again.
> > 
> > I think we can count on PostgreSQL developers to understand the
> > advantages of an inline function over a macro.  Even if they don't,
> > the solution can't be to put a comment in every place where an inline
> > function is used explaining it.  That would be very repetitive.
> 
> Of course putting such a comment to all inline functions is
> silly. The point here is that many pairs of two functions with
> exactly the same shape but handle different types are defined
> side by side. Such situation seems tempting to merge them into
> single macros, as the previous author did there.
> 
> So a simple one like the following would be enough.
> 
> /* don't merge the following same functions with different types
>    into single macros so that double evaluation won't happen */
> 
> Is it still too verbose?

That being said, I'm not stick on that if Robert or others think
it as needless.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello, just one point on 0001.

The patch replace pg_hypot with hypot in libc. The man page says
as follows.

man 3 hypot
>       If the result overflows, a range error occurs, and the functions return
>       HUGE_VAL, HUGE_VALF, or HUGE_VALL, respectively.
..
>ERRORS
>       See math_error(7) for information on how to determine whether an  error
>       has occurred when calling these functions.
>
>       The following errors can occur:
>
>       Range error: result overflow
>              errno  is  set  to ERANGE.  An overflow floating-point exception
>              (FE_OVERFLOW) is raised.
>
>       Range error: result underflow
>              An underflow floating-point exception (FE_UNDERFLOW) is raised.
>
>              These functions do not set errno for this case.

So, the code seems to need some amendments following to this
spec.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Robert Haas
Дата:
On Fri, Sep 15, 2017 at 4:23 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> /* don't merge the following same functions with different types
>    into single macros so that double evaluation won't happen */
>
> Is it still too verbose?

Personally, I don't think such a comment has much value, but I am not
going to spend a lot of time arguing about it.  It's really up to the
eventual committer to decide, and that probably won't be me in this
case.  My knowledge of the geometric types isn't great, and I am a tad
busy breaking^Wimproving things I do understand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Fri, 15 Sep 2017 11:25:30 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoYgg8=m9+Y54Az1r+KBpMuiEOZM_DLDF04jMP4twGR3Ng@mail.gmail.com>
> On Fri, Sep 15, 2017 at 4:23 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > /* don't merge the following same functions with different types
> >    into single macros so that double evaluation won't happen */
> >
> > Is it still too verbose?
> 
> Personally, I don't think such a comment has much value, but I am not
> going to spend a lot of time arguing about it.  It's really up to the
> eventual committer to decide, and that probably won't be me in this
> case.  My knowledge of the geometric types isn't great, and I am a tad
> busy breaking^Wimproving things I do understand.

Ok, I agree with you.

# Though it is not a issue of geometric types :p

I'll withdrow the comment. Maybe someone notices of that when
reviewing such a patch.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
The new versions of the patches are attached addressing your comments.

> C++ surely make just static functions inlined but I'm not sure C
> compiler does that.

Thank you for your explanation.  I marked the mentioned functions "inline".

> So we should be safe to have a buffer with 26 byte length and 500
> bytes will apparently too large and even 128 will be too loose in
> most cases. So how about something like the following?
>
> #define MINDOUBLEWIDTH 32

I left this part out for now.  We can improve it separately.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> The patch replace pg_hypot with hypot in libc. The man page says
> as follows.
>
> man 3 hypot
>>       If the result overflows, a range error occurs, and the functions return
>>       HUGE_VAL, HUGE_VALF, or HUGE_VALL, respectively.
> ..
>>ERRORS
>>       See math_error(7) for information on how to determine whether an  error
>>       has occurred when calling these functions.
>>
>>       The following errors can occur:
>>
>>       Range error: result overflow
>>              errno  is  set  to ERANGE.  An overflow floating-point exception
>>              (FE_OVERFLOW) is raised.
>>
>>       Range error: result underflow
>>              An underflow floating-point exception (FE_UNDERFLOW) is raised.
>>
>>              These functions do not set errno for this case.
>
> So, the code seems to need some amendments following to this
> spec.

I included them on the latest version.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello. Thank you for the new version.

0001: applies cleanly. regress passed.     this mainly refactoring geo_ops.c and replacing pg_hypot with hypot(3).
0002: applies cleanly. regress passed.     this just replaces float-ops macros into inline functions.
0003: applies cleanly. regress passed.     replaces double with float8 and bare arithmetic with defined functions.
0004: applies cleanly. regress passsed.     fix broken line-related functions.     I have some comments on this
(later).

At Wed, 27 Sep 2017 17:44:52 +0200, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzz2=335XEMHK-neNo=HgGskkPHQxUXkh8yDNsZAnCB5Bg@mail.gmail.com>
> The new versions of the patches are attached addressing your comments.
> 
> > C++ surely make just static functions inlined but I'm not sure C
> > compiler does that.
> 
> Thank you for your explanation.  I marked the mentioned functions "inline".

Thanks.

> > So we should be safe to have a buffer with 26 byte length and 500
> > bytes will apparently too large and even 128 will be too loose in
> > most cases. So how about something like the following?
> >
> > #define MINDOUBLEWIDTH 32
> 
> I left this part out for now.  We can improve it separately.

Agreed. I found that psprintf does that with initial length of
128.

By the way, I found that MAXDOUBLEWIDTH has two inconsistent
definitions in formatting.c(500) and float.c(128). The definition
in new float.h is according to float.c and it seems better to be
untouched and it would be another issue.


At Wed, 27 Sep 2017 17:45:04 +0200, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzz1KuT57vrO-XZk3KSqdpehAupJPLYu7AshuUwRF7MiMg@mail.gmail.com>
> > The patch replace pg_hypot with hypot in libc. The man page says
> > as follows.
...
> I included them on the latest version.

# The commit message of 0001 says that "using C11 hypot()" but it
# is sine C99. I suppose we shold follow C99 at the time so I
# suggest rewrite it in the next version if any.

It seems bettern than expected.  I confirmed that
pg_hypot(DBL_MAX, DBL_MAX) returned a value that is equivalent to
HUGE_VAL * HUGE_VAL on glibc, but I'm not sure the behavior on
other platforms is the same.

======
For other potential reviewers:

I found the origin of the function here.

https://www.postgresql.org/message-id/4A90BD76.7070804@netspace.net.au
https://www.postgresql.org/message-id/AANLkTim4cHELcGPf5w7Zd43_dQi_2RJ_b5_F_idSSbZI%40mail.gmail.com

And the reason for pg_hypot is seen here.

https://www.postgresql.org/message-id/407d949e0908222139t35ad3ad2q3e6b15646a27dd64@mail.gmail.com

I think the replacement is now acceptable according to the discussion.
======

And this should be the last comment of mine on the patch set.

In 0004, line_distance and line_interpt_internal seem
correct. Seemingly horizontal/vertical checks are redundant but
it is because unclearly defined EPSLON bahavior. lseg_perp seems
correct.

close_pl got a bug fix not only refactoring. I think it is
recommended to separate bugs and improvements, but I'm fine with
the current patch.


You added sanity check "A==0 && B==0" (in Ax + By + C) in
line_recv. I'm not sure the necessity of the check since it has
been checked in line_in but anyway the comparisons seem to be
typo(or thinko) of FPzero.

dist_pl is changed to take the smaller distance of both ends of
the segment. It seems absorbing error, so it might be better
taking the mean of the two distances. If you have a firm reason
for the change, it is better to be written there, or it might be
better left alone.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> And this should be the last comment of mine on the patch set.

Thank you.  The new versions of the patches are attached addressing
your comments.

> By the way, I found that MAXDOUBLEWIDTH has two inconsistent
> definitions in formatting.c(500) and float.c(128). The definition
> in new float.h is according to float.c and it seems better to be
> untouched and it would be another issue.

The last version of the patch don't move these declarations to the header file.

> # The commit message of 0001 says that "using C11 hypot()" but it
> # is sine C99. I suppose we shold follow C99 at the time so I
> # suggest rewrite it in the next version if any.

Changed.

> close_pl got a bug fix not only refactoring. I think it is
> recommended to separate bugs and improvements, but I'm fine with
> the current patch.

I split the refactoring to the first patch.

> You added sanity check "A==0 && B==0" (in Ax + By + C) in
> line_recv. I'm not sure the necessity of the check since it has
> been checked in line_in but anyway the comparisons seem to be
> typo(or thinko) of FPzero.

Tom Lane suggested [1] this one.  I now made it use FPzero().

> dist_pl is changed to take the smaller distance of both ends of
> the segment. It seems absorbing error, so it might be better
> taking the mean of the two distances. If you have a firm reason
> for the change, it is better to be written there, or it might be
> better left alone.

I don't really, so I left that part out.

[1] https://www.postgresql.org/message-id/11053.1466362319%40sss.pgh.pa.us

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] [PATCH] Improve geometric types

От
Robert Haas
Дата:
On Mon, Oct 2, 2017 at 4:23 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> For other potential reviewers:
>
> I found the origin of the function here.
>
> https://www.postgresql.org/message-id/4A90BD76.7070804@netspace.net.au
> https://www.postgresql.org/message-id/AANLkTim4cHELcGPf5w7Zd43_dQi_2RJ_b5_F_idSSbZI%40mail.gmail.com
>
> And the reason for pg_hypot is seen here.
>
> https://www.postgresql.org/message-id/407d949e0908222139t35ad3ad2q3e6b15646a27dd64@mail.gmail.com
>
> I think the replacement is now acceptable according to the discussion.
> ======

I think if we're going to do this it should be separated out as its
own patch.  Also, I think someone should explain what the reasoning
behind the change is.  Do we, for example, foresee that the built-in
code might be faster or less likely to overflow?  Because we're
clearly taking a risk -- most trivially, that the BF will break, or
more seriously, that some machines will have versions of this function
that don't actually behave quite the same.

That brings up a related point.  How good is our test case coverage
for hypot(), especially in strange corner cases, like this one
mentioned in pg_hypot()'s comment:
* This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the* case of hypot(inf,nan) results in INF, and
notNAN.
 

I'm potentially willing to commit a patch that just makes the
pg_hypot() -> hypot() change and does nothing else, if there are not
objections to that change, but I want to be sure that we'll know right
away if that turns out to break.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Thanks.

At Mon, 2 Oct 2017 11:46:15 +0200, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzz7wiUnyb1TxCk5GzXt6j7efzGmMgH0gTe8xwKZ=FAF5A@mail.gmail.com>
> > And this should be the last comment of mine on the patch set.
> 
> Thank you.  The new versions of the patches are attached addressing
> your comments.
> 
> > By the way, I found that MAXDOUBLEWIDTH has two inconsistent
> > definitions in formatting.c(500) and float.c(128). The definition
> > in new float.h is according to float.c and it seems better to be
> > untouched and it would be another issue.
> 
> The last version of the patch don't move these declarations to the header file.

Yeah, it is not about the patch itself.

> > # The commit message of 0001 says that "using C11 hypot()" but it
> > # is sine C99. I suppose we shold follow C99 at the time so I
> > # suggest rewrite it in the next version if any.
> 
> Changed.
> 
> > close_pl got a bug fix not only refactoring. I think it is
> > recommended to separate bugs and improvements, but I'm fine with
> > the current patch.
> 
> I split the refactoring to the first patch.
> 
> > You added sanity check "A==0 && B==0" (in Ax + By + C) in
> > line_recv. I'm not sure the necessity of the check since it has
> > been checked in line_in but anyway the comparisons seem to be
> > typo(or thinko) of FPzero.
> 
> Tom Lane suggested [1] this one.  I now made it use FPzero().

I see. It's fine with me. I suppose that Tom didn't intened the
suggestion to be teken literary so using FPzero() seems better
(at least for now).

> > dist_pl is changed to take the smaller distance of both ends of
> > the segment. It seems absorbing error, so it might be better
> > taking the mean of the two distances. If you have a firm reason
> > for the change, it is better to be written there, or it might be
> > better left alone.
> 
> I don't really, so I left that part out.

Mmm, sorry. It's my mistake.

> [1] https://www.postgresql.org/message-id/11053.1466362319%40sss.pgh.pa.us

I'll look the new version further.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Mon, 2 Oct 2017 08:23:49 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoYsgw0TcjJQ1CE_6vDOxgEhxYQkfNx93Mfwx23WOLM0NA@mail.gmail.com>
> On Mon, Oct 2, 2017 at 4:23 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > For other potential reviewers:
> >
> > I found the origin of the function here.
> >
> > https://www.postgresql.org/message-id/4A90BD76.7070804@netspace.net.au
> > https://www.postgresql.org/message-id/AANLkTim4cHELcGPf5w7Zd43_dQi_2RJ_b5_F_idSSbZI%40mail.gmail.com
> >
> > And the reason for pg_hypot is seen here.
> >
> > https://www.postgresql.org/message-id/407d949e0908222139t35ad3ad2q3e6b15646a27dd64@mail.gmail.com
> >
> > I think the replacement is now acceptable according to the discussion.
> > ======
> 
> I think if we're going to do this it should be separated out as its
> own patch.

+1

>             Also, I think someone should explain what the reasoning
> behind the change is.  Do we, for example, foresee that the built-in
> code might be faster or less likely to overflow?  Because we're
> clearly taking a risk -- most trivially, that the BF will break, or
> more seriously, that some machines will have versions of this function
> that don't actually behave quite the same.
> 
> That brings up a related point.  How good is our test case coverage
> for hypot(), especially in strange corner cases, like this one
> mentioned in pg_hypot()'s comment:
> 
>  * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the
>  * case of hypot(inf,nan) results in INF, and not NAN.

I'm not sure how precise we practically need them to be
identical.  FWIW as a rough confirmation on my box, I compared
hypot and pg_hypot for the following arbitrary choosed pairs of
parameters.

{2.2e-308, 2.2e-308},{2.2e-308, 1.7e307},{1.7e307, 1.7e307},{1.7e308, 1.7e308},{2.2e-308, DBL_MAX},{1.7e308,
DBL_MAX},{DBL_MIN,DBL_MAX},{DBL_MAX, DBL_MAX},{1.7e307, INFINITY},{2.2e-308, INFINITY},{0, INFINITY},{DBL_MIN,
INFINITY},{INFINITY,INFINITY},{1, NAN},{INFINITY, NAN},{NAN, NAN},
 


Only the first pair gave slightly not-exactly-equal results but
it seems to do no harm. hypot set underflow flag.
0: hypot=3.111269837220809e-308 (== 0.0 is 0, < DBL_MIN is 0)  pg_hypot=3.11126983722081e-308 (== 0.0 is 0, < DBL_MIN
is0)  equal=0,  hypot(errno:0, inval:0, div0:0, of=0, uf=1),  pg_hypot(errno:0, inval:0, div0:0, of=0, uf=0)
 

But not sure how the both functions behave on other platforms.

> I'm potentially willing to commit a patch that just makes the
> pg_hypot() -> hypot() change and does nothing else, if there are not
> objections to that change, but I want to be sure that we'll know right
> away if that turns out to break.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
#include <stdio.h>
#include <float.h>
#include <math.h>
#include <fenv.h>
#include <errno.h>

double
pg_hypot(double x, double y)
{double        yx;
/* Handle INF and NaN properly */if (isinf(x) || isinf(y))    return (double) INFINITY;
if (isnan(x) || isnan(y))    return (double) NAN;
/* Else, drop any minus signs */x = fabs(x);y = fabs(y);
/* Swap x and y if needed to make x the larger one */if (x < y){    double        temp = x;
    x = y;    y = temp;}
/* * If y is zero, the hypotenuse is x.  This test saves a few cycles in * such cases, but more importantly it also
protectsagainst * divide-by-zero errors, since now x >= y. */if (y == 0.0)    return x;
 
/* Determine the hypotenuse */yx = y / x;return x * sqrt(1.0 + (yx * yx));
}

void
setfeflags(int *invalid, int *divzero, int *overflow, int *underflow)
{int err = fetestexcept(FE_INVALID | FE_DIVBYZERO |                       FE_OVERFLOW | FE_UNDERFLOW);*invalid = ((err
&FE_INVALID) != 0);*divzero = ((err & FE_DIVBYZERO) != 0);*overflow = ((err & FE_OVERFLOW) != 0);*underflow = ((err &
FE_UNDERFLOW)!= 0);
 
}

int
main(void)
{double x;double y;double p[][2] =    {        {2.2e-308, 2.2e-308},        {2.2e-308, 1.7e307},        {1.7e307,
1.7e307},       {1.7e308, 1.7e308},        {2.2e-308, DBL_MAX},        {1.7e308, DBL_MAX},        {DBL_MIN, DBL_MAX},
    {DBL_MAX, DBL_MAX},        {1.7e307, INFINITY},        {2.2e-308, INFINITY},        {0, INFINITY},        {DBL_MIN,
INFINITY},       {INFINITY, INFINITY},        {1, NAN},        {INFINITY, NAN},        {NAN, NAN},        {0.0, 0.0}
};inti;
 

for (i = 0 ; p[i][0] != 0.0 || p[i][1] != 0.0 ; i++){    int errno1, errno2;    int invalid1 = 0, divzero1 = 0,
overflow1= 0, underflow1 = 0;    int invalid2 = 0, divzero2 = 0, overflow2 = 0, underflow2 = 0;    int cmp;
 
    errno = 0;    feclearexcept(FE_ALL_EXCEPT);    x = hypot(p[i][0], p[i][1]);    errno1 = errno;
setfeflags(&invalid1,&divzero1, &overflow1, &underflow1);
 
    errno = 0;    feclearexcept(FE_ALL_EXCEPT);    y = pg_hypot(p[i][0], p[i][1]);    errno2 = errno;
setfeflags(&invalid2,&divzero2, &overflow2, &underflow2);
 
    /* assume NaN == NaN here */    if (isnan(x) && isnan(y))        cmp = 1;    else        cmp = (x == y);
    printf("%d: hypot=%.16lg (== 0.0 is %d, < DBL_MIN is %d)\n  pg_hypot=%.16lg (== 0.0 is %d, < DBL_MIN is %d)\n
equal=%d,hypot(errno:%d, inval:%d, div0:%d, of=%d, uf=%d), pg_hypot(errno:%d, inval:%d, div0:%d, of=%d, uf=%d)\n",
    i, x, x == 0.0, x < DBL_MIN, y, y == 0.0, y < DBL_MIN, cmp,           errno1, invalid1, divzero1, overflow1,
underflow1,          errno2, invalid2, divzero2, overflow2, underflow2);}return 0;
 
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Alvaro Herrera
Дата:
Robert Haas wrote:

> I'm potentially willing to commit a patch that just makes the
> pg_hypot() -> hypot() change and does nothing else, if there are not
> objections to that change, but I want to be sure that we'll know right
> away if that turns out to break.

Uh, I thought pg_hypot() was still needed on our oldest supported
platforms.  Or is hypot() already supported there?  If not, I suppose we
can keep the HYPOT() macro, and have it use hypot() on platforms that
have it and pg_hypot elsewhere?  That particular fraction of 0001 seemed
a bit dubious to me, as the largest possible source of platform
dependency issues.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> I think if we're going to do this it should be separated out as its
> own patch.  Also, I think someone should explain what the reasoning
> behind the change is.  Do we, for example, foresee that the built-in
> code might be faster or less likely to overflow?  Because we're
> clearly taking a risk -- most trivially, that the BF will break, or
> more seriously, that some machines will have versions of this function
> that don't actually behave quite the same.

I included removal of pg_hypot() on my patch simply because the
comment on the function header is suggesting it.  I though if we are
going to clean this module up, we better deal it first.  I understand
the risk.  The patches include more changes.  It may be a good idea to
have those together.

> That brings up a related point.  How good is our test case coverage
> for hypot(), especially in strange corner cases, like this one
> mentioned in pg_hypot()'s comment:
>
>  * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the
>  * case of hypot(inf,nan) results in INF, and not NAN.

I don't see any tests of geometric types with INF or NaN.  Currently,
there isn't consistent behaviour for them.  I don't think we can
easily add portable ones on the current state, but we should be able
to do so with my patches.  I will look into it.

> I'm potentially willing to commit a patch that just makes the
> pg_hypot() -> hypot() change and does nothing else, if there are not
> objections to that change, but I want to be sure that we'll know right
> away if that turns out to break.

I can split this one into another patch.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> Uh, I thought pg_hypot() was still needed on our oldest supported
> platforms.  Or is hypot() already supported there?  If not, I suppose we
> can keep the HYPOT() macro, and have it use hypot() on platforms that
> have it and pg_hypot elsewhere?  That particular fraction of 0001 seemed
> a bit dubious to me, as the largest possible source of platform
> dependency issues.

What is our oldest supported platform?

We can also just keep pg_hypot().  I don't think getting rid of it
justifies much work.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Tom Lane
Дата:
Emre Hasegeli <emre@hasegeli.com> writes:
>> Uh, I thought pg_hypot() was still needed on our oldest supported
>> platforms.  Or is hypot() already supported there?

> What is our oldest supported platform?

Our normal reference for such questions is SUS v2 (POSIX 1997):
http://pubs.opengroup.org/onlinepubs/007908799/

I looked into that, and noted that it does specify hypot(), but
it has different rules for edge conditions:
If the result would cause overflow, HUGE_VAL is returned and errnomay be set to [ERANGE].
If x or y is NaN, NaN is returned. and errno may be set to [EDOM].
If the correct result would cause underflow, 0 is returned anderrno may be set to [ERANGE].

whereas POSIX 2008 saith:
If the correct value would cause overflow, a range error shalloccur and hypot(), hypotf(), and hypotl() shall return
thevalueof the macro HUGE_VAL, HUGE_VALF, and HUGE_VALL, respectively.
 
[MX] If x or y is ±Inf, +Inf shall be returned (even if one of xor y is NaN).
If x or y is NaN, and the other is not ±Inf, a NaN shall bereturned.
[MXX] If both arguments are subnormal and the correct result issubnormal, a range error may occur and the correct
resultshallbe returned.
 

In short, the reason that the existing comment makes a point of the
Inf-and-NaN behavior is that the standard changed somewhere along the
line.  While I believe we could get away with assuming that hypot()
exists everywhere, it's much less clear that we could rely on the result
being Inf and not NaN in this case.

Now, it's also not clear that anything in PG really cares.  But if we
do care, I think we should keep pg_hypot() ... and maybe clarify the
comment a bit more.
        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> Now, it's also not clear that anything in PG really cares.  But if we
> do care, I think we should keep pg_hypot() ... and maybe clarify the
> comment a bit more.

I am not sure how useful NaNs are in geometric types context, but we
allow them, so inconsistent hypot() would be a problem.  I will change
my patches to keep pg_hypot().


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> I am not sure how useful NaNs are in geometric types context, but we
> allow them, so inconsistent hypot() would be a problem.  I will change
> my patches to keep pg_hypot().

New versions of the patches are attached with 2 additional ones.  The
new versions leave pg_hypot() in place.  One of the new patches
improves the test coverage.  The line coverage of geo_ops.c increases
from 55% to 81%.  The other one fixes -0 values to 0 on float
operators.  I am not sure about performance implication of this, so
kept it separate.  It may be a better idea to check this only on the
platforms that has tendency to produce -0.

While working on the tests, I found some unreachable code and removed
it.  I also found that lseg ## lseg operator returning wrong results.
It is defined as "closest point to first segment on the second
segment", but:

> # select '[(1,2),(3,4)]'::lseg ## '[(0,0),(6,6)]'::lseg;
>  ?column?
> ----------
>  (1,2)
> (1 row)

I appended the fix to the patches.  This is also effecting lseg ## box operator.

I also changed recently band-aided point ## lseg operator to return
the point instead of NULL when it cannot find the correct result to
avoid the operators depending on this one to crash.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello, thanks for the new patch.

0004 failed to be applied on the underneath patches.

At Sun, 5 Nov 2017 15:54:19 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzzngpYgrQbJ-2TjzZ+MZBa0D0Xzj8tjjJLv6C3CPARMGw@mail.gmail.com>
> > I am not sure how useful NaNs are in geometric types context, but we
> > allow them, so inconsistent hypot() would be a problem.  I will change
> > my patches to keep pg_hypot().
> 
> New versions of the patches are attached with 2 additional ones.  The
> new versions leave pg_hypot() in place.  One of the new patches
> improves the test coverage.  The line coverage of geo_ops.c increases
> from 55% to 81%.  The other one fixes -0 values to 0 on float
> operators.  I am not sure about performance implication of this, so
> kept it separate.  It may be a better idea to check this only on the
> platforms that has tendency to produce -0.
> 
> While working on the tests, I found some unreachable code and removed
> it.  I also found that lseg ## lseg operator returning wrong results.
> It is defined as "closest point to first segment on the second
> segment", but:
> 
> > # select '[(1,2),(3,4)]'::lseg ## '[(0,0),(6,6)]'::lseg;
> >  ?column?
> > ----------
> >  (1,2)
> > (1 row)
> 
> I appended the fix to the patches.  This is also effecting lseg ## box operator.

Mmm.. It returns (1.5, 1.5) with the 0004 patch. It is surely a
point on the second operand but I'm not sure it's right that the
operator returns a specific point for two parallel segments.

> I also changed recently band-aided point ## lseg operator to return
> the point instead of NULL when it cannot find the correct result to
> avoid the operators depending on this one to crash.

I'd like to put comments on 0001 and 0004 only now:
- Adding [LR]DELIM_L looks good but they should be described in  the comment just above.
- Renaming float8_slope to slope seems no problem.
- I'm not sure the change of box_construct is good but currently  all callers fits the new interface (giving two
points,not  four coordinates).
 
- close_lseg seems forgetting the case where the two given  segments are crossing (and parallel). For example,
  select '(-8,-8),(1,1)'::lseg ## '(-10,0),(2,0)'::lseg;
  is expected to return (0,0), which is the crossing point of  the two segments, but it returns (1,0) (and returned
(1,1) before the patch), which is the point on the l2 closest to the  closer end of l1 to l2.
 
  As mentioned above, it is difficult to dicide what is the  proper result for parallel segments. I suppose that the
followingoperation should return an invalid value as a point.
 
  select '(-1,0),(1,0)'::lseg ## '(-1,1),(1,1)'::lseg;
  close_lseg does the following operations in the else case of  'if (float8_...)'. If i'm not missing something, the
resultof  the following operation is always the closer end of l2. In  other words it seems a waste of cycles.    |
point= DirectFunctionCall2(close_ps,  |                             PointPGetDatum(&l2->p[closer_end2]),  |
               LsegPGetDatum(l1));  | return DirectFunctionCall2(close_ps, point, LsegPGetDatum(l2));
 


- make_bound_box operates directly on the poly->boundbox. I'm afraid that such coding hinders compiers from using
registers.
 This is a bit apart from this patch, it would be better if we could eliminate internal calls using
DirectFunctionCall.

reagrds,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello,

> I'd like to put comments on 0001 and 0004 only now:
...

I don't have a comment on 0002.

About 0003:

>  @@ -4487,21 +4486,21 @@ circle_in(PG_FUNCTION_ARGS)
> ...
>      circle->radius = single_decode(s, &s, "circle", str);
> -    if (circle->radius < 0)
> +    if (float8_lt(circle->radius, 0.0))
>          ereport(ERROR,
>                  (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),

flost8_lt and its family functions are provided to unify the
sorting order including NaN. NaN is not rejected by the usage of
float8_lt in the case but it is what the function is expected to
be used for. If we wanted to check if it is positive, it
unexpectedly throws an exception.  (I suppose that NaNs should be
silently ignored rather than stopping a query by throwng an
exception.)

Addition to that I don't think it proper that applying EPSILON(!)
there. It should be strictly compared regardless whether EPSION
is considered or not.


Similary, circle_overlap for example, float8_le is used to
compare the distance and the summed radius.

NaN causes a problem in another place. 
> PG_RETURN_BOOL(FPle(point_dt(&circle1->center, &circle2->center),
>                     float8_pl(circle1->radius, circle2->radius)));

If the distance was NaN and the summed radius is not, the
function returns true. I think that a reasonable behavior is that
an object containing NaN cannot make any meaningful relationship
with other objects as floating number itself behave so. (Any
comparison other than != with NaN returns always false)

Just using another series of comparison functions that return
false for NaN-containing comparison is not a solution since the
meaning of the returned false differs by context, just same as
the first problem above. For exameple, the fictious functions
below,

| bool circle_overlaps()
|   ret = FPle(distance, radius_sum);

This gives correct results, but

| bool circle_not_overlaps()
|   ret = FPgt(distance, radius_sum);

This gives a wrong result for NaN-containing objects.

Perhaps it is enough to explicitly define behaviors for NaN
before comparison.

circle_overlap()
> distance = point_dt(....);
> radius_sum = float8_pl(...);
>
> /* NaN-containing objects doesn't overlap any other objects */ 
> if (isnan(distance) || isnan(radius_sum))
>      PG_RETURN_BOOL(false);
>
> /* NaN ordering of FPle() doesn't get into mischief here */
> return PG_RETURN_BOOL(FPle(distance, radius_sum));

(End Of the Comment to 0003)

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
>> This is also effecting lseg ## box operator.
>
> Mmm.. It returns (1.5, 1.5) with the 0004 patch. It is surely a
> point on the second operand but I'm not sure it's right that the
> operator returns a specific point for two parallel segments.

I am changing it to return NULL, when they are parallel.

> I'd like to put comments on 0001 and 0004 only now:
>
>  - Adding [LR]DELIM_L looks good but they should be described in
>    the comment just above.

I will mention it on the new version.

>  - I'm not sure the change of box_construct is good but currently
>    all callers fits the new interface (giving two points, not
>    four coordinates).

I tried to make things more consistent.  The other constructors takes points.

>  - close_lseg seems forgetting the case where the two given
>    segments are crossing (and parallel).

I am re-implementing it covering those cases.

> - make_bound_box operates directly on the poly->boundbox. I'm
>   afraid that such coding hinders compiers from using registers.

I am changing it back.

>   This is a bit apart from this patch, it would be better if we
>   could eliminate internal calls using DirectFunctionCall.

We don't seem to be able to fix all issues without doing that.  I will
incorporate the change.

Thank you for your review.  I will address your other email before
posting new versions.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> dist_pl is changed to take the smaller distance of both ends of
> the segment. It seems absorbing error, so it might be better
> taking the mean of the two distances. If you have a firm reason
> for the change, it is better to be written there, or it might be
> better left alone.

I am sorry for not paying enough attention to this before.  The
distance functions are meant to return the minimum distance.  Getting
the maximum is just wrong in here.  Can you think of any particular
error it would absorb?  Otherwise I will put this change back to the
patch for fixes.


Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> flost8_lt and its family functions are provided to unify the
> sorting order including NaN. NaN is not rejected by the usage of
> float8_lt in the case but it is what the function is expected to
> be used for. If we wanted to check if it is positive, it
> unexpectedly throws an exception.  (I suppose that NaNs should be
> silently ignored rather than stopping a query by throwng an
> exception.)

It would at least be dump-and-restore hazard if we don't let them in.
The new version allows NaNs.

> This gives a wrong result for NaN-containing objects.

I removed the NaN aware comparisons from FP macros, and carefully
reviewed the places that needs to be NaN aware.

I am sorry that it took so long for me to post the new versions.  The
more I get into this the more problems I find.  The new versions
include non-trivial changes.  I would be glad if you can look into
them.

Вложения

Re: [HACKERS] [PATCH] Improve geometric types

От
Alvaro Herrera
Дата:

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> Does this patch series fix bug #14971?

No, it doesn't.  I am trying to improve things without touching the EPSILON.


Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello, sorry for the absense.

(Sorry for the long but should be hard-to-read article..)

At Thu, 4 Jan 2018 14:53:47 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzz0hRfC-M8KDV4Aytj+6k6cvoiGqVvEjVyHTbtraf=YBQ@mail.gmail.com>
> Rebased versions are attached.

Thank you for the new patch.

I'm looking 0001 patch. This does many things at once but seems
hard to split into step-by-step patches. So I reviewed this from
the viewpoint that how each of the external function is
changed. (Attatchment 1).

I think that this patch is not intending to change any behaviors
of the external functions, but intending make it mathematically
reasonable. So some behavioral changes are ineviablly
requried. The problem here is what is the most reasonable (and
acceptable) behavior.

The comments below are just the starting of a discussion about
some aspects of design.  I'm sorry that this discussion might be
changing the way to go largily, but I'd like to hear what you
think about two topics.


I found seemingly inconsistent handling of NaN.

- Old box_same assumed NaN != NaN, but new assumes NaN ==
  NaN. I'm not sure how the difference is significant out of the
  context of sorting, though...

- box_overlap()'s behavior has not been changed. As the result
  box_same and box_overlap now behaves differently concerning
  NaN handling.

- line_construct_pts has been changed so that generates
  {NaN,-1,NaN} for two identical points (old code generated
  {-1,0,0}) but I think the right behavior here is erroring out.
  (as line_in behaves.)


I feel that it'd better to define how to handle non-numbers
before refactoring.  I prefer to just denying NaN as a part of
the geometric types, or any input containing NaN just yields a
result filled with NaNs.


The next point is reasonability of the algorithm and consistency
among related functions.

- line_parallel considers the EPSILON(ugaa) with different scale
  from the old code, but both don't seem reasonable.. It might
  not be the time to touch it, but if we changed the behavior,
  I'd like to choose reasonable one. At least the tolerance of
  parallelity should be taken by rotation, not by slope.

I think that one reasonable way to do this is taking the distance
between the far ends of two direction (unit) vectors.
  
Assuming Ax + By + C = 0, the direction vecotr dv(x, y) for the
line is:

    n = sqrt(A^2 + B^2), dv = (B / n, -A / n).

  (and the normal vector nv = (A / n, B / n))

The vector needs to be restricted within upper or any two
successive quadrants so that it is usable for this objective. So
let's restrict A to be negative value for now. Something like the
follwoing would be an answer.

  void line_direction_vector(Point *result, LINE *l)
  {
    float8 A = l->A;
    float8 B = l->B;
    float8 n;

    n = HYPOT(A, B);

    /* keep the result vector within the 1st-2nd quadrants */
    if (A > 0)
    {
       A = -A;
       B = -B;
    }
    point_construct(result, B / n, -A / n);
  }

  bool line_parallel(LINE *l1, LINE *l2)
  {
    Point d1, d2;

    line_direction_vector(&d1, l1);
    line_direction_vector(&d2, l2);
    return FPzero(point_dt(&d1, &d2));
  }
  
Also perpendicularity is detected by comparing the direction
vector of one line and the normal vector of another in the same
manner.

  void line_normal_vector(Point *result, LINE *l)
  {
    float8 A = l->A;
    float8 B = l->B;
    float8 n;

    /* keep the result vector within the 1st-2nd quadrants */
    n = HYPOT(A, B);
    if (B < 0)
    {
       A = -A;
       B = -B;
    }
    point_construct(result, A / n, B / n);
  }

  bool line_perp(LINE *l1, LINE *l2)
  {
    Point d1, d2;

    line_direction_vector(&d1, l1);
    line_normal_vector(&d2, l2);
    return FPzero(point_dt(&d1, &d2));
  }
  
In relation to this, equality of two lines is also nuisance.
From the definitions, two equal lines are parallel. In
contraposition, two nonparallel lines cannot be equal. This
relationship is represented by the following expression:

 is_equal =: line_parallel(l1, l2) && FPzero(line_distance(l1, l2))

But the line_distance returns zero in most cases even if it is
considered "parallel" with considering tolerance. This means that
There's almost no two lines that are parallel but not equal (that
is, the truely parallel lines)... sigh.

So we might should calculate the distance in different (not
purely mathematical/geometrical) way. For example, if we can
assume the geometric objects are placed mainly around the origin,
we could take the distance between the points nearest to origin
on both lines. (In other words, between the foots of
perpendicular lines from the origin onto the both lines). Of
couse this is not ideal but...

The same discussion also affects line_interpt_line or other
similar functions.


At last, just a simple comment.

- point_eq_point() assumes that NaN == NaN. This is an inherited
  behavior from old float[48]_cmp_internal() but it's not a
  common treat. point_eq_point() needs any comment about the
  special definition as float[48]_cmp_internal has.



regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> I found seemingly inconsistent handling of NaN.
>
> - Old box_same assumed NaN != NaN, but new assumes NaN ==
>   NaN. I'm not sure how the difference is significant out of the
>   context of sorting, though...

There is no box != box operator.  If there was one, previous code
would return false for every input including NaN, because it was
ignorant about NaNs other than a few bandaids to avoid crashes.

My idea is to make the equality (same) operators behave like the float
datatypes treating NaNs as equals.  The float datatypes also treat
NaNs as greater than any non-NAN.  We don't need to worry about this
part now, because there are no basic comparison operators defined for
the geometric types.

> - box_overlap()'s behavior has not been changed. As the result
>   box_same and box_overlap now behaves differently concerning
>   NaN handling.

After your previous email, I though this would be the right thing to
do.  I made the containment and placement operators return false for
NaN input unless we can find the result using non-NaN coordinates.  Do
you find this behaviour reasonable?

> - line_construct_pts has been changed so that generates
>   {NaN,-1,NaN} for two identical points (old code generated
>   {-1,0,0}) but I think the right behavior here is erroring out.
>   (as line_in behaves.)

I agree.  We should also check for the endpoints being the same on
lseg_interpt functions.  I will make the changes on the next version.

> I feel that it'd better to define how to handle non-numbers
> before refactoring.  I prefer to just denying NaN as a part of
> the geometric types, or any input containing NaN just yields a
> result filled with NaNs.

It would be nice to deny NaNs altogether, but I don't think it is
feasible.  People cannot restore their existing data if we start doing
that.  It would also require us to check any result we emit being NaN
and error out in more cases.

> The next point is reasonability of the algorithm and consistency
> among related functions.
>
> - line_parallel considers the EPSILON(ugaa) with different scale
>   from the old code, but both don't seem reasonable.. It might
>   not be the time to touch it, but if we changed the behavior,
>   I'd like to choose reasonable one. At least the tolerance of
>   parallelity should be taken by rotation, not by slope.

I think you are right.  Vector based algorithms would be good for many
other operations as well.  This would be more changes, though.  I am
try to reduce the amount of changes to increase my chances to get this
committed.  I just used slope() there to increase the code reuse and
to make the operation symmetrical.  I can revert it back to its
previous state, if you thing it is better.

> So we might should calculate the distance in different (not
> purely mathematical/geometrical) way. For example, if we can
> assume the geometric objects are placed mainly around the origin,
> we could take the distance between the points nearest to origin
> on both lines. (In other words, between the foots of
> perpendicular lines from the origin onto the both lines). Of
> couse this is not ideal but...

The EPSILON is unfair to coordinates closer to the origin.  I am not
sure what it the best way to improve this.  I am trying to avoid
dealing with EPSILON related issues in the scope of these changes.

> At last, just a simple comment.
>
> - point_eq_point() assumes that NaN == NaN. This is an inherited
>   behavior from old float[48]_cmp_internal() but it's not a
>   common treat. point_eq_point() needs any comment about the
>   special definition as float[48]_cmp_internal has.

I hope I answered this on the previous questions.  I will incorporate
the decision to threat NaNs as equals into the comments and the commit
messages on the next version, if we agree on it.


Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello,

I'm still wandering on the way and confused. Sorry for
inconsistent comments in advanceX-(

At Sun, 14 Jan 2018 13:20:57 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzyY3U4n1Zn48qG6dL=FWgv29yag5PdGV2Gc17w9Toeaog@mail.gmail.com>
> > I found seemingly inconsistent handling of NaN.
> >
> > - Old box_same assumed NaN != NaN, but new assumes NaN ==
> >   NaN. I'm not sure how the difference is significant out of the
> >   context of sorting, though...
> 
> There is no box != box operator.  If there was one, previous code
> would return false for every input including NaN, because it was
> ignorant about NaNs other than a few bandaids to avoid crashes.
> 
> My idea is to make the equality (same) operators behave like the float
> datatypes treating NaNs as equals.  The float datatypes also treat
> NaNs as greater than any non-NAN.  We don't need to worry about this
> part now, because there are no basic comparison operators defined for
> the geometric types.

I'm not sure what you mean by the "basic comparison ops"  but I'm
fine with the policy, handling each component values in the same
way with float. So point_eq_point() is right and other functions
should follow the same policy.

> > - box_overlap()'s behavior has not been changed. As the result
> >   box_same and box_overlap now behaves differently concerning
> >   NaN handling.
> 
> After your previous email, I though this would be the right thing to
> do.  I made the containment and placement operators return false for
> NaN input unless we can find the result using non-NaN coordinates.  Do
> you find this behaviour reasonable?

Sorry for going back and force, but I don't see a difference
between it and the original behavior from the point of view of
reasonability. Isn't is enough to let each component comparison
follow float by redefining FPxx() functions?

For example, define FPle as the follwoing:

static inline bool FP8le(float8 a, float8 b)
{
    return float8_le(a, float8_pl(b, EPSILON));
}

Then define box_ov using this function:

static bool box_ov(BOX *box1, BOX *box2)
{
  return (FP8le(box1->low.x, box2->high.x) &&....);
}

Another example, define FPeq as:

static inline bool FP8eq(float8 a, float8 b)
{
    if (isnan(a))
    {
      if (isnan(b))
        return true;
      return false;
    }
    else if (isnan(b))
      return false;

    if (isinf(a) && isinf(b))
      return true;

    return abs(a - b) <= EPSILON;
}

Then redefine point_eq_point as:

statinc inline bool
point_eq_point(Point *pt1, Point *pt2)
{
    return FP8eq(pt1->x, pt2->x) && FP8eq(pt1->y, pt2->y);
}

...

At least this is in a kind of consistency and the behavior is
inferable (in a certain sense).

> > - line_construct_pts has been changed so that generates
> >   {NaN,-1,NaN} for two identical points (old code generated
> >   {-1,0,0}) but I think the right behavior here is erroring out.
> >   (as line_in behaves.)
> 
> I agree.  We should also check for the endpoints being the same on
> lseg_interpt functions.  I will make the changes on the next version.
> 
> > I feel that it'd better to define how to handle non-numbers
> > before refactoring.  I prefer to just denying NaN as a part of
> > the geometric types, or any input containing NaN just yields a
> > result filled with NaNs.
> 
> It would be nice to deny NaNs altogether, but I don't think it is
> feasible.  People cannot restore their existing data if we start doing
> that.  It would also require us to check any result we emit being NaN
> and error out in more cases.

It sounds right.

> > The next point is reasonability of the algorithm and consistency
> > among related functions.
> >
> > - line_parallel considers the EPSILON(ugaa) with different scale
> >   from the old code, but both don't seem reasonable.. It might
> >   not be the time to touch it, but if we changed the behavior,
> >   I'd like to choose reasonable one. At least the tolerance of
> >   parallelity should be taken by rotation, not by slope.
> 
> I think you are right.  Vector based algorithms would be good for many
> other operations as well.  This would be more changes, though.  I am
> try to reduce the amount of changes to increase my chances to get this
> committed.  I just used slope() there to increase the code reuse and
> to make the operation symmetrical.  I can revert it back to its
> previous state, if you thing it is better.

I'm fine with both using slope with the same scale (that is,
keeping the previous behavior) or going into vector based. But
the latter raises a problem as below:(

> > So we might should calculate the distance in different (not
> > purely mathematical/geometrical) way. For example, if we can
> > assume the geometric objects are placed mainly around the origin,
> > we could take the distance between the points nearest to origin
> > on both lines. (In other words, between the foots of
> > perpendicular lines from the origin onto the both lines). Of
> > couse this is not ideal but...
> 
> The EPSILON is unfair to coordinates closer to the origin.  I am not
> sure what it the best way to improve this.  I am trying to avoid
> dealing with EPSILON related issues in the scope of these changes.

Right. And I don't have a good idea here..

> > At last, just a simple comment.
> >
> > - point_eq_point() assumes that NaN == NaN. This is an inherited
> >   behavior from old float[48]_cmp_internal() but it's not a
> >   common treat. point_eq_point() needs any comment about the
> >   special definition as float[48]_cmp_internal has.
> 
> I hope I answered this on the previous questions.  I will incorporate
> the decision to threat NaNs as equals into the comments and the commit
> messages on the next version, if we agree on it.

Perhas it's fine for me.



regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> I'm not sure what you mean by the "basic comparison ops"  but I'm
> fine with the policy, handling each component values in the same
> way with float. So point_eq_point() is right and other functions
> should follow the same policy.

I mean <, >, <= and >= by basic comparison operators.  Operators with
those symbols are available for some geometric types, but they are
comparing the sizes of the objects.  Currently only the equality
operators follow the same policy with point_eq_point (), others never
return true when NaNs are involved.

> Sorry for going back and force, but I don't see a difference
> between it and the original behavior from the point of view of
> reasonability. Isn't is enough to let each component comparison
> follow float by redefining FPxx() functions?

My previous patch was doing exactly that.  I though that is not what
we want to do.  Do we want box_overlap() to return true when NaNs are
involved?


Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello,

I played more around geometric types and recalled that geometric
types don't have orderings. Also have corrected some other
misunderstanding. Sorry for my confusion and I think I returned
on the way.

At Wed, 17 Jan 2018 18:59:30 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzzgJ7B-HdmxuSDoqu-_x1nEeoEA45is2hP8ex4r3KNH8Q@mail.gmail.com>
> > I'm not sure what you mean by the "basic comparison ops"  but I'm
> > fine with the policy, handling each component values in the same
> > way with float. So point_eq_point() is right and other functions
> > should follow the same policy.
> 
> I mean <, >, <= and >= by basic comparison operators.  Operators with
> those symbols are available for some geometric types, but they are
> comparing the sizes of the objects.  Currently only the equality
> operators follow the same policy with point_eq_point (), others never
> return true when NaNs are involved.
> 
> > Sorry for going back and force, but I don't see a difference
> > between it and the original behavior from the point of view of
> > reasonability. Isn't is enough to let each component comparison
> > follow float by redefining FPxx() functions?
> 
> My previous patch was doing exactly that.  I though that is not what
> we want to do.  Do we want box_overlap() to return true when NaNs are
> involved?

All comparison operators don't need consideration on
ordering. And the existing comparison operators behaves
diferrently from float and already behaves in the way of bare
float. There's no behavior as the whole of an object. I have
fixed my understanding as this. (And I saw all patches altoghter
by mistake..) As the result most my concern was pointless.


0001:

- You removed the check of parallelity check of
  line_interpt_line(old line_interpt_internal) but line_parallel
  is not changed so the consistency between the two functions are
  lost. It is better to be another patch file (maybe 0004?).

- +    Assert(p1->npts > 0 && p2->npts > 0);

  Other path_ functions are allowing path with no points so just
  returning false if (p1->npts == 0 || p2->npts == 0) seems
  enough. Assertion should be used only for something server
  cannot continue working. (division by zero doesn't crash
  server) I saw other similar assertions in the patch.


0002: I have no comment so far on this patch.

0003: 

 close_pl/ps/lseg/pb/ls/sb have changed to return null when
 lseg_closept_line returns NAN, but they are defined as strict
 and that is reasonable. Anyway pg_hypot returns NaN only when
 parameters contain INF or NAN so we should error out when it
 returns nan.

 circle_in rejects negative radius and circle_recv modived to
 reject zero and negative. What is the reason for the change?

 I understand that we don't need to consider NAN is equal to NAN.
 circle_same, point_eq_point, line_eq no longer needs such
 change?

 Assertion is added in pg_hypot but it is impossible for result
 to be negative there. Why do you added it?

0004:
 
 Looking line_recv's change, I became to think that FPxx macros
 is provided for coordinate values. I don't think it is applied
 to non-coordinate values like coefficients. If some kind of
 tolerance is needed here, it is not the same with the EPSILON.

 +    if (FPzero(line->A) && FPzero(line->B))
 +        ereport(ERROR,

 So, the above should be exact comparison with 0.0 and line_in
 also should be the same. And maybe the same thing should be done
 at many places..

 But the following line of line_parallel still requires some kind
 of tolerance to work properly. Since this patch is an
 imoprovement patch, I think we can change it to the vector method.

 The problem of line_distance is an existing one so we can leave
 it alone. It returns 0.0 for the most cases but it is a
 long-standing behavior.. (Anyway I don't find a reasonable
 definition of the distance between very-nearly parallel lines..)


-- Sorry time's up today.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> 0001:
>
> - You removed the check of parallelity check of
>   line_interpt_line(old line_interpt_internal) but line_parallel
>   is not changed so the consistency between the two functions are
>   lost. It is better to be another patch file (maybe 0004?).

I am making line_parallel() use line_interpt_line().  What they do is
exactly the same.

> - +     Assert(p1->npts > 0 && p2->npts > 0);
>
>   Other path_ functions are allowing path with no points so just
>   returning false if (p1->npts == 0 || p2->npts == 0) seems
>   enough. Assertion should be used only for something server
>   cannot continue working. (division by zero doesn't crash
>   server) I saw other similar assertions in the patch.

path_in() and path_recv() reject paths with no points.  I thought we
shouldn't spend cycles for things that cannot happen.  I can revert
this part if you find previous practice better.

> 0003:
>
>  close_pl/ps/lseg/pb/ls/sb have changed to return null when
>  lseg_closept_line returns NAN, but they are defined as strict
>  and that is reasonable. Anyway pg_hypot returns NaN only when
>  parameters contain INF or NAN so we should error out when it
>  returns nan.

I though strict is only related to the input being NULL.  Tom Lane
made close_ps() return NULL with commit
278148907a971ec7fa4ffb24248103d8012155d2.  The other functions
currently fail with elog()s from DirectFunctionCalls.  I don't have
strong preference for NULL or an error.  Could you please suggest an
errcode and errmsg, if you have?

>  circle_in rejects negative radius and circle_recv modived to
>  reject zero and negative. What is the reason for the change?

Overlooking.  Thank you for noticing.  I am fixing it.

>  I understand that we don't need to consider NAN is equal to NAN.
>  circle_same, point_eq_point, line_eq no longer needs such
>  change?

I though it would be better to consider NaNs equal only on the
equality operators.  That is why only they are changed that way.

>  Assertion is added in pg_hypot but it is impossible for result
>  to be negative there. Why do you added it?

True.  I am removing it.

> 0004:
>
>  Looking line_recv's change, I became to think that FPxx macros
>  is provided for coordinate values. I don't think it is applied
>  to non-coordinate values like coefficients. If some kind of
>  tolerance is needed here, it is not the same with the EPSILON.
>
>  +    if (FPzero(line->A) && FPzero(line->B))
>  +        ereport(ERROR,
>
>  So, the above should be exact comparison with 0.0 and line_in
>  also should be the same. And maybe the same thing should be done
>  at many places..

I agree you.  EPSILON is currently applied prematurely.  I am trying
to stay away from EPSION related issues to improve the chances to get
this part committed.

>  But the following line of line_parallel still requires some kind
>  of tolerance to work properly. Since this patch is an
>  imoprovement patch, I think we can change it to the vector method.

I am making line_parallel() use line_interpt_line() in response to
your first remark.  Vector based algorithm is probably better for
every use of line_interpt_line(), but it is a bigger change with more
user visible effects.

>  The problem of line_distance is an existing one so we can leave
>  it alone. It returns 0.0 for the most cases but it is a
>  long-standing behavior.. (Anyway I don't find a reasonable
>  definition of the distance between very-nearly parallel lines..)

Exact comparison with 0.0 instead of FPzero() would also be an
improvement for line_distance(), but I am not doing it now.

> -- Sorry time's up today.

I am waiting for the rest of your review to post the new versions.


Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello,

At Thu, 18 Jan 2018 16:01:01 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzwgcan3w3=-3oHa4Efif=0yvoErn-e_V5KJLUi9pd8ivQ@mail.gmail.com>
> > 0001:
> >
> > - You removed the check of parallelity check of
> >   line_interpt_line(old line_interpt_internal) but line_parallel
> >   is not changed so the consistency between the two functions are
> >   lost. It is better to be another patch file (maybe 0004?).
> 
> I am making line_parallel() use line_interpt_line().  What they do is
> exactly the same.

Thanks.

> > - +     Assert(p1->npts > 0 && p2->npts > 0);
> >
> >   Other path_ functions are allowing path with no points so just
> >   returning false if (p1->npts == 0 || p2->npts == 0) seems
> >   enough. Assertion should be used only for something server
> >   cannot continue working. (division by zero doesn't crash
> >   server) I saw other similar assertions in the patch.
> 
> path_in() and path_recv() reject paths with no points.  I thought we
> shouldn't spend cycles for things that cannot happen.  I can revert
> this part if you find previous practice better.

I'm fine with the current shape. Thanks. Maybe the same
discussion applies to polygons. (cf. poly_overlap)

> > 0003:
> >
> >  close_pl/ps/lseg/pb/ls/sb have changed to return null when
> >  lseg_closept_line returns NAN, but they are defined as strict
> >  and that is reasonable. Anyway pg_hypot returns NaN only when
> >  parameters contain INF or NAN so we should error out when it
> >  returns nan.
> 
> I though strict is only related to the input being NULL.  Tom Lane

Oops. Yes. you're rigtht.

> made close_ps() return NULL with commit
> 278148907a971ec7fa4ffb24248103d8012155d2.  The other functions

Thank you for the pointer. By the way,

https://www.postgresql.org/message-id/1919.1468269494%40sss.pgh.pa.us

| > Is it reasonable to disallow NaN coordinates on the next major
| > release.  Are there any reason to deal with them?
| 
| I don't see how we can do that; what would you do about tables already
| containing NaNs?  Even without that consideration, assuming that there's
| no way a NaN could creep in seems a pretty fragile assumption, considering
| that operations like Infinity/Infinity will produce one.

Ok, it is convicing. The policy is don't do anything that is not
harmful to server. Currently close_* are *strict* so what we
should do is avoid returning '(anything *)NULL' as a result.

> currently fail with elog()s from DirectFunctionCalls.  I don't have
> strong preference for NULL or an error.  Could you please suggest an
> errcode and errmsg, if you have?

=# select close_sb('((nan,0),(1,1))'::lseg, '((-1,-1),(2,2))'::box);
ERROR:  function 0x8fe61c returned NULL

Recosdering on it and I came to think that such usage of SQL null
is the same to "invalid" objects I mentioned upthread. But we
don't actively check component NaNs but if we happen to see an
invalid result, return SQL null instead.

In this criteria close_* functions looks good that return SQL
null.

0003:

 line_closept_point asserts line_interpt_line returns true but it
 is hazardous. It is safer if line_closept_point returns NaN if
 line_interpt_line returns false.

 All other functions looks good in the criteria.

> >  I understand that we don't need to consider NAN is equal to NAN.
> >  circle_same, point_eq_point, line_eq no longer needs such
> >  change?
> 
> I though it would be better to consider NaNs equal only on the
> equality operators.  That is why only they are changed that way.

I'm convinced by that.

> > 0004:
> >
> >  Looking line_recv's change, I became to think that FPxx macros
> >  is provided for coordinate values. I don't think it is applied
> >  to non-coordinate values like coefficients. If some kind of
> >  tolerance is needed here, it is not the same with the EPSILON.
> >
> >  +    if (FPzero(line->A) && FPzero(line->B))
> >  +        ereport(ERROR,
> >
> >  So, the above should be exact comparison with 0.0 and line_in
> >  also should be the same. And maybe the same thing should be done
> >  at many places..
> 
> I agree you.  EPSILON is currently applied prematurely.  I am trying
> to stay away from EPSION related issues to improve the chances to get
> this part committed.

Agreed.

> >  But the following line of line_parallel still requires some kind
> >  of tolerance to work properly. Since this patch is an
> >  imoprovement patch, I think we can change it to the vector method.
> 
> I am making line_parallel() use line_interpt_line() in response to
> your first remark.  Vector based algorithm is probably better for
> every use of line_interpt_line(), but it is a bigger change with more
> user visible effects.

I'm fine with that.

> >  The problem of line_distance is an existing one so we can leave
> >  it alone. It returns 0.0 for the most cases but it is a
> >  long-standing behavior.. (Anyway I don't find a reasonable
> >  definition of the distance between very-nearly parallel lines..)
> 
> Exact comparison with 0.0 instead of FPzero() would also be an
> improvement for line_distance(), but I am not doing it now.


reagrds,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> I'm fine with the current shape. Thanks. Maybe the same
> discussion applies to polygons. (cf. poly_overlap)

It indeed does.  I am incorporating.

>  line_closept_point asserts line_interpt_line returns true but it
>  is hazardous. It is safer if line_closept_point returns NaN if
>  line_interpt_line returns false.

Yes, it makes more sense.  I am changing it.

>> >  I understand that we don't need to consider NAN is equal to NAN.
>> >  circle_same, point_eq_point, line_eq no longer needs such
>> >  change?
>>
>> I though it would be better to consider NaNs equal only on the
>> equality operators.  That is why only they are changed that way.
>
> I'm convinced by that.

Great.  I am documenting this decision better on the patches.


Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:

Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Sun, 21 Jan 2018 21:59:19 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzxDYs5tcvc4uErsWaFTb3UTYS0ERt_fFyi-28Ldvs5d4A@mail.gmail.com>
> New versions are attached including all changes we discussed.

Thanks for the new version.

# there's many changes from the previous version..

About 0001 and 0002.

1."COPT=-DGEODEBUG make" complains as follows.

  | geo_ops.c:2445:62: error: invalid type argument of unary ‘*’ (have ‘float8 {aka double}’)
  |   printf("dist_ppoly_internal- segment 0/n distance is %f\n", *result);

2. line_construct_pm has been renamed to line_construct. I
   noticed that the patch adds the second block for "(m == 0.0)"
   (from the ealier versions) but it seems to work exactly as the
   same to the "else" block. We need a comment about the reason
   for the seemingly redundant second block.

3. point_sl can return -0.0 and that is a thing that this patch
   intends to avoid. line_invsl has the same problem.

4. lseg_interpt_line is doing as follows.

   >  if (FPeq(lseg->p[0].x, interpt.x) && FPeq(lseg->p[0].y, interpt.y))
   >    *result = lseg->p[0];
   >  else if (FPeq(lseg->p[1].x, interpt.x) && FPeq(lseg->p[1].y, interpt.y))
   >    *result = lseg->p[1];

   I suppose we can use point_pt_point for this purpose.

   >  if (point_eq_point(&lseg->p[0], &interpt))
   >    *result = lseg->p[0];
   >  else if (point_eq_point(&lseg->p[1], &interpt))
   >    *result = lseg->p[1];

   However I'm not sure that adjusting the intersection to the
   tips of the segment is good or not. Adjusting onto the line
   can be better in another case. lseg_interpt_lseg, for
   instance, checks lseg_contain_point on the line parameter of
   lseg_interpt_line.

# I'll be back later..

regards

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Wed, 31 Jan 2018 13:09:09 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20180131.130909.210233873.horiguchi.kyotaro@lab.ntt.co.jp>
> At Sun, 21 Jan 2018 21:59:19 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzxDYs5tcvc4uErsWaFTb3UTYS0ERt_fFyi-28Ldvs5d4A@mail.gmail.com>
> > New versions are attached including all changes we discussed.
> 
> Thanks for the new version.
> 
> # there's many changes from the previous version..
> 
> About 0001 and 0002.
> 
> 1."COPT=-DGEODEBUG make" complains as follows.
> 
>   | geo_ops.c:2445:62: error: invalid type argument of unary ‘*’ (have ‘float8 {aka double}’)
>   |   printf("dist_ppoly_internal- segment 0/n distance is %f\n", *result);
> 
> 2. line_construct_pm has been renamed to line_construct. I
>    noticed that the patch adds the second block for "(m == 0.0)"
>    (from the ealier versions) but it seems to work exactly as the
>    same to the "else" block. We need a comment about the reason
>    for the seemingly redundant second block.
> 
> 3. point_sl can return -0.0 and that is a thing that this patch
>    intends to avoid. line_invsl has the same problem.
> 
> 4. lseg_interpt_line is doing as follows.
> 
>    >  if (FPeq(lseg->p[0].x, interpt.x) && FPeq(lseg->p[0].y, interpt.y))
>    >    *result = lseg->p[0];
>    >  else if (FPeq(lseg->p[1].x, interpt.x) && FPeq(lseg->p[1].y, interpt.y))
>    >    *result = lseg->p[1];
> 
>    I suppose we can use point_pt_point for this purpose.
> 
>    >  if (point_eq_point(&lseg->p[0], &interpt))
>    >    *result = lseg->p[0];
>    >  else if (point_eq_point(&lseg->p[1], &interpt))
>    >    *result = lseg->p[1];
> 
>    However I'm not sure that adjusting the intersection to the
>    tips of the segment is good or not. Adjusting onto the line
>    can be better in another case. lseg_interpt_lseg, for
>    instance, checks lseg_contain_point on the line parameter of
>    lseg_interpt_line.
> 

> # I'll be back later..

I've been back.

0003: This patch replaces "double" with float and bare arithmetic
    and comparisons on double to special functions accompanied
    with checking against abnormal values.

 - Almost all of them are eliminated with a few safe exceptions.

    - circle_recv(), circle_distance(), dist_pc(), dist_cpoint()
      are using "< 0.0" comparison but it looks fine.

    - pg_hypot is right to use bare arithmetics.

    ! circle_contain_pt() does the following comparison and it
      seems to be out of our current policy.

      point_dt(center, point) <= radius

      I suppose this should use FPle.

      FPle(point_dt(center, point), radius)

      The same is true of circle_contain_pt(), pt_contained_circle .



# Sorry, that' all for today..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> 1."COPT=-DGEODEBUG make" complains as follows.
>
>   | geo_ops.c:2445:62: error: invalid type argument of unary ‘*’ (have ‘float8 {aka double}’)
>   |   printf("dist_ppoly_internal- segment 0/n distance is %f\n", *result);

Fixing.

> 2. line_construct_pm has been renamed to line_construct. I
>    noticed that the patch adds the second block for "(m == 0.0)"
>    (from the ealier versions) but it seems to work exactly as the
>    same to the "else" block. We need a comment about the reason
>    for the seemingly redundant second block.

It is indeed redundant.  I am removing it.

> 3. point_sl can return -0.0 and that is a thing that this patch
>    intends to avoid. line_invsl has the same problem.

The existing version of the function has the same issue.  I think we
need a global solution for -0s.  See the float-zero patch.

> 4. lseg_interpt_line is doing as follows.
>
>    >  if (FPeq(lseg->p[0].x, interpt.x) && FPeq(lseg->p[0].y, interpt.y))
>    >    *result = lseg->p[0];
>    >  else if (FPeq(lseg->p[1].x, interpt.x) && FPeq(lseg->p[1].y, interpt.y))
>    >    *result = lseg->p[1];
>
>    I suppose we can use point_pt_point for this purpose.

Yes, I am changing it.

>    >  if (point_eq_point(&lseg->p[0], &interpt))
>    >    *result = lseg->p[0];
>    >  else if (point_eq_point(&lseg->p[1], &interpt))
>    >    *result = lseg->p[1];
>
>    However I'm not sure that adjusting the intersection to the
>    tips of the segment is good or not. Adjusting onto the line
>    can be better in another case. lseg_interpt_lseg, for
>    instance, checks lseg_contain_point on the line parameter of
>    lseg_interpt_line.

Me neither, but it is probably better than returning a point that
extends the endpoints of the segment.  I am inclined to leave it alone
for now.


Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
>     ! circle_contain_pt() does the following comparison and it
>       seems to be out of our current policy.
>
>       point_dt(center, point) <= radius
>
>       I suppose this should use FPle.
>
>       FPle(point_dt(center, point), radius)
>
>       The same is true of circle_contain_pt(), pt_contained_circle .

box_contain_point() also doesn't use the macros.  They are certainly
inconsistent, but I don't think it would be an improvement to make
them use the macros.  As we have discussed, there are many problems
with the current application of EPSILON.  I think we would be better
off not using the macros for none of the containment operators, but
this is out of my scope for now.

> # Sorry, that' all for today..

I am waiting the rest of your review to post the new versions.


Re: [HACKERS] [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello,

At Wed, 31 Jan 2018 17:33:42 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20180131.173342.26333067.horiguchi.kyotaro@lab.ntt.co.jp>
> 0003: This patch replaces "double" with float and bare arithmetic
>     and comparisons on double to special functions accompanied
>     with checking against abnormal values.
> 
>  - Almost all of them are eliminated with a few safe exceptions.
> 
>     - circle_recv(), circle_distance(), dist_pc(), dist_cpoint()
>       are using "< 0.0" comparison but it looks fine.
> 
>     - pg_hypot is right to use bare arithmetics.
> 
>     ! circle_contain_pt() does the following comparison and it
>       seems to be out of our current policy.
> 
>       point_dt(center, point) <= radius
> 
>       I suppose this should use FPle.
> 
>       FPle(point_dt(center, point), radius)
> 
>       The same is true of circle_contain_pt(), pt_contained_circle .

 - line_eq looks too complex in the normal (not containing NANs)
   cases.  We should avoid such complexity if possible.

   One problem here is that comparison conceals NANness of
   operands. Conversely arithmetics propagate it. We can converge
   NANness into a number. The attached line_eq() doesn that. This
   doesn't have almost no additional complexity when NAN is
   involved. I believe it qbehaves in the same way
   and shares a doubious behavior like this.

   =# select '{nan, 1, nan}'::line = '{nan, 2, nan}'::line;
    ?column? 
   ----------
    t

   But probably no point in fixing(?) it.

   The attached file contains line_eq, point_eq_point and
   circle_same. I expect that line_eq is fast but other two are
   doubious.

0004:

 - line_perp

   We can detect perpendicularity without division.

   The normal vecotor of Ax + Bx + C = 0 is (A, B). If two lines
   are perpendicular, the inner product of the normal vectors of
   v1 and v2 is 0. No point in dividing.

   l1->A * l2->A + l1->B * l2->B == 0

   . . . Mmm.. The function seems broken. I posted the fix for
   the existing version is posted, and line_perp() in the attched
   file will work fine.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
>  - line_eq looks too complex in the normal (not containing NANs)
>    cases.  We should avoid such complexity if possible.
>
>    One problem here is that comparison conceals NANness of
>    operands. Conversely arithmetics propagate it. We can converge
>    NANness into a number. The attached line_eq() doesn that. This
>    doesn't have almost no additional complexity when NAN is
>    involved. I believe it qbehaves in the same way
>    and shares a doubious behavior like this.
>
>    =# select '{nan, 1, nan}'::line = '{nan, 2, nan}'::line;
>     ?column?
>    ----------
>     t
>
>    But probably no point in fixing(?) it.

I think we should fix it.

>    The attached file contains line_eq, point_eq_point and
>    circle_same. I expect that line_eq is fast but other two are
>    doubious.

I haven't got an attachment.

>    . . . Mmm.. The function seems broken. I posted the fix for
>    the existing version is posted, and line_perp() in the attched
>    file will work fine.

I am incorporating the fix you have posted to the other thread to this patch.


Re: [HACKERS] [PATCH] Improve geometric types

От
Andres Freund
Дата:
Hi,

On 2018-02-07 16:46:38 +0100, Emre Hasegeli wrote:
> I am incorporating the fix you have posted to the other thread to this patch.

You've not posted a version fixing the issues in the surrounding thread,
do I see that right?

I'm a bit confused how this patch has gone through several revisions
since, but has been marked as "ready for committer" since 2017-09-05. Am
I missing something?

Greetings,

Andres Freund


Re: [PATCH] Improve geometric types

От
Aleksander Alekseev
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            not tested

Unfortunately according to http://commitfest.cputube.org/ this patch doesn't apply anymore.

The new status of this patch is: Waiting on Author

Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:

Re: [HACKERS] [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> I'm a bit confused how this patch has gone through several revisions
> since, but has been marked as "ready for committer" since 2017-09-05. Am
> I missing something?

This is floating between commitfests for longer than a year.
Aleksander Alekseev set it to "ready for committer", but Kyotaro
HORIGUCHI continues his review.  I hope not many issues have remained.


Re: [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Hello, sorry to be late.

At Fri, 2 Mar 2018 15:26:25 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzzj6wgGeBY1vdAHBzzQDSDs-8NhD+=eTcgBKEsCBWpUXg@mail.gmail.com>
> > Unfortunately according to http://commitfest.cputube.org/ this patch doesn't apply anymore.
> New versions are attached.

Thank you for the revised version. This seems fine for me so
marked this as "Ready for Committer".

- This applies cleanly on the master head and regression passes.

- The new behavior looks sane (except for the EPSILON, which is
  out of the scope of this patch).

- Test is complete as long as I can see. At least far more
  completely filled than the previous state. Some test items
  might seem a bit big, but it seems to be needed to raise
  coverage on required combinaions of dimension values.

By the way I think that line_perp is back patched, could you
propose it for the older versions? (I already marked my entry as
Rejected)


Brief summary follows (almost same to the header of patch files):

- 0001 looks fine.
  > * Eliminate SQL level function calls
  > * Re-use more functions to implement others
  > * Unify internal function names and signatures
  > * Remove private functions from geo_decls.h
  > * Replace not-possible checks with Assert()
  > * Add comments to describe some functions
  > * Remove some unreachable code
  > * Define delimiter symbols of line datatype like the other ones
  > * Remove debugging print()s from the refactored functions

  And one bug fix.

- 0002 looks fine.

  Refactors adt/float.c and utils/float.h
    making float checking *macros* into inline functions.
    making float comparison operators more efficiently.

  others are the consequence of the above change.
  and fixes NaN problem of GiST.

- 0003 looks fine.

  just changes the usage of double to float8 as more proper type.
  uses operator functions instead of bare arithmetics to handle
  arithmetic errors more properly.

- 0004 looks fine. (Sorry for overlooking that this treats bugs)

  all overlooked failure cases seems to be fixed.
  
- 0005 looks fine.

  this unifies +-0.0 to +0.0 for the convenient of later processing.

- 0006 It seems cover the all types of operations.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> Thank you for the revised version. This seems fine for me so
> marked this as "Ready for Committer".

Thank you for bearing with me for longer than year.

> By the way I think that line_perp is back patched, could you
> propose it for the older versions? (I already marked my entry as
> Rejected)

Do you mean back-patching?

> Brief summary follows (almost same to the header of patch files):
>
> - 0001 looks fine.
>   > * Eliminate SQL level function calls
>   > * Re-use more functions to implement others
>   > * Unify internal function names and signatures
>   > * Remove private functions from geo_decls.h
>   > * Replace not-possible checks with Assert()
>   > * Add comments to describe some functions
>   > * Remove some unreachable code
>   > * Define delimiter symbols of line datatype like the other ones
>   > * Remove debugging print()s from the refactored functions
>
>   And one bug fix.

What is that one?  Should I incorporate it into the commit message?


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:

Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
Hi Emre,

Thanks for the rebased patch. I remember reviewing the patch in the last 
CF, and it seems in a pretty good shape. I plan to look at it again in 
the next commitfest, but it seems to have been reviewed by other 
experienced people so I'm not worried about this part.

The main remaining question I have is what do do with back-branches. 
Shall we back-patch this or not?

The trouble is that while the patch is essentially a bugfix, it 
refactors quite significant amount of code to make the fixes practical. 
If it was possible to back-patch just the fixes without the refactoring, 
that would be ideal, but unfortunately that's not the case. Based on 
discussion with Emre in Ottawa that would be rather impractical due to 
the nature of the bugs and low code reuse.

I do believe we should back-patch - after all, it fixes real bugs. It's 
true the bugs were there for years and no one noticed/reported them, but 
it's still buggy and that's not fun.

Opinions?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> The main remaining question I have is what do do with back-branches. 
> Shall we back-patch this or not?

Given the behavioral changes involved, I'd say "no way".  That's
reinforced by the lack of field complaints; if there were lots of
complaints, maybe we'd be willing to break backwards compatibility,
but ...

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 06/03/2018 11:50 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> The main remaining question I have is what do do with back-branches.
>> Shall we back-patch this or not?
> 
> Given the behavioral changes involved, I'd say "no way".  That's
> reinforced by the lack of field complaints; if there were lots of
> complaints, maybe we'd be willing to break backwards compatibility,
> but ...
> 

Fair enough, I tend to over-estimate importance of bugfixes and 
under-estimate breakage due to behavior change. But if we don't want to 
back-patch this, I'm fine with that. I was a bit worried about making 
future backpatches more painful, but this code received only ~20 commits 
over the past files, half of that due tot pgindent, so that seems to be 
a non-issue.

But now I'm wondering what does this mean for existing indexes? Doesn't 
this effectively mean those are unlikely to give meaningful responses 
(in the old or new semantics)?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> But now I'm wondering what does this mean for existing indexes? Doesn't this
> effectively mean those are unlikely to give meaningful responses (in the old
> or new semantics)?

The patches shouldn't cause any change to the indexable operators.
The fixes are mostly around the lines and the line segments which
doesn't have index support.


Re: [PATCH] Improve geometric types

От
Thomas Munro
Дата:
On Sun, Jun 3, 2018 at 12:58 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
> Rebased versions are attached.

Hi Emre,

This produces build errors on Windows[1][2]:

  C:\projects\postgresql\src\include\utils/float.h(136): warning
C4013: '_fpclass' undefined; assuming extern returning int
[C:\projects\postgresql\postgres.vcxproj]
  C:\projects\postgresql\src\include\utils/float.h(297): warning
C4013: '_isnan' undefined; assuming extern returning int
[C:\projects\postgresql\postgres.vcxproj]
  C:\projects\postgresql\src\include\utils/float.h(136): error C2065:
'_FPCLASS_PINF' : undeclared identifier
[C:\projects\postgresql\postgres.vcxproj]
  C:\projects\postgresql\src\include\utils/float.h(136): error C2065:
'_FPCLASS_NINF' : undeclared identifier
[C:\projects\postgresql\postgres.vcxproj]
2778

This is apparently coming from the expansion of the following macros
from src/include/port/win32_port.h:

#if (_MSC_VER < 1800)
#define isinf(x) ((_fpclass(x) == _FPCLASS_PINF) || (_fpclass(x) ==
_FPCLASS_NINF))
#define isnan(x) _isnan(x)
#endif

Those underscore-prefixed names are defined in Microsoft's
<float.h>[3][4].  So now I'm wondering if win32_port.h needs to
#include <float.h> if (_MSC_VER < 1800).

[1] http://cfbot.cputube.org/emre-hasegeli.html
[2] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.1022
[3] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fpclass-fpclassf
[4] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/isnan-isnan-isnanf

-- 
Thomas Munro
http://www.enterprisedb.com


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> Those underscore-prefixed names are defined in Microsoft's
> <float.h>[3][4].  So now I'm wondering if win32_port.h needs to
> #include <float.h> if (_MSC_VER < 1800).

I don't have the C experience to decide the correct way.  There are
currently many .c files that are including float.h conditionally or
unconditionally.  The condition they use is "#ifdef _MSC_VER" without
a version.

One idea is to include float.h from the new utils/float.h file
together with math.h, and remove those includes from the .c files
which would include utils/float.h.  We can do this only, or together
with what you suggest, or by also keeping the includes on the .c
files.  Which way do you think is the proper?


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 06/05/2018 06:32 PM, Emre Hasegeli wrote:
>> Those underscore-prefixed names are defined in Microsoft's
>> <float.h>[3][4].  So now I'm wondering if win32_port.h needs to
>> #include <float.h> if (_MSC_VER < 1800).
> 
> I don't have the C experience to decide the correct way.  There are
> currently many .c files that are including float.h conditionally or
> unconditionally.  The condition they use is "#ifdef _MSC_VER" without
> a version.
> 
> One idea is to include float.h from the new utils/float.h file
> together with math.h, and remove those includes from the .c files
> which would include utils/float.h.  We can do this only, or together
> with what you suggest, or by also keeping the includes on the .c
> files.  Which way do you think is the proper?
> 

Do we have any solution to the float.h include issues on Windows? I
don't have any Windows box at hand so I can't verify it, but just using
"#ifdef _MSC_VER" seems OK to me (and it's used elsewhere). Thomas, why
do you think the version number restriction is needed here? I don't see
the version mentioned in the MS docs you linked either.

Once this gets resolved, I'd like to get this committed ... so if you
have other objections, please speak now.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Thomas Munro
Дата:
On Tue, Jul 10, 2018 at 7:21 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 06/05/2018 06:32 PM, Emre Hasegeli wrote:
>>> Those underscore-prefixed names are defined in Microsoft's
>>> <float.h>[3][4].  So now I'm wondering if win32_port.h needs to
>>> #include <float.h> if (_MSC_VER < 1800).
>>
>> I don't have the C experience to decide the correct way.  There are
>> currently many .c files that are including float.h conditionally or
>> unconditionally.  The condition they use is "#ifdef _MSC_VER" without
>> a version.
>>
>> One idea is to include float.h from the new utils/float.h file
>> together with math.h, and remove those includes from the .c files
>> which would include utils/float.h.  We can do this only, or together
>> with what you suggest, or by also keeping the includes on the .c
>> files.  Which way do you think is the proper?
>>
>
> Do we have any solution to the float.h include issues on Windows? I
> don't have any Windows box at hand so I can't verify it, but just using
> "#ifdef _MSC_VER" seems OK to me (and it's used elsewhere). Thomas, why
> do you think the version number restriction is needed here? I don't see
> the version mentioned in the MS docs you linked either.

The version number restriction isn't strictly needed.  I only
suggested it because it'd match the #if that wraps the code that's
actually using those macros, introduced by commit cec8394b5ccd.  That
was presumably done because versions >= 1800 (= Visual Studio 2013)
have their own definitions of isinf() and isnan(), and I guess that
our definitions were probably breaking stuff on that compiler.

> Once this gets resolved, I'd like to get this committed ... so if you
> have other objections, please speak now.

+1, no objections.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> The version number restriction isn't strictly needed.  I only
> suggested it because it'd match the #if that wraps the code that's
> actually using those macros, introduced by commit cec8394b5ccd.  That
> was presumably done because versions >= 1800 (= Visual Studio 2013)
> have their own definitions of isinf() and isnan(), and I guess that
> our definitions were probably breaking stuff on that compiler.

Now I understand what you mean.  win32_port.h defines isnan(x) as
_isnan(x) if (_MSC_VER < 1800).  It doesn't look right to have the
definition in here but not include <float.h> as _isnan() is coming
from there.  I am preparing an additional patch to add the include and
remove it from files where it is obviously put to work around this
problem.


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> Now I understand what you mean.  win32_port.h defines isnan(x) as
> _isnan(x) if (_MSC_VER < 1800).  It doesn't look right to have the
> definition in here but not include <float.h> as _isnan() is coming
> from there.  I am preparing an additional patch to add the include and
> remove it from files where it is obviously put to work around this
> problem.

I posted this to another thread.  Until this is sorted out I made the
new float header patch include <float.h>, so they are not dependent.
New versions are attached.

Вложения

Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
New versions are attached after the <float.h> patch got in.  I noticed
tests failing on Windows [1] and added alternative .out file.

[1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235

Вложения

Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 07/11/2018 07:13 PM, Emre Hasegeli wrote:
> New versions are attached after the <float.h> patch got in.  I noticed
> tests failing on Windows [1] and added alternative .out file.
> 
> [1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235
> 

The version posted about two weeks ago is slightly broken - AFAICS the 
float.h includes in geo_ops.c and gistproc.c need to be part of 0002, 
not 0003. Attached is a version fixing that.

Barring objections, I'll get this committed over the next few days, once 
I review all the individual parts once more. I'm planning to commit the 
6 parts separately, as submitted. No backpatching, as discussed before.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
Thank you for taking this.

At Thu, 26 Jul 2018 17:12:50 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<672f4c42-6742-c1ec-e9a4-1994b815acc7@2ndquadrant.com>
> On 07/11/2018 07:13 PM, Emre Hasegeli wrote:
> > New versions are attached after the <float.h> patch got in.  I noticed
> > tests failing on Windows [1] and added alternative .out file.
> > [1]
> > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235
> > 
> 
> The version posted about two weeks ago is slightly broken - AFAICS the
> float.h includes in geo_ops.c and gistproc.c need to be part of 0002,
> not 0003. Attached is a version fixing that.
> 
> Barring objections, I'll get this committed over the next few days,
> once I review all the individual parts once more. I'm planning to
> commit the 6 parts separately, as submitted. No backpatching, as
> discussed before.

+1 for no backpatching, and not merging into single big patch.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 07/27/2018 10:20 AM, Kyotaro HORIGUCHI wrote:
> Thank you for taking this.
> 
> At Thu, 26 Jul 2018 17:12:50 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<672f4c42-6742-c1ec-e9a4-1994b815acc7@2ndquadrant.com>
>> On 07/11/2018 07:13 PM, Emre Hasegeli wrote:
>>> New versions are attached after the <float.h> patch got in.  I noticed
>>> tests failing on Windows [1] and added alternative .out file.
>>> [1]
>>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235
>>>
>>
>> The version posted about two weeks ago is slightly broken - AFAICS the
>> float.h includes in geo_ops.c and gistproc.c need to be part of 0002,
>> not 0003. Attached is a version fixing that.
>>
>> Barring objections, I'll get this committed over the next few days,
>> once I review all the individual parts once more. I'm planning to
>> commit the 6 parts separately, as submitted. No backpatching, as
>> discussed before.
> 
> +1 for no backpatching, and not merging into single big patch.
> 

I've committed the first two parts, after a review and testing.

As these two parts were primarily refactoring (and quite extensive),
this seems like a good place to wait if the buildfarm is happy with it.
If yes, I'll continue applying the patches that do fix/change the
behavior in various ways.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 07/29/2018 03:54 AM, Tomas Vondra wrote:
> 
> 
> On 07/27/2018 10:20 AM, Kyotaro HORIGUCHI wrote:
>> Thank you for taking this.
>>
>> At Thu, 26 Jul 2018 17:12:50 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<672f4c42-6742-c1ec-e9a4-1994b815acc7@2ndquadrant.com>
>>> On 07/11/2018 07:13 PM, Emre Hasegeli wrote:
>>>> New versions are attached after the <float.h> patch got in.  I noticed
>>>> tests failing on Windows [1] and added alternative .out file.
>>>> [1]
>>>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235
>>>>
>>>
>>> The version posted about two weeks ago is slightly broken - AFAICS the
>>> float.h includes in geo_ops.c and gistproc.c need to be part of 0002,
>>> not 0003. Attached is a version fixing that.
>>>
>>> Barring objections, I'll get this committed over the next few days,
>>> once I review all the individual parts once more. I'm planning to
>>> commit the 6 parts separately, as submitted. No backpatching, as
>>> discussed before.
>>
>> +1 for no backpatching, and not merging into single big patch.
>>
> 
> I've committed the first two parts, after a review and testing.
> 
> As these two parts were primarily refactoring (and quite extensive),
> this seems like a good place to wait if the buildfarm is happy with it.
> If yes, I'll continue applying the patches that do fix/change the
> behavior in various ways.
> 

Seems there's a couple of failures like this, so far:

*** 42,48 ****
                        s
  ---------------------------------------------
   {1,-1,1}
!  {1,-1,0}
   {-0.4,-1,-6}
   {-0.000184615384615385,-1,15.3846153846154}
   {1,-1,11}
--- 42,48 ----
                        s
  ---------------------------------------------
   {1,-1,1}
!  {1,-1,-0}
   {-0.4,-1,-6}
   {-0.000184615384615385,-1,15.3846153846154}
   {1,-1,11}
***************

It's always 0/-0 difference, and it's limited to power machines. I'll
try to get access to such system and see what's wrong.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Thomas Munro
Дата:
On Sun, Jul 29, 2018 at 10:35 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> It's always 0/-0 difference, and it's limited to power machines. I'll
> try to get access to such system and see what's wrong.

This is suspicious:

        /* on some platforms, the preceding expression tends to produce -0 */
        if (line->C == 0.0)
            line->C = 0.0;

FWIW I tried this on our Linux/POWER8 box but it didn't show the problem.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: [PATCH] Improve geometric types

От
Thomas Munro
Дата:
On Sun, Jul 29, 2018 at 10:57 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Sun, Jul 29, 2018 at 10:35 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> It's always 0/-0 difference, and it's limited to power machines. I'll
>> try to get access to such system and see what's wrong.
>
> This is suspicious:
>
>         /* on some platforms, the preceding expression tends to produce -0 */
>         if (line->C == 0.0)
>             line->C = 0.0;

I mean, it's suspiciously absent from the new line_construct()
function.  It was introduced here:

commit 43fe90f66a0b200f6c32507428349afb45f661ca
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Fri Oct 25 15:55:15 2013 -0400

    Suppress -0 in the C field of lines computed by line_construct_pts().

    It's not entirely clear why some PPC machines are generating -0 here, since
    the underlying computation should be exactly 0 - 0.  Perhaps there's some
    wider-than-nominal-precision calculations happening?  Anyway, the best way
    to avoid platform-dependent results seems to be to explicitly reset -0 to
    regular zero.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 07/29/2018 01:28 PM, Thomas Munro wrote:
> On Sun, Jul 29, 2018 at 10:57 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Sun, Jul 29, 2018 at 10:35 PM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>> It's always 0/-0 difference, and it's limited to power machines. I'll
>>> try to get access to such system and see what's wrong.
>>
>> This is suspicious:
>>
>>         /* on some platforms, the preceding expression tends to produce -0 */
>>         if (line->C == 0.0)
>>             line->C = 0.0;
> 
> I mean, it's suspiciously absent from the new line_construct()
> function.  It was introduced here:
> 
> commit 43fe90f66a0b200f6c32507428349afb45f661ca
> Author: Tom Lane <tgl@sss.pgh.pa.us>
> Date:   Fri Oct 25 15:55:15 2013 -0400
> 
>     Suppress -0 in the C field of lines computed by line_construct_pts().
> 
>     It's not entirely clear why some PPC machines are generating -0 here, since
>     the underlying computation should be exactly 0 - 0.  Perhaps there's some
>     wider-than-nominal-precision calculations happening?  Anyway, the best way
>     to avoid platform-dependent results seems to be to explicitly reset -0 to
>     regular zero.
> 

Hmm, I see. I think adding it to the else branch should do the trick,
then, I guess. But I'd be much happier if I could test it somewhere
before the commit.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Jeff Janes
Дата:


On Sat, Jul 28, 2018 at 9:54 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:


I've committed the first two parts, after a review and testing.


I'm getting a compiler warning now:

geo_ops.c: In function 'line_closept_point':
geo_ops.c:2528:7: warning: variable 'retval' set but not used [-Wunused-but-set-variable]
  bool  retval;
 

gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)

Cheers,

Jeff
 
As these two parts were primarily refactoring (and quite extensive),
this seems like a good place to wait if the buildfarm is happy with it.
If yes, I'll continue applying the patches that do fix/change the
behavior in various ways.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 07/29/2018 04:31 PM, Jeff Janes wrote:
> 
> 
> On Sat, Jul 28, 2018 at 9:54 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> 
> 
> 
>     I've committed the first two parts, after a review and testing.
> 
> 
> I'm getting a compiler warning now:
> 
> geo_ops.c: In function 'line_closept_point':
> geo_ops.c:2528:7: warning: variable 'retval' set but not used
> [-Wunused-but-set-variable]
>   bool  retval;
>  

Yeah, the variable is apparently only used in an assert. Will fix.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 07/29/2018 02:03 PM, Tomas Vondra wrote:
> 
> 
> On 07/29/2018 01:28 PM, Thomas Munro wrote:
>> On Sun, Jul 29, 2018 at 10:57 PM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>> On Sun, Jul 29, 2018 at 10:35 PM, Tomas Vondra
>>> <tomas.vondra@2ndquadrant.com> wrote:
>>>> It's always 0/-0 difference, and it's limited to power machines. I'll
>>>> try to get access to such system and see what's wrong.
>>>
>>> This is suspicious:
>>>
>>>         /* on some platforms, the preceding expression tends to produce -0 */
>>>         if (line->C == 0.0)
>>>             line->C = 0.0;
>>
>> I mean, it's suspiciously absent from the new line_construct()
>> function.  It was introduced here:
>>
>> commit 43fe90f66a0b200f6c32507428349afb45f661ca
>> Author: Tom Lane <tgl@sss.pgh.pa.us>
>> Date:   Fri Oct 25 15:55:15 2013 -0400
>>
>>     Suppress -0 in the C field of lines computed by line_construct_pts().
>>
>>     It's not entirely clear why some PPC machines are generating -0 here, since
>>     the underlying computation should be exactly 0 - 0.  Perhaps there's some
>>     wider-than-nominal-precision calculations happening?  Anyway, the best way
>>     to avoid platform-dependent results seems to be to explicitly reset -0 to
>>     regular zero.
>>
> 
> Hmm, I see. I think adding it to the else branch should do the trick,
> then, I guess. But I'd be much happier if I could test it somewhere
> before the commit.
> 

FWIW I think this should fix it. Can someone with access to an affected
machine confirm?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 07/29/2018 05:14 PM, Tomas Vondra wrote:
> On 07/29/2018 02:03 PM, Tomas Vondra wrote:
>>
>> ...
>>
>> Hmm, I see. I think adding it to the else branch should do the trick,
>> then, I guess. But I'd be much happier if I could test it somewhere
>> before the commit.
>>
> 
> FWIW I think this should fix it. Can someone with access to an affected
> machine confirm?
> 

Gah, shouldn't have posted before trying to compile it. Here is a fixed
version of the fix.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 07/29/2018 05:14 PM, Tomas Vondra wrote:
>> FWIW I think this should fix it. Can someone with access to an affected
>> machine confirm?

> Gah, shouldn't have posted before trying to compile it. Here is a fixed
> version of the fix.

Sure, I'll try this on prairiedog.  It's slow though ...

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
I wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 07/29/2018 05:14 PM, Tomas Vondra wrote:
>>> FWIW I think this should fix it. Can someone with access to an affected
>>> machine confirm?

>> Gah, shouldn't have posted before trying to compile it. Here is a fixed
>> version of the fix.

> Sure, I'll try this on prairiedog.  It's slow though ...

Yup, this fixes the core regression tests on that machine.
I was too lazy to try contrib.

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 07/29/2018 08:02 PM, Tom Lane wrote:
> I wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> On 07/29/2018 05:14 PM, Tomas Vondra wrote:
>>>> FWIW I think this should fix it. Can someone with access to an affected
>>>> machine confirm?
> 
>>> Gah, shouldn't have posted before trying to compile it. Here is a fixed
>>> version of the fix.
> 
>> Sure, I'll try this on prairiedog.  It's slow though ...
> 
> Yup, this fixes the core regression tests on that machine.
> I was too lazy to try contrib.
> 

OK, thanks for confirming. I'll get it committed and we'll see what the
animals think soon.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 07/29/2018 05:11 PM, Tomas Vondra wrote:
> 
> 
> On 07/29/2018 04:31 PM, Jeff Janes wrote:
>>
>>
>> On Sat, Jul 28, 2018 at 9:54 PM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>>
>>
>>
>>     I've committed the first two parts, after a review and testing.
>>
>>
>> I'm getting a compiler warning now:
>>
>> geo_ops.c: In function 'line_closept_point':
>> geo_ops.c:2528:7: warning: variable 'retval' set but not used
>> [-Wunused-but-set-variable]
>>   bool  retval;
>>  
> 
> Yeah, the variable is apparently only used in an assert. Will fix.
> 

This should fix it I guess, and it's how we deal with unused return
values elsewhere. I've considered using USE_ASSERT_CHECKING here, but it
seems rather ugly with that. I'll wait for Emre's opinion ...

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> This should fix it I guess, and it's how we deal with unused return
> values elsewhere. I've considered using USE_ASSERT_CHECKING here, but it
> seems rather ugly with that. I'll wait for Emre's opinion ...

I think what you want is to mark the variable with
PG_USED_FOR_ASSERTS_ONLY.

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 07/29/2018 10:57 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> This should fix it I guess, and it's how we deal with unused return
>> values elsewhere. I've considered using USE_ASSERT_CHECKING here, but it
>> seems rather ugly with that. I'll wait for Emre's opinion ...
> 
> I think what you want is to mark the variable with
> PG_USED_FOR_ASSERTS_ONLY.
> 

Oh, good idea. I don't think I've ever used that macro and I've
completely forgotten about it. Committed that way.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> This should fix it I guess, and it's how we deal with unused return
> values elsewhere. I've considered using USE_ASSERT_CHECKING here, but it
> seems rather ugly with that. I'll wait for Emre's opinion ...

Assert() is the wrong thing to do in here.  Drawn-perpendicular lines
may not intersect because of precision loss.  We have to check it and
return NULL.  There a few of those that we crash, or return garbage,
or get NULL and fail in DirectFunctionCall()s.  The next patch
"line-fixes" fixes them.


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> OK, thanks for confirming. I'll get it committed and we'll see what the
> animals think soon.

Thank you for fixing this.  I wanted to preserve this code but wasn't
sure about the correct place or whether it is still necessary.

There are more places we produce -0.  The regression tests have
alternative results to cover them.  I have the "float-zero" patch for
this.  Although I am not sure if it is a correct fix.  I think we
should find the correct fix, and apply it globally to floating point
operations.  This can be only enabled for platforms which produce -0,
so the others don't have to pay the price.


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 07/30/2018 11:57 AM, Emre Hasegeli wrote:
>> OK, thanks for confirming. I'll get it committed and we'll see what the
>> animals think soon.
> 
> Thank you for fixing this.  I wanted to preserve this code but wasn't
> sure about the correct place or whether it is still necessary.
> 
> There are more places we produce -0.  The regression tests have
> alternative results to cover them.  I have the "float-zero" patch for
> this.  Although I am not sure if it is a correct fix.  I think we
> should find the correct fix, and apply it globally to floating point
> operations.  This can be only enabled for platforms which produce -0,
> so the others don't have to pay the price.
> 

Hmmm. It'll be difficult to review such patch without access to a 
platform exhibiting such behavior ... IIRC IBM offers free access to 
open-source devs, I wonder if that would be a way.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
Hi Emre,

Now that the buildfarm is no longer complaining about 0001 and 0002, I'm 
working on reviewing and committing 0003. It seems quite straightforward 
but I do have couple of comment/questions:

1) common_entry_cmp is still handling 'delta' as double, although the 
CommonEntry was modified to use float8. IMHO it should also simply call 
float8_cmp_internal instead of doing comparisons.

2) gist_box_picksplit does this

    int     m = ceil(LIMIT_RATIO * (float8) nentries);

instead of

    int     m = ceil(LIMIT_RATIO * (double) nentries);

which seems rather unnecessary, considering the only point of the cast 
was probably to do more accurate multiplication. And it seems pointless 
to cast it to float8 but then not use float8_mul().

3) computeDistance does this:

     if (point->y > box->high.y)
         result = float8_mi(point->y, box->high.y);
     else if (point->y < box->low.y)
         result = float8_mi(box->low.y, point->y);

which seems suspicious. Shouldn't the comparisons be done by float8_lt 
and float8_gt too? That's what we do elsewhere.

4) I think we should just get rid of GEODEBUG entirely. The preceding 
patches removes about 20 out of 27 occurrences anyway, so let's ditch 
the remaining few.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> 1) common_entry_cmp is still handling 'delta' as double, although the
> CommonEntry was modified to use float8. IMHO it should also simply call
> float8_cmp_internal instead of doing comparisons.

I am changing it to define delta as "float8" and to use float8_cmp_internal().

> 2) gist_box_picksplit does this
>
>    int     m = ceil(LIMIT_RATIO * (float8) nentries);
>
> instead of
>
>    int     m = ceil(LIMIT_RATIO * (double) nentries);
>
> which seems rather unnecessary, considering the only point of the cast was
> probably to do more accurate multiplication. And it seems pointless to cast
> it to float8 but then not use float8_mul().

I am removing the cast.

> 3) computeDistance does this:
>
>     if (point->y > box->high.y)
>         result = float8_mi(point->y, box->high.y);
>     else if (point->y < box->low.y)
>         result = float8_mi(box->low.y, point->y);
>
> which seems suspicious. Shouldn't the comparisons be done by float8_lt and
> float8_gt too? That's what we do elsewhere.

I assumed the GiST code already handles NaNs correctly and tried not
to change its behavior.  It may be a good idea to revert existing NaN
handling in favour of using the inline functions every time.  Should I
do that?

> 4) I think we should just get rid of GEODEBUG entirely. The preceding
> patches removes about 20 out of 27 occurrences anyway, so let's ditch the
> remaining few.

I agree.  Shall I append it to this patch?


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 07/31/2018 05:14 PM, Emre Hasegeli wrote:
>> 1) common_entry_cmp is still handling 'delta' as double, although the
>> CommonEntry was modified to use float8. IMHO it should also simply call
>> float8_cmp_internal instead of doing comparisons.
> 
> I am changing it to define delta as "float8" and to use float8_cmp_internal().
> 
>> 2) gist_box_picksplit does this
>>
>>     int     m = ceil(LIMIT_RATIO * (float8) nentries);
>>
>> instead of
>>
>>     int     m = ceil(LIMIT_RATIO * (double) nentries);
>>
>> which seems rather unnecessary, considering the only point of the cast was
>> probably to do more accurate multiplication. And it seems pointless to cast
>> it to float8 but then not use float8_mul().
> 
> I am removing the cast.
> 
>> 3) computeDistance does this:
>>
>>      if (point->y > box->high.y)
>>          result = float8_mi(point->y, box->high.y);
>>      else if (point->y < box->low.y)
>>          result = float8_mi(box->low.y, point->y);
>>
>> which seems suspicious. Shouldn't the comparisons be done by float8_lt and
>> float8_gt too? That's what we do elsewhere.
> 
> I assumed the GiST code already handles NaNs correctly and tried not
> to change its behavior.  It may be a good idea to revert existing NaN
> handling in favour of using the inline functions every time.  Should I
> do that?

Ah, so there's an assumption that NaNs are handled earlier and never 
reach this place? That's probably a safe assumption. I haven't thought 
about that, it simply seemed suspicious that the code mixes direct 
comparisons and float8_mi() calls.

> 
>> 4) I think we should just get rid of GEODEBUG entirely. The preceding
>> patches removes about 20 out of 27 occurrences anyway, so let's ditch the
>> remaining few.
> 
> I agree.  Shall I append it to this patch?
> 

Not sure, I'll leave that up to you. I don't mind doing it in a separate 
patch (I'd probably prefer that over mixing it into unrelated patch).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> Ah, so there's an assumption that NaNs are handled earlier and never reach
> this place? That's probably a safe assumption. I haven't thought about that,
> it simply seemed suspicious that the code mixes direct comparisons and
> float8_mi() calls.

The comparison functions handle NaNs.  The arithmetic functions handle
returning error on underflow, overflow and division by zero.  I
assumed we want to return error on those in any case, but we don't
want to handle NaNs at every place.

> Not sure, I'll leave that up to you. I don't mind doing it in a separate
> patch (I'd probably prefer that over mixing it into unrelated patch).

It is attached separately.

Вложения

Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> Hmmm. It'll be difficult to review such patch without access to a platform
> exhibiting such behavior ... IIRC IBM offers free access to open-source
> devs, I wonder if that would be a way.

I don't have access to such platform either, and I don't know too much
about this business.  I left this patch out for now.  It is something
that should affect all float operations not only the geometric types
anyway.  Anybody who knows and cares about these platforms can pick
this up.

We can fix -0 issues in localized way, like you have done, if there
would be any more.


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 08/01/2018 01:40 PM, Emre Hasegeli wrote:
>> Ah, so there's an assumption that NaNs are handled earlier and never reach
>> this place? That's probably a safe assumption. I haven't thought about that,
>> it simply seemed suspicious that the code mixes direct comparisons and
>> float8_mi() calls.
> 
> The comparison functions handle NaNs.  The arithmetic functions handle
> returning error on underflow, overflow and division by zero.  I
> assumed we want to return error on those in any case, but we don't
> want to handle NaNs at every place.
> 
>> Not sure, I'll leave that up to you. I don't mind doing it in a separate
>> patch (I'd probably prefer that over mixing it into unrelated patch).
> 
> It is attached separately.
> 

OK, thanks.

So, have we reached conclusion about all the bits I mentioned on 7/31? 
The delta and float8/double cast are fixed, and for computeDistance 
(i.e. doing comparisons directly or using float8_lt), the code may seem 
a bit inconsistent, but it is in fact correct as the NaNs are handled 
elsewhere. That seems reasonable, but perhaps a comment pointing that 
out would be nice.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Thu, 2 Aug 2018 11:50:55 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<ce3cf95a-4751-c168-54ae-636c486e06cd@2ndquadrant.com>
> 
> 
> On 08/01/2018 01:40 PM, Emre Hasegeli wrote:
> >> Ah, so there's an assumption that NaNs are handled earlier and never
> >> reach
> >> this place? That's probably a safe assumption. I haven't thought about
> >> that,
> >> it simply seemed suspicious that the code mixes direct comparisons and
> >> float8_mi() calls.
> > The comparison functions handle NaNs.  The arithmetic functions handle
> > returning error on underflow, overflow and division by zero.  I
> > assumed we want to return error on those in any case, but we don't
> > want to handle NaNs at every place.
> > 
> >> Not sure, I'll leave that up to you. I don't mind doing it in a
> >> separate
> >> patch (I'd probably prefer that over mixing it into unrelated patch).
> > It is attached separately.
> > 
> 
> OK, thanks.
> 
> So, have we reached conclusion about all the bits I mentioned on 7/31?
> The delta and float8/double cast are fixed, and for computeDistance
> (i.e. doing comparisons directly or using float8_lt), the code may
> seem a bit inconsistent, but it is in fact correct as the NaNs are
> handled elsewhere. That seems reasonable, but perhaps a comment
> pointing that out would be nice.

I'm not confident on replacing double to float8 partially in gist
code. After the 0002 patch applied, I see most of problematic
usage of double or bare arithmetic on dimentional values in
gistproc.c.

> static inline float
> non_negative(float val)
> {
>     if (val >= 0.0f)
>         return val;
>     else
>         return 0.0f;
> }

It is used as "non_negative(overlap)", where overlap is float4,
which is calculated using float8_mi.  Float4 makes sense only if
we need to store a large number of it to somewhere but they are
just working varialbles. Couldn't we eliminate float4 that
doesn't have a requirement to do so?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Fri, 03 Aug 2018 13:38:40 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20180803.133840.180843182.horiguchi.kyotaro@lab.ntt.co.jp>
> At Thu, 2 Aug 2018 11:50:55 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<ce3cf95a-4751-c168-54ae-636c486e06cd@2ndquadrant.com>
> > 
> > 
> > On 08/01/2018 01:40 PM, Emre Hasegeli wrote:
> > >> Ah, so there's an assumption that NaNs are handled earlier and never
> > >> reach
> > >> this place? That's probably a safe assumption. I haven't thought about
> > >> that,
> > >> it simply seemed suspicious that the code mixes direct comparisons and
> > >> float8_mi() calls.
> > > The comparison functions handle NaNs.  The arithmetic functions handle
> > > returning error on underflow, overflow and division by zero.  I
> > > assumed we want to return error on those in any case, but we don't
> > > want to handle NaNs at every place.
> > > 
> > >> Not sure, I'll leave that up to you. I don't mind doing it in a
> > >> separate
> > >> patch (I'd probably prefer that over mixing it into unrelated patch).
> > > It is attached separately.
> > > 
> > 
> > OK, thanks.
> > 
> > So, have we reached conclusion about all the bits I mentioned on 7/31?
> > The delta and float8/double cast are fixed, and for computeDistance
> > (i.e. doing comparisons directly or using float8_lt), the code may
> > seem a bit inconsistent, but it is in fact correct as the NaNs are
> > handled elsewhere. That seems reasonable, but perhaps a comment
> > pointing that out would be nice.
> 
> I'm not confident on replacing double to float8 partially in gist
> code. After the 0002 patch applied, I see most of problematic
> usage of double or bare arithmetic on dimentional values in
> gistproc.c.
> 
> > static inline float
> > non_negative(float val)
> > {
> >     if (val >= 0.0f)
> >         return val;
> >     else
> >         return 0.0f;
> > }
> 
> This takes float(4) and it is used as "non_negative(overlap)",
> where overlap is float4, which is calculated using float8_mi.
> Float4 makes sense if we store a large number of values somewhere
> but they are just working varialbles. Couldn't we eliminate
> float(4) whthout any specifc requirement?

I'm fine with the 0002-geo-float-v14 except the above.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 08/03/2018 06:40 AM, Kyotaro HORIGUCHI wrote:
> ...
> 
> I'm not confident on replacing double to float8 partially in gist
> code. After the 0002 patch applied, I see most of problematic
> usage of double or bare arithmetic on dimentional values in
> gistproc.c.
> 
>> static inline float
>> non_negative(float val)
>> {
>>     if (val >= 0.0f)
>>         return val;
>>     else
>>         return 0.0f;
>> }
> 
> It is used as "non_negative(overlap)", where overlap is float4,
> which is calculated using float8_mi.  Float4 makes sense only if
> we need to store a large number of it to somewhere but they are
> just working varialbles. Couldn't we eliminate float4 that
> doesn't have a requirement to do so?
> 

I'm not sure I follow. The patch does not modify non_negative() at all, 
and we still call it like this:

     if (non_negative(overlap) < non_negative(context->overlap) ||
         (range > context->range &&
          non_negative(overlap) <= non_negative(context->overlap)))
         selectthis = true;

where all the "overlap" values are still float4. The only thing that 
changed here is that instead of doing the arithmetic operations directly 
we call float8_mi/float8_div to benefit from the float8 handling.

So I'm not sure how does the patch beaks this? And what do you mean by 
'eliminate float4'?


thank you

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 08/03/2018 02:39 PM, Tomas Vondra wrote:
> On 08/03/2018 06:40 AM, Kyotaro HORIGUCHI wrote:
>> ...
>>
>> I'm not confident on replacing double to float8 partially in gist
>> code. After the 0002 patch applied, I see most of problematic
>> usage of double or bare arithmetic on dimentional values in
>> gistproc.c.
>>
>>> static inline float
>>> non_negative(float val)
>>> {
>>>     if (val >= 0.0f)
>>>         return val;
>>>     else
>>>         return 0.0f;
>>> }
>>
>> It is used as "non_negative(overlap)", where overlap is float4,
>> which is calculated using float8_mi.  Float4 makes sense only if
>> we need to store a large number of it to somewhere but they are
>> just working varialbles. Couldn't we eliminate float4 that
>> doesn't have a requirement to do so?
>>
> 
> I'm not sure I follow. The patch does not modify non_negative() at all, 
> and we still call it like this:
> 
>      if (non_negative(overlap) < non_negative(context->overlap) ||
>          (range > context->range &&
>           non_negative(overlap) <= non_negative(context->overlap)))
>          selectthis = true;
> 
> where all the "overlap" values are still float4. The only thing that 
> changed here is that instead of doing the arithmetic operations directly 
> we call float8_mi/float8_div to benefit from the float8 handling.
> 
> So I'm not sure how does the patch beaks this? And what do you mean by 
> 'eliminate float4'?
> 

Kyotaro-san, can you explain what your concerns regarding this bit are? 
I'd like to get 0002 committed, but I'm not sure I understand your point 
about the changes in gist code, so I can't address them. And I certainly 
don't want to just ignore them ...

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Kyotaro HORIGUCHI
Дата:
At Wed, 8 Aug 2018 14:39:54 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<6ecb4f61-1fb1-08a1-31d6-e58e9c352374@2ndquadrant.com>
>
>
> On 08/03/2018 02:39 PM, Tomas Vondra wrote:
> > On 08/03/2018 06:40 AM, Kyotaro HORIGUCHI wrote:
> >> ...
> >>
> >> I'm not confident on replacing double to float8 partially in gist
> >> code. After the 0002 patch applied, I see most of problematic
> >> usage of double or bare arithmetic on dimentional values in
> >> gistproc.c.
> >>
> >>> static inline float
> >>> non_negative(float val)
> >>> {
> >>>     if (val >= 0.0f)
> >>>         return val;
> >>>     else
> >>>         return 0.0f;
> >>> }
> >>
> >> It is used as "non_negative(overlap)", where overlap is float4,
> >> which is calculated using float8_mi.  Float4 makes sense only if
> >> we need to store a large number of it to somewhere but they are
> >> just working varialbles. Couldn't we eliminate float4 that
> >> doesn't have a requirement to do so?
> >>
> > I'm not sure I follow. The patch does not modify non_negative() at
> > all, and we still call it like this:
> >      if (non_negative(overlap) < non_negative(context->overlap) ||
> >          (range > context->range &&
> >           non_negative(overlap) <= non_negative(context->overlap)))
> >          selectthis = true;
> > where all the "overlap" values are still float4. The only thing that
> > changed here is that instead of doing the arithmetic operations
> > directly we call float8_mi/float8_div to benefit from the float8
> > handling.
> > So I'm not sure how does the patch beaks this? And what do you mean by
> > 'eliminate float4'?
> >
>
> Kyotaro-san, can you explain what your concerns regarding this bit
> are? I'd like to get 0002 committed, but I'm not sure I understand
> your point about the changes in gist code, so I can't address
> them. And I certainly don't want to just ignore them ...

It doesn't break nothing so nothing must be done with this. Just
I was a bit uneasy to see meaninglessly used foat4. Sorry for
the unnecessary argument.

After all I don't object to commit it in this shape.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 08/09/2018 11:42 AM, Kyotaro HORIGUCHI wrote:
> At Wed, 8 Aug 2018 14:39:54 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<6ecb4f61-1fb1-08a1-31d6-e58e9c352374@2ndquadrant.com>
>>
>>
>> On 08/03/2018 02:39 PM, Tomas Vondra wrote:
>>> On 08/03/2018 06:40 AM, Kyotaro HORIGUCHI wrote:
>>>> ...
>>>>
>>>> I'm not confident on replacing double to float8 partially in gist
>>>> code. After the 0002 patch applied, I see most of problematic
>>>> usage of double or bare arithmetic on dimentional values in
>>>> gistproc.c.
>>>>
>>>>> static inline float
>>>>> non_negative(float val)
>>>>> {
>>>>>     if (val >= 0.0f)
>>>>>         return val;
>>>>>     else
>>>>>         return 0.0f;
>>>>> }
>>>>
>>>> It is used as "non_negative(overlap)", where overlap is float4,
>>>> which is calculated using float8_mi.  Float4 makes sense only if
>>>> we need to store a large number of it to somewhere but they are
>>>> just working varialbles. Couldn't we eliminate float4 that
>>>> doesn't have a requirement to do so?
>>>>
>>> I'm not sure I follow. The patch does not modify non_negative() at
>>> all, and we still call it like this:
>>>      if (non_negative(overlap) < non_negative(context->overlap) ||
>>>          (range > context->range &&
>>>           non_negative(overlap) <= non_negative(context->overlap)))
>>>          selectthis = true;
>>> where all the "overlap" values are still float4. The only thing that
>>> changed here is that instead of doing the arithmetic operations
>>> directly we call float8_mi/float8_div to benefit from the float8
>>> handling.
>>> So I'm not sure how does the patch beaks this? And what do you mean by
>>> 'eliminate float4'?
>>>
>>
>> Kyotaro-san, can you explain what your concerns regarding this bit
>> are? I'd like to get 0002 committed, but I'm not sure I understand
>> your point about the changes in gist code, so I can't address
>> them. And I certainly don't want to just ignore them ...
> 
> It doesn't break nothing so nothing must be done with this. Just
> I was a bit uneasy to see meaninglessly used foat4. Sorry for
> the unnecessary argument.
> 
> After all I don't object to commit it in this shape.
> 

Understood. Thanks for the explanation.

I've pushed parts 0001 and 0002, as submitted on August 1. Let's see if
that upsets some of the buildfarm animals.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
Hi,

the buildfarm seems to be mostly happy so far, so I've taken a quick
look at the remaining two parts. The patches still apply, but I'm
getting plenty of failures in regression tests, due to 0.0 being
replaced by -0.0.

This reminds me 74294c7301, except that these patches don't seem to
remove any such checks by mistake. Instead it seems to be caused by
simply switching to float8_ methods. The attached patch fixes the issue
for me, although I'm not claiming it's the right way to fix it.

Another thing I noticed is the last few lines from line_interpt_line are
actually unreachable, because there's now 'else return false' branch.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> the buildfarm seems to be mostly happy so far, so I've taken a quick
> look at the remaining two parts. The patches still apply, but I'm
> getting plenty of failures in regression tests, due to 0.0 being
> replaced by -0.0.

I think we are better off fixing them locally at the moment like your
patch does.  We should consider to eliminate -0 globally for all
floating point based datatypes later.  I simplified and incorporated
your change to line_interpt_line() into mine.

I am not sure about normalising -0s on point_construct().  We
currently allow points to be initialized with -0s.  I think it is fair
for us to return -0 when -x and 0 are multiplied.  That is the current
behavior and the behavior of the float datatypes.  I adjusted the
results of the new regression tests accordingly.

> Another thing I noticed is the last few lines from line_interpt_line are
> actually unreachable, because there's now 'else return false' branch.

Which lines do you mean exactly?  I don't see any being unreachable.

Вложения

Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 08/17/2018 06:40 PM, Emre Hasegeli wrote:
>> the buildfarm seems to be mostly happy so far, so I've taken a quick
>> look at the remaining two parts. The patches still apply, but I'm
>> getting plenty of failures in regression tests, due to 0.0 being
>> replaced by -0.0.
> 
> I think we are better off fixing them locally at the moment like your
> patch does.  We should consider to eliminate -0 globally for all
> floating point based datatypes later.  I simplified and incorporated
> your change to line_interpt_line() into mine.
> 
> I am not sure about normalising -0s on point_construct().  We
> currently allow points to be initialized with -0s.  I think it is fair
> for us to return -0 when -x and 0 are multiplied.  That is the current
> behavior and the behavior of the float datatypes.  I adjusted the
> results of the new regression tests accordingly.
> 

Hmmm, I need to think about that a bit more.

BTW how did we end up with the regression differences? Presumably you've
tried that on your machine and it passed. So if we adjust the expected
file, won't it fail on some other machines?

>> Another thing I noticed is the last few lines from line_interpt_line are
>> actually unreachable, because there's now 'else return false' branch.
> 
> Which lines do you mean exactly?  I don't see any being unreachable.
> 

Apologies, I got confused - there are no unreachable lines.


-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Emre Hasegeli
Дата:
> BTW how did we end up with the regression differences? Presumably you've
> tried that on your machine and it passed. So if we adjust the expected
> file, won't it fail on some other machines?

I had another patch to check for -0 inside float{4,8}_{div,mul}().  I
dropped it on the last set of patches, so the tests were broken.  I
get -0 as a result of -x * 0 both on Mac and Linux.


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 08/17/2018 08:24 PM, Emre Hasegeli wrote:
>> BTW how did we end up with the regression differences? Presumably you've
>> tried that on your machine and it passed. So if we adjust the expected
>> file, won't it fail on some other machines?
> 
> I had another patch to check for -0 inside float{4,8}_{div,mul}().  I
> dropped it on the last set of patches, so the tests were broken.  I
> get -0 as a result of -x * 0 both on Mac and Linux.
> 

OK, that explains is. I won't have time to get this committed before CF
2018-09, but I'll pick it up in September.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Emre Hasegeli <emre@hasegeli.com> writes:
>> BTW how did we end up with the regression differences? Presumably you've
>> tried that on your machine and it passed. So if we adjust the expected
>> file, won't it fail on some other machines?

> I had another patch to check for -0 inside float{4,8}_{div,mul}().  I
> dropped it on the last set of patches, so the tests were broken.  I
> get -0 as a result of -x * 0 both on Mac and Linux.

I'll bet a good deal of money that you'll find that does not hold
true across the whole buildfarm.

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 08/17/2018 08:56 PM, Tom Lane wrote:
> Emre Hasegeli <emre@hasegeli.com> writes:
>>> BTW how did we end up with the regression differences? Presumably you've
>>> tried that on your machine and it passed. So if we adjust the expected
>>> file, won't it fail on some other machines?
> 
>> I had another patch to check for -0 inside float{4,8}_{div,mul}().  I
>> dropped it on the last set of patches, so the tests were broken.  I
>> get -0 as a result of -x * 0 both on Mac and Linux.
> 
> I'll bet a good deal of money that you'll find that does not hold
> true across the whole buildfarm.
> 

Hmm, yeah. Based on past experience, the powerpc machines are likely to
stumble on this.

FWIW my understanding is that these failures actually happen in new
tests, it's not an issue introduced by this patch series.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> Hmm, yeah. Based on past experience, the powerpc machines are likely to
> stumble on this.

> FWIW my understanding is that these failures actually happen in new
> tests, it's not an issue introduced by this patch series.

Yeah, we've definitely hit such problems before.  The geometric logic
seems particularly prone to it because it's doing more and subtler
float arithmetic than the rest of the system ... but it's not the sole
source of minus-zero issues.

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 08/17/2018 11:23 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> Hmm, yeah. Based on past experience, the powerpc machines are likely to
>> stumble on this.
> 
>> FWIW my understanding is that these failures actually happen in new
>> tests, it's not an issue introduced by this patch series.
> 
> Yeah, we've definitely hit such problems before.  The geometric logic
> seems particularly prone to it because it's doing more and subtler
> float arithmetic than the rest of the system ... but it's not the sole
> source of minus-zero issues.
> 

It's not entirely clear to me what's the best way forward with this.

We can go through the patch and fix places with obvious -0 risks, but
then what? I don't have access to any powepc machines, so I can't check
if there are any other failures. So the only thing I could do is commit
and fix based on buildfarm failures ...

Emre, any better ideas?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 08/17/2018 11:23 PM, Tom Lane wrote:
>> Yeah, we've definitely hit such problems before.  The geometric logic
>> seems particularly prone to it because it's doing more and subtler
>> float arithmetic than the rest of the system ... but it's not the sole
>> source of minus-zero issues.

> We can go through the patch and fix places with obvious -0 risks, but
> then what? I don't have access to any powepc machines, so I can't check
> if there are any other failures. So the only thing I could do is commit
> and fix based on buildfarm failures ...

That's the usual solution.  I don't see anything particularly wrong
with it.  If the buildfarm results suggest a whole mess of problems,
it might be appropriate to revert and rethink; but you shouldn't be
afraid to use the buildfarm as a testbed.

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 09/03/2018 04:08 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 08/17/2018 11:23 PM, Tom Lane wrote:
>>> Yeah, we've definitely hit such problems before.  The geometric logic
>>> seems particularly prone to it because it's doing more and subtler
>>> float arithmetic than the rest of the system ... but it's not the sole
>>> source of minus-zero issues.
> 
>> We can go through the patch and fix places with obvious -0 risks, but
>> then what? I don't have access to any powepc machines, so I can't check
>> if there are any other failures. So the only thing I could do is commit
>> and fix based on buildfarm failures ...
> 
> That's the usual solution.  I don't see anything particularly wrong
> with it.  If the buildfarm results suggest a whole mess of problems,
> it might be appropriate to revert and rethink; but you shouldn't be
> afraid to use the buildfarm as a testbed.
> 

FWIW I plan to get this committed sometime this week, when I'm available
to fix the expected buildfarm breakage.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 09/23/2018 11:55 PM, Tomas Vondra wrote:
> On 09/03/2018 04:08 AM, Tom Lane wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> On 08/17/2018 11:23 PM, Tom Lane wrote:
>>>> Yeah, we've definitely hit such problems before.  The geometric logic
>>>> seems particularly prone to it because it's doing more and subtler
>>>> float arithmetic than the rest of the system ... but it's not the sole
>>>> source of minus-zero issues.
>>
>>> We can go through the patch and fix places with obvious -0 risks, but
>>> then what? I don't have access to any powepc machines, so I can't check
>>> if there are any other failures. So the only thing I could do is commit
>>> and fix based on buildfarm failures ...
>>
>> That's the usual solution.  I don't see anything particularly wrong
>> with it.  If the buildfarm results suggest a whole mess of problems,
>> it might be appropriate to revert and rethink; but you shouldn't be
>> afraid to use the buildfarm as a testbed.
>>
> 
> FWIW I plan to get this committed sometime this week, when I'm available
> to fix the expected buildfarm breakage.
> 

Pushed. Now let's wait for the buildfarm to complain ...


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> Pushed. Now let's wait for the buildfarm to complain ...

gaur's not happy, but rather surprisingly, it looks like we're
mostly OK elsewhere.  Do you need me to trace down exactly what's
going wrong on gaur?

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 09/26/2018 06:45 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> Pushed. Now let's wait for the buildfarm to complain ...
> 
> gaur's not happy, but rather surprisingly, it looks like we're
> mostly OK elsewhere.  Do you need me to trace down exactly what's
> going wrong on gaur?
> 

Hmmm, interesting. It seems both failures happen in the chunk that
multiplies paths with points, i.e. essentially point_mul_point. So it
seems most platforms end up with

    (0,0) * (-3,4) = (-0, 0)

while gaur apparently thinks it's (0,0). And indeed, that's what the
attached trivial program does - I'd bet if you run it on gaur, it'll
print 0.000000, not -0.000000.

Or you could just try doing

    select '(0,0)'::point * '(-3,4)'::point;

If this is what's going on, I'd say the best solution is to make it
produce (0,0) everywhere, so that we don't expect -0.0 anywhere.

We could do that either by adding the == 0.0 check to yet another place,
or to point_construct() directly. Adding it to point_construct() means
we'll pay the price always, but I guess there are few paths where we
know we don't need it. And if we add it to many places it's likely about
as expensive as adding it to point_construct.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> Hmmm, interesting. It seems both failures happen in the chunk that
> multiplies paths with points, i.e. essentially point_mul_point. So it
> seems most platforms end up with

>     (0,0) * (-3,4) = (-0, 0)

> while gaur apparently thinks it's (0,0). And indeed, that's what the
> attached trivial program does - I'd bet if you run it on gaur, it'll
> print 0.000000, not -0.000000.

Nope, no cigar:

$ gcc -Wall -O2 test.c
$ ./a.out
-0.000000

(I tried a couple other -O levels to see if that affected anything,
but it didn't.)

I'll try to isolate the problem more closely, but it will take awhile.
That machine is slow :-(

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 09/26/2018 10:58 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> Hmmm, interesting. It seems both failures happen in the chunk that
>> multiplies paths with points, i.e. essentially point_mul_point. So it
>> seems most platforms end up with
> 
>>     (0,0) * (-3,4) = (-0, 0)
> 
>> while gaur apparently thinks it's (0,0). And indeed, that's what the
>> attached trivial program does - I'd bet if you run it on gaur, it'll
>> print 0.000000, not -0.000000.
> 
> Nope, no cigar:
> 
> $ gcc -Wall -O2 test.c
> $ ./a.out
> -0.000000
> 
> (I tried a couple other -O levels to see if that affected anything,
> but it didn't.)
> 

Interesting ...

> I'll try to isolate the problem more closely, but it will take awhile.
> That machine is slow :-(
> 

OK, thanks.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 09/26/2018 06:45 PM, Tom Lane wrote:
>> gaur's not happy, but rather surprisingly, it looks like we're
>> mostly OK elsewhere.  Do you need me to trace down exactly what's
>> going wrong on gaur?

> Or you could just try doing
>     select '(0,0)'::point * '(-3,4)'::point;
> If this is what's going on, I'd say the best solution is to make it
> produce (0,0) everywhere, so that we don't expect -0.0 anywhere.

Actually, it seems simpler than that: gaur produces plus zero already
from the multiplication:

regression=# select '-3'::float8 * '0'::float8;
 ?column? 
----------
        0
(1 row)

whereas I get -0 elsewhere.  I'm surprised that this doesn't create
more widely-visible regression failures, but there you have it.

> We could do that either by adding the == 0.0 check to yet another place,
> or to point_construct() directly. Adding it to point_construct() means
> we'll pay the price always, but I guess there are few paths where we
> know we don't need it. And if we add it to many places it's likely about
> as expensive as adding it to point_construct.

If gaur is the only machine showing this failure, which seems more
likely by the hour, I'm not sure that we should give up performance
across-the-board to make it happy.  Perhaps a variant expected-file
is a better answer; or we could remove these specific test cases.

Anyway, I'd counsel doing nothing for a day or so, till the buildfarm
breakage from the strerror/snprintf changes clears up.  Then we'll
have a better idea of whether any other machines are affected.

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:

On 09/27/2018 12:48 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 09/26/2018 06:45 PM, Tom Lane wrote:
>>> gaur's not happy, but rather surprisingly, it looks like we're
>>> mostly OK elsewhere.  Do you need me to trace down exactly what's
>>> going wrong on gaur?
> 
>> Or you could just try doing
>>      select '(0,0)'::point * '(-3,4)'::point;
>> If this is what's going on, I'd say the best solution is to make it
>> produce (0,0) everywhere, so that we don't expect -0.0 anywhere.
> 
> Actually, it seems simpler than that: gaur produces plus zero already
> from the multiplication:
> 
> regression=# select '-3'::float8 * '0'::float8;
>   ?column?
> ----------
>          0
> (1 row)
> 
> whereas I get -0 elsewhere.  I'm surprised that this doesn't create
> more widely-visible regression failures, but there you have it.
> 

Hmmm, interesting. But I still don't quite understand why the test 
program still produced -0.000000 and not 0.000000. That seems like a 
direct contradiction to what we see in regression tests, doesn't it?

>> We could do that either by adding the == 0.0 check to yet another place,
>> or to point_construct() directly. Adding it to point_construct() means
>> we'll pay the price always, but I guess there are few paths where we
>> know we don't need it. And if we add it to many places it's likely about
>> as expensive as adding it to point_construct.
> 
> If gaur is the only machine showing this failure, which seems more
> likely by the hour, I'm not sure that we should give up performance
> across-the-board to make it happy.  Perhaps a variant expected-file
> is a better answer; or we could remove these specific test cases.
> 
> Anyway, I'd counsel doing nothing for a day or so, till the buildfarm
> breakage from the strerror/snprintf changes clears up.  Then we'll
> have a better idea of whether any other machines are affected.
> 

Yep, gaur seems to be the only animal affected by this, so no need to 
rush anyway.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 09/27/2018 12:48 AM, Tom Lane wrote:
>> Actually, it seems simpler than that: gaur produces plus zero already
>> from the multiplication:
>> regression=# select '-3'::float8 * '0'::float8;
>> ?column?
>> ----------
>> 0
>> (1 row)

> Hmmm, interesting. But I still don't quite understand why the test 
> program still produced -0.000000 and not 0.000000. That seems like a 
> direct contradiction to what we see in regression tests, doesn't it?

OK, so after poking at it for another hour and getting more and more
confused, I realized that gdb was lying to me by printing genuine
minus zero values as just "0".  Throw out everything I thought I knew
and start over ...

... and awhile later, this is the answer: on this machine,
printf with "%f" will show the sign of minus zero.  But printf
with "%g" will not.  Guess which format float8out uses.
(I'll bet that gdb does too, so that its lie wasn't its fault.)

AFAICT at the moment, gaur is doing the underlying IEEE float math
the same as everybody else, which is not very surprising because
HP bought into IEEE math pretty early.  Hex-dumping shows conclusively
that point_mul_point *does* emit (-0,0) in the case in question.
But we've got a platform-specific issue with whether the minus zero
gets printed as such.  I wonder whether similar effects explain some
of the other platform-specific oddities we've seen with minus zero.

Anyway, at this point I'd say let's just leave gaur broken so far as the
geometric tests are concerned, pending results from the concurrent thread
about possibly rewriting snprintf.c's float handling to not depend on the
platform's sprintf.  If that doesn't happen, we can revisit some sort
of narrower fix for this.  The narrow fix ought to be in snprintf.c
anyway, not anywhere near the geometric code.

I notice BTW that it's sort of accidental that snprintf.c behaves properly
for minus zero on most machines.  The test "value < 0" isn't true, so
it doesn't think there's a sign.  When sprintf outputs a "-" anyway,
that's effectively treated as a digit.  We'd do the wrong thing with a
format like "%+f", and maybe in other cases too.

            regards, tom lane


Re: [PATCH] Improve geometric types

От
Alvaro Herrera
Дата:
If you look at the differing results carefully, there's this one:

*** 3249,3255 ****
!  [(0,0),(3,0),(4,5),(1,6)] | (-5,-12)          | [(0,-0),(-15,-36),(40,-73),(67,-42)]
--- 3249,3255 ----
!  [(0,0),(3,0),(4,5),(1,6)] | (-5,-12)          | [(0,0),(-15,-36),(40,-73),(67,-42)]

(Third column is first multiplied by second).

I wonder why the expected file has a -0 only in the second position and
not both first and second.  These are both positive zeroes being
multiplied by a negative number.  Why is 0 * -12 = -0  yet  0 * -5 = 0?
What is going on?  Is the sign suppressed for negative zeros only in the
first coordinate?  I suppose this is just a side effect of how
float8_mi, _pl, _mul work (in point_mul_point).

Anyway maybe your test case should use more of the float8 op
combinations in order to show the difference.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 09/27/2018 07:21 PM, Alvaro Herrera wrote:
> If you look at the differing results carefully, there's this one:
> 
> *** 3249,3255 ****
> !  [(0,0),(3,0),(4,5),(1,6)] | (-5,-12)          | [(0,-0),(-15,-36),(40,-73),(67,-42)]
> --- 3249,3255 ----
> !  [(0,0),(3,0),(4,5),(1,6)] | (-5,-12)          | [(0,0),(-15,-36),(40,-73),(67,-42)]
> 
> (Third column is first multiplied by second).
> 
> I wonder why the expected file has a -0 only in the second position and
> not both first and second.  These are both positive zeroes being
> multiplied by a negative number.  Why is 0 * -12 = -0  yet  0 * -5 = 0?
> What is going on?  Is the sign suppressed for negative zeros only in the
> first coordinate?  I suppose this is just a side effect of how
> float8_mi, _pl, _mul work (in point_mul_point).
> 
> Anyway maybe your test case should use more of the float8 op
> combinations in order to show the difference.
> 

I may be missing what you're saying, but point_mul_point is not just a
simple multiplication of coordinates, i.e.

    (x1,y1) * (x2,y2) != (x1*x2, y1*y2)

It essentially does this:

    ((x1 * x2 - y1 * y2), (x1 * y2 + x2 * y1))

so I wouldn't be surprised if this was a difference between _pl and _mi.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Tomas Vondra
Дата:
On 09/27/2018 07:05 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 09/27/2018 12:48 AM, Tom Lane wrote:
>>> Actually, it seems simpler than that: gaur produces plus zero already
>>> from the multiplication:
>>> regression=# select '-3'::float8 * '0'::float8;
>>> ?column?
>>> ----------
>>> 0
>>> (1 row)
> 
>> Hmmm, interesting. But I still don't quite understand why the test 
>> program still produced -0.000000 and not 0.000000. That seems like a 
>> direct contradiction to what we see in regression tests, doesn't it?
> 
> OK, so after poking at it for another hour and getting more and more
> confused, I realized that gdb was lying to me by printing genuine
> minus zero values as just "0".  Throw out everything I thought I knew
> and start over ...
> 

Heh. A debugger lying to you just a wee bit is fun ...

> ... and awhile later, this is the answer: on this machine,
> printf with "%f" will show the sign of minus zero.  But printf
> with "%g" will not.  Guess which format float8out uses.
> (I'll bet that gdb does too, so that its lie wasn't its fault.)
> 
> AFAICT at the moment, gaur is doing the underlying IEEE float math
> the same as everybody else, which is not very surprising because
> HP bought into IEEE math pretty early.  Hex-dumping shows conclusively
> that point_mul_point *does* emit (-0,0) in the case in question.
> But we've got a platform-specific issue with whether the minus zero
> gets printed as such.  I wonder whether similar effects explain some
> of the other platform-specific oddities we've seen with minus zero.
> 
> Anyway, at this point I'd say let's just leave gaur broken so far as the
> geometric tests are concerned, pending results from the concurrent thread
> about possibly rewriting snprintf.c's float handling to not depend on the
> platform's sprintf.  If that doesn't happen, we can revisit some sort
> of narrower fix for this.  The narrow fix ought to be in snprintf.c
> anyway, not anywhere near the geometric code.
> 
> I notice BTW that it's sort of accidental that snprintf.c behaves properly
> for minus zero on most machines.  The test "value < 0" isn't true, so
> it doesn't think there's a sign.  When sprintf outputs a "-" anyway,
> that's effectively treated as a digit.  We'd do the wrong thing with a
> format like "%+f", and maybe in other cases too.
> 

OK, makes sense. Thanks for the investigation!


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH] Improve geometric types

От
Alvaro Herrera
Дата:
On 2018-Sep-27, Tomas Vondra wrote:

> I may be missing what you're saying, but point_mul_point is not just a
> simple multiplication of coordinates, i.e.
> 
>     (x1,y1) * (x2,y2) != (x1*x2, y1*y2)
> 
> It essentially does this:
> 
>     ((x1 * x2 - y1 * y2), (x1 * y2 + x2 * y1))
> 
> so I wouldn't be surprised if this was a difference between _pl and _mi.

Yeah, I had misinterpreted the operation before reading the code, then
when reading it I realized the formula is what you were saying, so I
updated the final part of my reply but failed to realize I had written
my misunderstanding in the first portion.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services