Thread: pgsql: Improve handling of NULLs in KNN-GiST and KNN-SP-GiST

pgsql: Improve handling of NULLs in KNN-GiST and KNN-SP-GiST

From
Alexander Korotkov
Date:
Improve handling of NULLs in KNN-GiST and KNN-SP-GiST

This commit improves the subject in two ways:

 * It removes the ugliness of 02f90879e7, which stores distance values and
   null flags in two separate arrays after the GISTSearchItem struct.  Instead
   we pack both the distance value and the null flag into the
   IndexOrderByDistance struct.  Alignment overhead should be negligible,
   because we typically deal with at most a few "col op const" expressions in
   the ORDER BY clause.
 * It fixes handling of "col op NULL" expressions in KNN-SP-GiST.  Now, these
   expressions are not passed to support functions, which can't deal with
   them.  Instead, a NULL result is implicitly assumed.  In the future we may
   decide to teach support functions to deal with NULL arguments, but the
   current solution is a bugfix suitable for backpatching.

Reported-by: Nikita Glukhov
Discussion: https://postgr.es/m/826f57ee-afc7-8977-c44c-6111d18b02ec%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Alexander Korotkov
Backpatch-through: 9.4

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6cae9d2c10e151f741e7bc64a8b70bb2615c367c

Modified Files
--------------
src/backend/access/gist/gistget.c                 | 68 ++++++++-------------
src/backend/access/gist/gistscan.c                | 16 ++---
src/backend/access/index/indexam.c                | 22 +++----
src/backend/access/spgist/spgscan.c               | 74 +++++++++++++++++++----
src/include/access/genam.h                        | 10 ++-
src/include/access/gist_private.h                 | 27 ++-------
src/include/access/spgist_private.h               |  8 ++-
src/test/regress/expected/create_index_spgist.out | 10 +++
src/test/regress/sql/create_index_spgist.sql      |  5 ++
src/tools/pgindent/typedefs.list                  |  1 +
10 files changed, 140 insertions(+), 101 deletions(-)


Re: pgsql: Improve handling of NULLs in KNN-GiST and KNN-SP-GiST

From
Erik Rijkers
Date:
On 2019-09-19 21:11, Alexander Korotkov wrote:
> Improve handling of NULLs in KNN-GiST and KNN-SP-GiST
> 

Oops:

      create_index                 ... ok          634 ms
      create_index_spgist          ... FAILED      438 ms
      create_view                  ... ok          329 ms






Re: pgsql: Improve handling of NULLs in KNN-GiST and KNN-SP-GiST

From
Tom Lane
Date:
Erik Rijkers <er@xs4all.nl> writes:
> Oops:
>       create_index                 ... ok          634 ms
>       create_index_spgist          ... FAILED      438 ms
>       create_view                  ... ok          329 ms

I'm betting the issue is breaking the Datum abstraction here:

-                   scan->xs_orderbyvals[i] = Float8GetDatum(distanceValues[i]);
+                   scan->xs_orderbyvals[i] = item->distances[i].value;

AFAICS, item->distances[i].value is a double not a Datum, so dropping
the Float8GetDatum call is just wrong.

            regards, tom lane



Re: pgsql: Improve handling of NULLs in KNN-GiST and KNN-SP-GiST

From
Alexander Korotkov
Date:
On Thu, Sep 19, 2019 at 11:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Erik Rijkers <er@xs4all.nl> writes:
> > Oops:
> >       create_index                 ... ok          634 ms
> >       create_index_spgist          ... FAILED      438 ms
> >       create_view                  ... ok          329 ms
>
> I'm betting the issue is breaking the Datum abstraction here:
>
> -                   scan->xs_orderbyvals[i] = Float8GetDatum(distanceValues[i]);
> +                   scan->xs_orderbyvals[i] = item->distances[i].value;

Overlooked by me.  Will fix immediately.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: pgsql: Improve handling of NULLs in KNN-GiST and KNN-SP-GiST

From
Alexander Korotkov
Date:
On Thu, Sep 19, 2019 at 11:31 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
>
> On Thu, Sep 19, 2019 at 11:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Erik Rijkers <er@xs4all.nl> writes:
>> > Oops:
>> >       create_index                 ... ok          634 ms
>> >       create_index_spgist          ... FAILED      438 ms
>> >       create_view                  ... ok          329 ms
>>
>> I'm betting the issue is breaking the Datum abstraction here:
>>
>> -                   scan->xs_orderbyvals[i] = Float8GetDatum(distanceValues[i]);
>> +                   scan->xs_orderbyvals[i] = item->distances[i].value;
>
>
> Overlooked by me.  Will fix immediately.


Fix pushed from 11 to 9.5, where I made this error during backpatching.

However, I also see a set of failures in master, which seem related to
this patch:
 * https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dromedary&dt=2019-09-19%2020%3A09%3A42
 * https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2019-09-19%2020%3A04%3A22
 * https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2019-09-19%2019%3A22%3A01

Will investigate them.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: pgsql: Improve handling of NULLs in KNN-GiST and KNN-SP-GiST

From
Alexander Korotkov
Date:
On Thu, Sep 19, 2019 at 11:43 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Thu, Sep 19, 2019 at 11:31 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> >
> > On Thu, Sep 19, 2019 at 11:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>
> >> Erik Rijkers <er@xs4all.nl> writes:
> >> > Oops:
> >> >       create_index                 ... ok          634 ms
> >> >       create_index_spgist          ... FAILED      438 ms
> >> >       create_view                  ... ok          329 ms
> >>
> >> I'm betting the issue is breaking the Datum abstraction here:
> >>
> >> -                   scan->xs_orderbyvals[i] = Float8GetDatum(distanceValues[i]);
> >> +                   scan->xs_orderbyvals[i] = item->distances[i].value;
> >
> >
> > Overlooked by me.  Will fix immediately.
>
>
> Fix pushed from 11 to 9.5, where I made this error during backpatching.
>
> However, I also see a set of failures in master, which seem related to
> this patch:
>  * https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dromedary&dt=2019-09-19%2020%3A09%3A42
>  * https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2019-09-19%2020%3A04%3A22
>  * https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2019-09-19%2019%3A22%3A01
>
> Will investigate them.

Both dromedary and tern, where the segfault happened, are 32-bit.  The bug
seems related to USE_FLOAT8_BYVAL or something.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: pgsql: Improve handling of NULLs in KNN-GiST and KNN-SP-GiST

From
Tom Lane
Date:
Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> Both dromedary and tern, where the segfault happened, are 32-bit.  The bug
> seems related to USE_FLOAT8_BYVAL or something.

Yeah, I was just about to write back that there's an independent problem.
Look at the logic inside the loop in index_store_float8_orderby_distances:

        scan->xs_orderbynulls[i] = distances[i].isnull;

        if (scan->xs_orderbynulls[i])
            scan->xs_orderbyvals[i] = (Datum) 0;

        if (orderByTypes[i] == FLOAT8OID)
        {
#ifndef USE_FLOAT8_BYVAL
            /* must free any old value to avoid memory leakage */
            if (!scan->xs_orderbynulls[i])
                pfree(DatumGetPointer(scan->xs_orderbyvals[i]));
#endif
            if (!scan->xs_orderbynulls[i])
                scan->xs_orderbyvals[i] = Float8GetDatum(distances[i].value);
        }

The pfree is being done way too late, as you've already stomped on
both the isnull flag and the pointer value.  I think the first
three lines quoted above need to be moved to after the #ifndef
stanza (and, hence, duplicated for the float4 case), like

#ifndef USE_FLOAT8_BYVAL
            /* must free any old value to avoid memory leakage */
            if (!scan->xs_orderbynulls[i])
                pfree(DatumGetPointer(scan->xs_orderbyvals[i]));
#endif
            scan->xs_orderbynulls[i] = distances[i].isnull;
            if (scan->xs_orderbynulls[i])
                scan->xs_orderbyvals[i] = (Datum) 0;
            else
                scan->xs_orderbyvals[i] = Float8GetDatum(distances[i].value);

Another issue here is that the short-circuit path above (for !distances)
leaks memory, in not-passbyval cases.  I'd be inclined to get rid of the
short circuit and just handle the case within the main loop, so really
that'd be more like

            if (distances)
            {
                scan->xs_orderbynulls[i] = distances[i].isnull;
                if (scan->xs_orderbynulls[i])
                    scan->xs_orderbyvals[i] = (Datum) 0;
                else
                    scan->xs_orderbyvals[i] = Float8GetDatum(distances[i].value);
            }
            else
            {
                scan->xs_orderbynulls[i] = true;
                scan->xs_orderbyvals[i] = (Datum) 0;
            }


            regards, tom lane