Обсуждение: Bug in GiST paring heap comparator
Hi! Andrey Borodin noticed me that results of some KNN-GIST tests are obviously wrong and don't match results of sort node. SELECT * FROM point_tbl ORDER BY f1 <-> '0,1'; f1 ------------------- (10,10) (NaN,NaN) (0,0) (1e-300,-1e-300) (-3,4) (-10,0) (-5,-12) (5.1,34.5) (1e+300,Infinity) (10 rows) It appears to be related to implementation of comparison function in pairing heap used as priority queue for KNN. It used plain float comparison, which doesn't handle Inf and Nan values well. Attached patch replaced it with float8_cmp_internal(). Also, note that with patch KNN results still don't fully match results of sort node. SELECT * FROM point_tbl ORDER BY f1 <-> '0,1'; f1 ------------------- (0,0) (1e-300,-1e-300) (-3,4) (-10,0) (10,10) (-5,-12) (5.1,34.5) (1e+300,Infinity) (NaN,NaN) (10 rows) NULL and '(NaN,NaN)' are swapped. It happens so, because we assume distance to NULL to be Inf, while float8_cmp_internal() assumes NaN to be greater than NULL. If even we would assume distance to NULL to be Nan, it doesn't guarantee that NULLs would be last. It looks like we can handle this only by introduction array of "distance is null" flags to GISTSearchItem. But does it worth it? ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
> 2 сент. 2019 г., в 9:54, Alexander Korotkov <a.korotkov@postgrespro.ru> написал(а): > > It appears to be related to implementation of comparison function in > pairing heap used as priority queue for KNN. It used plain float > comparison, which doesn't handle Inf and Nan values well. Attached > patch replaced it with float8_cmp_internal(). Thanks! This patch fixes tests of my new GiST build :) While patch looks good to me, I want to add that that there's a lot of <= and > comparisons in gistproc.c in function: static float8 computeDistance(bool isLeaf, BOX *box, Point *point) Should we fix this too? Or add comment why current code is safe. Thanks! Best regards, Andrey Borodin.
On 02/09/2019 07:54, Alexander Korotkov wrote: > Andrey Borodin noticed me that results of some KNN-GIST tests are > obviously wrong and don't match results of sort node. > > SELECT * FROM point_tbl ORDER BY f1 <-> '0,1'; > f1 > ------------------- > (10,10) > (NaN,NaN) > (0,0) > (1e-300,-1e-300) > (-3,4) > (-10,0) > (-5,-12) > (5.1,34.5) > > (1e+300,Infinity) > (10 rows) So we've memorized incorrect results in the expected output. Ouch! > It appears to be related to implementation of comparison function in > pairing heap used as priority queue for KNN. It used plain float > comparison, which doesn't handle Inf and Nan values well. Attached > patch replaced it with float8_cmp_internal(). > > Also, note that with patch KNN results still don't fully match results > of sort node. > > SELECT * FROM point_tbl ORDER BY f1 <-> '0,1'; > f1 > ------------------- > (0,0) > (1e-300,-1e-300) > (-3,4) > (-10,0) > (10,10) > (-5,-12) > (5.1,34.5) > (1e+300,Infinity) > > (NaN,NaN) > (10 rows) > > NULL and '(NaN,NaN)' are swapped. It happens so, because we assume > distance to NULL to be Inf, while float8_cmp_internal() assumes NaN to > be greater than NULL. If even we would assume distance to NULL to be > Nan, it doesn't guarantee that NULLs would be last. It looks like we > can handle this only by introduction array of "distance is null" flags > to GISTSearchItem. But does it worth it? I don't think we have much choice, returning wrong results is not an option. At first I thought we could set the "recheck" flag if we see NULLs or NaNs, but that won't work because rechecking is not supported with Index-Only Scans. Looking at the corresponding SP-GiST code, pairingheap_SpGistSearchItem_cmp() gets this right. I'd suggest copy-pasting the implementation from there, so that they would be as similar as possible. (SP-GiST gets away with just one isNull flag, because it doesn't support multi-column indexes. In GiST, you need an array, as you said.) - Heikki
On Mon, Sep 2, 2019 at 9:28 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 02/09/2019 07:54, Alexander Korotkov wrote: > > NULL and '(NaN,NaN)' are swapped. It happens so, because we assume > > distance to NULL to be Inf, while float8_cmp_internal() assumes NaN to > > be greater than NULL. If even we would assume distance to NULL to be > > Nan, it doesn't guarantee that NULLs would be last. It looks like we > > can handle this only by introduction array of "distance is null" flags > > to GISTSearchItem. But does it worth it? > > I don't think we have much choice, returning wrong results is not an > option. At first I thought we could set the "recheck" flag if we see > NULLs or NaNs, but that won't work because rechecking is not supported > with Index-Only Scans. > > Looking at the corresponding SP-GiST code, > pairingheap_SpGistSearchItem_cmp() gets this right. I'd suggest > copy-pasting the implementation from there, so that they would be as > similar as possible. (SP-GiST gets away with just one isNull flag, > because it doesn't support multi-column indexes. In GiST, you need an > array, as you said.) Thank you for clarifying my doubts. Second of attached patches solves this problem. In this patch GISTSearchItem have both array of distances and array of null flags at the end. They are accessed by a bit cumbersome GISTSearchItemDistanceValues() and GISTSearchItemDistanceNulls() macros. I came up to this, because I didn't like to allocate null flags or distances in separate chunk of memory. Also I had idea to wrap distance and null flag into struct, but I didn't like to change signature of index_store_float8_orderby_distances() that much. I'm going to push both if no objections. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
On Fri, Sep 6, 2019 at 5:44 PM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote: > I'm going to push both if no objections. So, pushed! ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 08.09.2019 22:32, Alexander Korotkov wrote: > On Fri, Sep 6, 2019 at 5:44 PM Alexander Korotkov > <a.korotkov@postgrespro.ru> wrote: >> I'm going to push both if no objections. > So, pushed! Two years ago there was a similar patch for this issue: https://www.postgresql.org/message-id/1499c9d0-075a-3014-d2aa-ba59121b3728%40postgrespro.ru Sorry that I looked at this thread too late. -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Mon, Sep 9, 2019 at 8:32 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > On 08.09.2019 22:32, Alexander Korotkov wrote: > > > On Fri, Sep 6, 2019 at 5:44 PM Alexander Korotkov > > <a.korotkov@postgrespro.ru> wrote: > >> I'm going to push both if no objections. > > So, pushed! > > Two years ago there was a similar patch for this issue: > https://www.postgresql.org/message-id/1499c9d0-075a-3014-d2aa-ba59121b3728%40postgrespro.ru > > Sorry that I looked at this thread too late. I see, thanks. You patch seems a bit less cumbersome thanks to using GISTDistance struct. You don't have to introduce separate array with null flags. But it's harder too keep index_store_float8_orderby_distances() used by both GiST and SP-GiST. What do you think? You can rebase your patch can propose that as refactoring. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 09.09.2019 22:47, Alexander Korotkov wrote:
On Mon, Sep 9, 2019 at 8:32 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:On 08.09.2019 22:32, Alexander Korotkov wrote:On Fri, Sep 6, 2019 at 5:44 PM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:I'm going to push both if no objections.So, pushed!Two years ago there was a similar patch for this issue: https://www.postgresql.org/message-id/1499c9d0-075a-3014-d2aa-ba59121b3728%40postgrespro.ru Sorry that I looked at this thread too late.I see, thanks. You patch seems a bit less cumbersome thanks to using GISTDistance struct. You don't have to introduce separate array with null flags. But it's harder too keep index_store_float8_orderby_distances() used by both GiST and SP-GiST. What do you think? You can rebase your patch can propose that as refactoring.
Rebased patch with refactoring is attached.
During this rebase I found that SP-GiST code has similar problems with NULLs. All SP-GiST functions do not check SK_ISNULL flag of ordering ScanKeys, and that leads to segfaults. In the following test, index is searched with non-NULL key first and then with NULL key, that leads to a crash:
CREATE TABLE t AS SELECT point(x, y) p FROM generate_series(0, 100) x, generate_series(0, 100) y; CREATE INDEX ON t USING spgist (p); -- crash SELECT (SELECT p FROM t ORDER BY p <-> pt LIMIT 1) FROM (VALUES (point '1,2'), (NULL)) pts(pt); After adding SK_ISNULL checks and starting to produce NULL distances, we need to store NULL flags for distance somewhere. Existing singleton flag for the whole SPGistSearchItem is not sufficient anymore. So, I introduced structure IndexOrderByDistance and used it everywhere in the GiST and SP-GiST code instead of raw double distances. SP-GiST order-by-distance code can be further refactored so that user functions do not have to worry about SK_ISNULL checks.
--
Вложения
On Wed, Sep 11, 2019 at 3:34 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > On 09.09.2019 22:47, Alexander Korotkov wrote: > > On Mon, Sep 9, 2019 at 8:32 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > > On 08.09.2019 22:32, Alexander Korotkov wrote: > > On Fri, Sep 6, 2019 at 5:44 PM Alexander Korotkov > <a.korotkov@postgrespro.ru> wrote: > > I'm going to push both if no objections. > > So, pushed! > > Two years ago there was a similar patch for this issue: > https://www.postgresql.org/message-id/1499c9d0-075a-3014-d2aa-ba59121b3728%40postgrespro.ru > > Sorry that I looked at this thread too late. > > I see, thanks. > > You patch seems a bit less cumbersome thanks to using GISTDistance > struct. You don't have to introduce separate array with null flags. > But it's harder too keep index_store_float8_orderby_distances() used > by both GiST and SP-GiST. > > What do you think? You can rebase your patch can propose that as refactoring. > > Rebased patch with refactoring is attached. > > During this rebase I found that SP-GiST code has similar problems with NULLs. > All SP-GiST functions do not check SK_ISNULL flag of ordering ScanKeys, and > that leads to segfaults. In the following test, index is searched with > non-NULL key first and then with NULL key, that leads to a crash: > > CREATE TABLE t AS SELECT point(x, y) p > FROM generate_series(0, 100) x, generate_series(0, 100) y; > > CREATE INDEX ON t USING spgist (p); > > -- crash > SELECT (SELECT p FROM t ORDER BY p <-> pt LIMIT 1) > FROM (VALUES (point '1,2'), (NULL)) pts(pt); > > > After adding SK_ISNULL checks and starting to produce NULL distances, we need to > store NULL flags for distance somewhere. Existing singleton flag for the whole > SPGistSearchItem is not sufficient anymore. > > > So, I introduced structure IndexOrderByDistance and used it everywhere in the > GiST and SP-GiST code instead of raw double distances. > > > SP-GiST order-by-distance code can be further refactored so that user functions > do not have to worry about SK_ISNULL checks. It doesn't seem SP-GiST inner_consistent and leaf_consistent functions can handle NULL scan keys for now. So should we let them handle NULL orderby keys? If we assume that NULL arguments produce NULL distances, we can evade changing the code of opclasses. Also, I've noticed IndexOrderByDistance is missed in typedefs.list. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 12.09.2019 16:45, Alexander Korotkov wrote: > On Wed, Sep 11, 2019 at 3:34 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: >> On 09.09.2019 22:47, Alexander Korotkov wrote: >> >> On Mon, Sep 9, 2019 at 8:32 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: >> >> On 08.09.2019 22:32, Alexander Korotkov wrote: >> >> On Fri, Sep 6, 2019 at 5:44 PM Alexander Korotkov >> <a.korotkov@postgrespro.ru> wrote: >> >> I'm going to push both if no objections. >> >> So, pushed! >> >> Two years ago there was a similar patch for this issue: >> https://www.postgresql.org/message-id/1499c9d0-075a-3014-d2aa-ba59121b3728%40postgrespro.ru >> >> Sorry that I looked at this thread too late. >> >> I see, thanks. >> >> You patch seems a bit less cumbersome thanks to using GISTDistance >> struct. You don't have to introduce separate array with null flags. >> But it's harder too keep index_store_float8_orderby_distances() used >> by both GiST and SP-GiST. >> >> What do you think? You can rebase your patch can propose that as refactoring. >> >> Rebased patch with refactoring is attached. >> >> During this rebase I found that SP-GiST code has similar problems with NULLs. >> All SP-GiST functions do not check SK_ISNULL flag of ordering ScanKeys, and >> that leads to segfaults. In the following test, index is searched with >> non-NULL key first and then with NULL key, that leads to a crash: >> >> CREATE TABLE t AS SELECT point(x, y) p >> FROM generate_series(0, 100) x, generate_series(0, 100) y; >> >> CREATE INDEX ON t USING spgist (p); >> >> -- crash >> SELECT (SELECT p FROM t ORDER BY p <-> pt LIMIT 1) >> FROM (VALUES (point '1,2'), (NULL)) pts(pt); >> >> >> After adding SK_ISNULL checks and starting to produce NULL distances, we need to >> store NULL flags for distance somewhere. Existing singleton flag for the whole >> SPGistSearchItem is not sufficient anymore. >> >> >> So, I introduced structure IndexOrderByDistance and used it everywhere in the >> GiST and SP-GiST code instead of raw double distances. >> >> >> SP-GiST order-by-distance code can be further refactored so that user functions >> do not have to worry about SK_ISNULL checks. > It doesn't seem SP-GiST inner_consistent and leaf_consistent functions > can handle NULL scan keys for now. So should we let them handle NULL > orderby keys? If we assume that NULL arguments produce NULL > distances, we can evade changing the code of opclasses. I have moved handling of NULL ordering keys from opclasses to the common SP-GiST code, but really I don't like how it is implemented now. Maybe it's worth to move handling of NULL order-by keys to the even more higher level so, that AM don't have to worry about NULLs? Also I leaved usages of IndexOrderByDistance in opclasses. I think, that may help to minimize opclass changes in the future. > Also, I've noticed IndexOrderByDistance is missed in typedefs.list. Fixed. -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
On Fri, Sep 13, 2019 at 5:23 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > I have moved handling of NULL ordering keys from opclasses to the common > SP-GiST code, but really I don't like how it is implemented now. Maybe it's > worth to move handling of NULL order-by keys to the even more higher > level so, > that AM don't have to worry about NULLs? Yes, optimizer could remove away "col op NULL" clauses from ORDER BY if op is strict operator. And then such clauses wouldn't be passed to AM. But I see this as future improvement. For backpatching we should solve this at AM side. > Also I leaved usages of IndexOrderByDistance in opclasses. I think, that > may > help to minimize opclass changes in the future. Could you please extract this as a separate patch. We can consider this for master, but we shouldn't backpatch this. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 13.09.2019 20:17, Alexander Korotkov wrote:
On Fri, Sep 13, 2019 at 5:23 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:I have moved handling of NULL ordering keys from opclasses to the common SP-GiST code, but really I don't like how it is implemented now. Maybe it's worth to move handling of NULL order-by keys to the even more higher level so, that AM don't have to worry about NULLs?Yes, optimizer could remove away "col op NULL" clauses from ORDER BY if op is strict operator. And then such clauses wouldn't be passed to AM. But I see this as future improvement. For backpatching we should solve this at AM side.Also I leaved usages of IndexOrderByDistance in opclasses. I think, that may help to minimize opclass changes in the future.Could you please extract this as a separate patch. We can consider this for master, but we shouldn't backpatch this.
Propagation of IndexOrderByDistance in SP-GiST was extracted into a separate patch #2.--
Вложения
On Mon, Sep 16, 2019 at 3:47 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > On 13.09.2019 20:17, Alexander Korotkov wrote: > > On Fri, Sep 13, 2019 at 5:23 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > > I have moved handling of NULL ordering keys from opclasses to the common > SP-GiST code, but really I don't like how it is implemented now. Maybe it's > worth to move handling of NULL order-by keys to the even more higher > level so, > that AM don't have to worry about NULLs? > > Yes, optimizer could remove away "col op NULL" clauses from ORDER BY > if op is strict operator. And then such clauses wouldn't be passed to > AM. But I see this as future improvement. For backpatching we should > solve this at AM side. > > Also I leaved usages of IndexOrderByDistance in opclasses. I think, that > may help to minimize opclass changes in the future. > > Could you please extract this as a separate patch. We can consider > this for master, but we shouldn't backpatch this. > > Propagation of IndexOrderByDistance in SP-GiST was extracted into a separate > patch #2. First patch is minor editing from me and commit message is attached. I'm going to push it if no objections. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
On Mon, Sep 16, 2019 at 10:30 PM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Mon, Sep 16, 2019 at 3:47 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> On 13.09.2019 20:17, Alexander Korotkov wrote:
>
> On Fri, Sep 13, 2019 at 5:23 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>
> I have moved handling of NULL ordering keys from opclasses to the common
> SP-GiST code, but really I don't like how it is implemented now. Maybe it's
> worth to move handling of NULL order-by keys to the even more higher
> level so,
> that AM don't have to worry about NULLs?
>
> Yes, optimizer could remove away "col op NULL" clauses from ORDER BY
> if op is strict operator. And then such clauses wouldn't be passed to
> AM. But I see this as future improvement. For backpatching we should
> solve this at AM side.
>
> Also I leaved usages of IndexOrderByDistance in opclasses. I think, that
> may help to minimize opclass changes in the future.
>
> Could you please extract this as a separate patch. We can consider
> this for master, but we shouldn't backpatch this.
>
> Propagation of IndexOrderByDistance in SP-GiST was extracted into a separate
> patch #2.
First patch is minor editing from me and commit message is attached.
I'm going to push it if no objections.
Pushed.
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 19.09.2019 22:14, Alexander Korotkov wrote:
Pushed.
Attached patch fixes premature xs_orderbynulls[] assignment. The old value of NULL flag, not the new, should be checked before pfree()ing the old value.
Вложения
On 20.09.2019 0:15, Nikita Glukhov wrote:
On 19.09.2019 22:14, Alexander Korotkov wrote:Pushed.Attached patch fixes premature xs_orderbynulls[] assignment. The old value of NULL flag, not the new, should be checked before pfree()ing the old value.
Attached another one-line patch that fixes incorrect number of distances used in pairingheap_SpGistSearchItem_cmp():
- for (i = 0; i < so->numberOfOrderBys; i++) + for (i = 0; i < so->numberOfNonNullOrderBys; i++) This change was present in my original commit, but it seems to have been missed after the rebase. Sorry.
--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения
On Wed, Sep 25, 2019 at 1:22 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > Attached another one-line patch that fixes incorrect number of distances used > in pairingheap_SpGistSearchItem_cmp(): > > - for (i = 0; i < so->numberOfOrderBys; i++) > + for (i = 0; i < so->numberOfNonNullOrderBys; i++) > > > This change was present in my original commit, but it seems to have been > missed after the rebase. Sorry. Pushed, thanks. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company