Обсуждение: BUG #6283: About the behavior of indexscan in case there are some NULL values.
BUG #6283: About the behavior of indexscan in case there are some NULL values.
От
"Naoya Anzai"
Дата:
The following bug has been logged online:
Bug reference: 6283
Logged by: Naoya Anzai
Email address: anzai-naoya@mxu.nes.nec.co.jp
PostgreSQL version: 9.1.1
Operating system: RHEL5.5
Description: About the behavior of indexscan in case there are some
NULL values.
Details:
Hello,
In the newest PostgreSQL(9.1.1) or lower, Index-Scan's actual time may
increase unexpectedly.
I think that this is for scanning All NULL VALUES when performing an
indexscan
even if they does not need to be scanned.
I think that the cause is here.
[src/backend/access/nbtree/nbtutils.c(L963-L996) in PostgreSQL9.1.1]
--------
if (isNull)
{
if (key->sk_flags & SK_BT_NULLS_FIRST)
{
/*
* Since NULLs are sorted before non-NULLs, we know we have
* reached the lower limit of the range of values for this
* index attr. On a backward scan, we can stop if this qual
* is one of the "must match" subset. On a forward scan,
* however, we should keep going.
*/
if ((key->sk_flags & SK_BT_REQBKWD) &&
ScanDirectionIsBackward(dir))
*continuescan = false;
}
else
{
/*
* Since NULLs are sorted after non-NULLs, we know we have
* reached the upper limit of the range of values for this
* index attr. On a forward scan, we can stop if this qual is
* one of the "must match" subset. On a backward scan,
* however, we should keep going.
*/
if ((key->sk_flags & SK_BT_REQFWD) &&
ScanDirectionIsForward(dir))
*continuescan = false;
}
/*
* In any case, this indextuple doesn't match the qual.
*/
return false;
}
---------
For example, with NULLS_LAST, GREATER THAN scan key('value > scankey' etc.),
and FORWARD SCAN conditions,
even if scan have reached a NULL value, continuescan is still true all the
time.
If it rewrites as follows, I think that this problem is solved, but how is
it?
--------
--- nbtutils.c 2011-11-02 14:10:55.000000000 +0900
+++ nbtutils.c.new 2011-11-02 14:11:38.000000000 +0900
@@ -971,8 +971,7 @@
* is one of the "must match" subset. On a
forward scan,
* however, we should keep going.
*/
- if ((key->sk_flags & SK_BT_REQBKWD) &&
- ScanDirectionIsBackward(dir))
+ if (ScanDirectionIsBackward(dir))
*continuescan = false;
}
else
@@ -984,8 +983,7 @@
* one of the "must match" subset. On a
backward scan,
* however, we should keep going.
*/
- if ((key->sk_flags & SK_BT_REQFWD) &&
- ScanDirectionIsForward(dir))
+ if (ScanDirectionIsForward(dir))
*continuescan = false;
}
---------
Regards,
Naoya Anzai
>>
>> Hello,
>>
>> In the newest PostgreSQL(9.1.1) or lower, Index-Scan's actual time may
>> increase unexpectedly.
>>
>> I think that this is for scanning All NULL VALUES when performing an
>> indexscan
>> even if they does not need to be scanned.
>
>I think this was just fixed. Please check latest source code.
I have checked latest source code.
But, backward scan doesn't work correctly...
==========================================
[naoya@nesitcspg03 ~]$ psql
psql (9.2devel)
Type "help" for help.
naoya=# create table hoge(id integer,id2 integer);
CREATE TABLE
naoya=# insert into hoge select generate_series(1,10);
INSERT 0 10
naoya=# update hoge set id2=1 where id=5;
UPDATE 1
naoya=# update hoge set id2=10 where id=7;
UPDATE 1
naoya=# create index hoge_idx on hoge(id2);
CREATE INDEX
naoya=# analyze hoge;
ANALYZE
naoya=# set enable_bitmapscan to off;
SET
naoya=# set enable_seqscan to off;
SET
naoya=# select * from hoge;id | id2
----+----- 1 | 2 | 3 | 4 | 6 | 8 | 9 |10 | 5 | 1 7 | 10
(10 rows)
naoya=# explain analyze select * from hoge where id2>0; QUERY PLAN
---------------------------------------------------------------------------------------------------------------Index
Scanusing hoge_idx on hoge (cost=0.00..8.29 rows=2 width=8) (actual time=0.010..0.012 rows=2 loops=1) Index Cond:
(id2> 0)Total runtime: 0.065 ms
(3 rows)
naoya=# explain analyze select * from hoge where id2>0 order by id2 desc;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------Index
ScanBackward using hoge_idx on hoge (cost=0.00..8.29 rows=2 width=8) (actual time=0.005..0.005 rows=0 loops=1) Index
Cond:(id2 > 0)Total runtime: 0.035 ms
(3 rows)
naoya=# select * from hoge where id2>0;id | id2
----+----- 5 | 1 7 | 10
(2 rows)
naoya=# select * from hoge where id2>0 order by id2 desc;id | id2
----+-----
(0 rows)
==========================================
Regards.
---
Naoya Anzai
---
安西 直也 <anzai-naoya@mxu.nes.nec.co.jp> writes:
> I have checked latest source code.
> But, backward scan doesn't work correctly...
[ pokes at that... ] Hmm, the patches I applied a couple days ago
assumed that we are stepping forward or back from a place where the
WHERE clauses are satisfied. But in this example, the system just
applies _bt_endpoint to descend to the right-hand end of the index,
since there is no upper-bound qual with which to do anything different.
So we start from a place where the clauses aren't satisfied. That also
means that we haven't really fixed the original performance complaint:
there could be lots of nulls to be stepped over before we reach the
first matching row.
I think that the right fix for this is probably to make
_bt_preprocess_keys explicitly generate the "id is not null" qual that's
implied by "id > 0", so that it will have what amounts to a range
condition on the index contents (since for NULLS LAST, "id is not null"
amounts to "id is less than null", as it were). Then, instead of applying
_bt_endpoint, it will use the less-than key to descend the btree to the
last non-null entry, and we'll be good for both correctness and
performance.
I don't see any big problem in doing this in HEAD, but it's getting past
what seems like a sane back-patch. So probably we should revert the
back-branch versions of the prior patch, and just say that the
performance problem is only going to be addressed in HEAD.
regards, tom lane