Обсуждение: ScanKey representation for RowCompare index conditions

Поиск
Список
Период
Сортировка

ScanKey representation for RowCompare index conditions

От
Tom Lane
Дата:
There's one nontrivial decision still to make about how to implement
proper per-spec row-comparison operations, namely: how a row
comparison ought to be represented in the index access method API.
The current representation of index conditions in the AM API is an
array of ScanKey structs, one per "indexcol op constant" index
condition, with implicit ANDing across all the conditions.  There
are various bits of code that require the ScanKeys to appear in order
by index column, though this isn't inherent in the data structure
itself.

Short of a fundamental redesign of the data structure, I can see two
plausible approaches to adding row-wise comparisons:

A. Include all the elements of the row comparison as separate entries
in the ScanKey array, and mark them (probably via sk_flags bits) to
show that they form a row condition rather than independent tests.
Unless we want to add more fields to ScanKey, we'd have to rely on
the order of the ScanKey entries to show the relationship of the
conditions (ie, which condition is part of which row comparison,
and what its column position is within the row).

B. Place a single entry for the row comparison in the main ScanKey
array, with a special sk_func pointer pointing to a function that
does a row-wise comparison.  The sk_argument field would point to
a subsidiary ScanKey array containing the actual index columns and
data values for the row elements.  sk_attno would reference the first
(leftmost) index column used by the row comparison.  We'd still want
an sk_flags bit to indicate that this is a row comparison, probably.

I'm currently leaning to plan B, on the grounds that:

1. It would require no changes in _bt_checkkeys(), which is the only
user of the data structure that is particularly performance-critical.
With plan A we'd be adding at least a few cycles to _bt_checkkeys in
all cases.  Plan B avoids that at the cost of an extra level of function
call to do a row comparison.  Given the relative frequency of the two
sorts of index conditions, this has to be the better tradeoff to make.

2. There's quite a bit of logic in btree indexscan setup that would find
it convenient to treat a row comparison as if it were a single condition
on just its leftmost column.  This may end up being nearly a wash, but
I think that plan A would make the setup code a shade more complex than
plan B would.  In particular, the rules about valid orderings of the
ScanKey entries would be complicated under plan A, whereas under plan
B it's still clear where everything belongs.

Any thoughts?  Anyone see a good plan C, or a serious flaw that I'm
missing in either of these ideas?
        regards, tom lane


Re: ScanKey representation for RowCompare index

От
Simon Riggs
Дата:
On Sun, 2006-01-15 at 18:03 -0500, Tom Lane wrote:
> There's one nontrivial decision still to make about how to implement
> proper per-spec row-comparison operations, namely: how a row
> comparison ought to be represented in the index access method API.

I'm not sure I understand why. Surely a row wise comparison can take
advantage of indexes on some of its columns? When would it be
advantageous to create an index on the whole row anyway? Wouldn't the
key likely be excessively long? So why would we support this at all?

> I'm currently leaning to plan B, on the grounds that:
> 
> 1. It would require no changes in _bt_checkkeys(), which is the only
> user of the data structure that is particularly performance-critical.
> With plan A we'd be adding at least a few cycles to _bt_checkkeys in
> all cases.  Plan B avoids that at the cost of an extra level of function
> call to do a row comparison.  Given the relative frequency of the two
> sorts of index conditions, this has to be the better tradeoff to make.
> 
> 2. There's quite a bit of logic in btree indexscan setup that would find
> it convenient to treat a row comparison as if it were a single condition
> on just its leftmost column.  This may end up being nearly a wash, but
> I think that plan A would make the setup code a shade more complex than
> plan B would.  In particular, the rules about valid orderings of the
> ScanKey entries would be complicated under plan A, whereas under plan
> B it's still clear where everything belongs.
> 
> Any thoughts?  Anyone see a good plan C, or a serious flaw that I'm
> missing in either of these ideas?

That looks sound. 

Best Regards, Simon Riggs



Re: ScanKey representation for RowCompare index conditions

От
Martijn van Oosterhout
Дата:
On Sun, Jan 15, 2006 at 06:03:12PM -0500, Tom Lane wrote:
> There's one nontrivial decision still to make about how to implement
> proper per-spec row-comparison operations, namely: how a row
> comparison ought to be represented in the index access method API.
> The current representation of index conditions in the AM API is an
> array of ScanKey structs, one per "indexcol op constant" index
> condition, with implicit ANDing across all the conditions.  There
> are various bits of code that require the ScanKeys to appear in order
> by index column, though this isn't inherent in the data structure
> itself.

ISTM that row-wise comparisons, as far as indexes are concerned are
actually simpler than normal scan-keys. For example, if you have the
condition (a,b) >= (5,1) then once the index has found that point,
every subsequent entry in the index matches (except possibly NULLs). So
you really don't actually need a row-comparison at any point after
than.

Now, if you have bracketing conditions like (a,b) BETWEEN (5,1) and
(7,0) it gets more interesting. Ideally you'd want to pass both of
these to the index so it knows when to stop. This really suggests using
your plan B since you then have the possibility of passing both, which
you don't really have with plan A. The latter also allows you to give
more auxilliary conditions like b>4.

So it seems like a good idea. No plan C I can think of...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: ScanKey representation for RowCompare index conditions

От
Tom Lane
Дата:
Simon Riggs <simon@2ndquadrant.com> writes:
> On Sun, 2006-01-15 at 18:03 -0500, Tom Lane wrote:
>> There's one nontrivial decision still to make about how to implement
>> proper per-spec row-comparison operations, namely: how a row
>> comparison ought to be represented in the index access method API.

> I'm not sure I understand why. Surely a row wise comparison can take
> advantage of indexes on some of its columns?

Not until we write the code to let it do so.  Currently, what the system
can match to an index on (a,b) is conditions like WHERE a >= x AND b >= y,
which is quite a different animal both semantically and representationally
from WHERE (a,b) >= (x,y).  (At least, assuming that one imputes the
correct SQL-spec meaning to the latter, viz a > x OR (a = x AND b >= y).)
Because of the semantic difference, we have to preserve the difference
when passing the condition to the index AM, thus the need for my
question.

> So why would we support this at all?

Example applications here and here:
http://archives.postgresql.org/pgsql-performance/2004-07/msg00188.php
http://archives.postgresql.org/pgsql-performance/2005-12/msg00590.php
(which demonstrate BTW the reasons for not trying to handle this by just
expanding out the OR/AND equivalents).
        regards, tom lane


Re: ScanKey representation for RowCompare index conditions

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> ISTM that row-wise comparisons, as far as indexes are concerned are
> actually simpler than normal scan-keys. For example, if you have the
> condition (a,b) >= (5,1) then once the index has found that point,
> every subsequent entry in the index matches (except possibly NULLs). So
> you really don't actually need a row-comparison at any point after
> than.

Well, yeah you do, eg if the caller starts demanding backward fetch;
you need to be able to tell where to stop.  In any case it's certainly
necessary to pass down the whole row-condition into the index AM;
without that, the closest you could get would be an indexscan on a >= 5
which might have to scan an undesirably large number of rows before
reaching the first one with b >= 1.

> Now, if you have bracketing conditions like (a,b) BETWEEN (5,1) and
> (7,0) it gets more interesting. Ideally you'd want to pass both of
> these to the index so it knows when to stop. This really suggests using
> your plan B since you then have the possibility of passing both, which
> you don't really have with plan A. The latter also allows you to give
> more auxilliary conditions like b>4.

No, you misunderstood my plan A --- it's not giving up any of these
options.  But it's paying for it in complexity.  I was envisioning a
case like the above as being represented this way:
sk_flags    sk_attno    sk_func        sk_datum
ROW_START    a        >=        5ROW_END        b        >=        1ROW_START    a        <=        7ROW_END        b
    <=        0normal        b        >        4
 

Hence my comments about weird rules for the ordering of entries under
plan A --- the normal and ROW_START entries would be in order by attno,
but the row continuation/end entries wouldn't appear to follow this global
ordering.  They'd follow a local order within each row condition,
instead.

Since you didn't understand what I was saying, I suspect that plan A is
too confusing ...
        regards, tom lane


Re: ScanKey representation for RowCompare index conditions

От
Martijn van Oosterhout
Дата:
On Mon, Jan 16, 2006 at 12:07:44PM -0500, Tom Lane wrote:
> Since you didn't understand what I was saying, I suspect that plan A is
> too confusing ...

Umm, yeah. Now you've explained it I think it should be excluded on the
basis that it'll be a source of bugs. For all the places that matter a
row-condition needs to be treated as a whole and storing parts in a
top level ScanKey array just seems like asking for trouble.

Given no plan C, I guess plan B is the way to go...?
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: ScanKey representation for RowCompare index conditions

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Mon, Jan 16, 2006 at 12:07:44PM -0500, Tom Lane wrote:
>> Since you didn't understand what I was saying, I suspect that plan A is
>> too confusing ...

> Umm, yeah. Now you've explained it I think it should be excluded on the
> basis that it'll be a source of bugs. For all the places that matter a
> row-condition needs to be treated as a whole and storing parts in a
> top level ScanKey array just seems like asking for trouble.

Actually, I'd just been coming back to plan A, after realizing that it's
not possible to do this with zero change in _bt_checkkeys() after all.
The amount of information passed down to an ordinary sk_func isn't
sufficient to do what the row-compare subroutine would need to do:
it needs access to the whole indextuple, not only the first column's
value, plus access to the direction and continuescan variables.  We
could add extra parameters to what gets passed, but that's certainly
no cheaper than adding a test on some sk_flags bits.  And having the
row comparison logic inside bt_checkkeys is probably more understandable
than having a separate subroutine with strange calling conventions.

I think given adequate documentation in skey.h, either representation
is about the same as far as the source-of-bugs argument goes.  The main
risk you get from a Plan-B approach is that it's easier to fail to
notice bugs of omission, where a piece of code looking at a ScanKey
needs to do something special with a row-compare key and omits to do so.
At least that's what I'm guessing at the moment --- I haven't actually
tried to code up any of the changes yet.  In any case, since we are only
intending to support this for searches of btree indexes, there are not
that many pieces of code that will be concerned with it.  I think it's
just _bt_first, _bt_preprocess_keys, _bt_checkkeys, and
ExecIndexBuildScanKeys.  Everyplace else will be dealing only in
"simple" ScanKeys with no row comparisons.
        regards, tom lane