Re: B-Tree support function number 3 (strxfrm() optimization)

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: B-Tree support function number 3 (strxfrm() optimization)
Дата
Msg-id CAM3SWZQCDCnfWd3qzoO4QmY4G8oKHUqyrd26bBLa7FL2x-nTjg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: B-Tree support function number 3 (strxfrm() optimization)  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: B-Tree support function number 3 (strxfrm() optimization)  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Tue, Sep 9, 2014 at 2:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I like that I don't have to care about every combination, and can
>> treat abbreviation abortion as the special case with the extra step,
>> in line with how I think of the optimization conceptually. Does that
>> make sense?
>
> No.  comparetup_heap() is hip-deep in this optimization as it stands,
> and what I proposed - if done correctly - isn't going to make that
> significantly worse.  In fact, it really ought to make things better:
> you should be able to set things up so that ssup->comparator is always
> the test that should be applied first, regardless of whether we're
> aborted or not-aborted or not doing this in the first place; and then
> ssup->tiebreak_comparator, if not NULL, can be applied after that.

I'm not following here. Isn't that at least equally true of what I've
proposed? Sure, I'm checking "if (!state->abbrevAborted)" first, but
that's irrelevant to the non-abbreviated case. It has nothing to
abort. Also, AFAICT we cannot abort and still call ssup->comparator()
indifferently, since sorttuple.datum1 fields are perhaps abbreviated
keys in half of all cases (i.e. pre-abort tuples), and uninitialized
garbage the other half of the time (i.e. post-abort tuples).

Where is the heap_getattr() stuff supposed to happen for the first
attribute to get an authoritative comparison in the event of aborting
(if we're calling ssup->comparator() on datum1 indifferently)? We
decide that we're going to use abbreviated keys within datum1 fields
up-front. When we abort, we cannot use datum1 fields at all (which
doesn't appear to matter for text -- the datum1 optimization has
historically only benefited pass-by-value types).

I'd mentioned that I'd hacked together a patch that doesn't
necessitate a separate state (if only to save a very small amount of
memory), but it is definitely messier within comparetup_heap(). I'm
still tweaking it. FYI, it does this within comparetup_heap():

+       if (!sortKey->abbrev_comparator)
+       {
+               /*
+                * There are no abbreviated keys to begin with (i.e.
no opclass support
+                * exists).  Compare the leading sort key, assuming an
authoritative
+                * representation.
+                */
+               compare = ApplySortComparator(a->datum1, a->isnull1,
+  b->datum1, b->isnull1,
+  sortKey);
+               if (compare != 0)
+                       return compare;
+
+               sortKey++;
+               nkey = 1;
+       }
+       else if (!state->abbrevAborted && sortKey->abbrev_comparator)
+       {
+               /*
+                * Leading attribute has abbreviated key representation, and
+                * abbreviation was not aborted when copying.  Compare
the leading sort
+                * key using abbreviated representation.
+                */
+               compare = ApplySortAbbrevComparator(a->datum1, a->isnull1,
+                b->datum1, b->isnull1,
+                sortKey);
+               if (compare != 0)
+                       return compare;
+
+               /*
+                * Since abbreviated comparison returned 0, call
tie-breaker comparator
+                * using original, authoritative representation, which
may break tie
+                * when differences were not captured within
abbreviated representation
+                */
+               nkey = 0;
+       }
+       else
+       {
+               /*
+                * Started with abbreviated keys, but aborted during
conversion/tuple
+                * copying -- check each attribute from scratch.  It's
not safe to make
+                * any assumption about the state of individual datum1 fields.
+                */
+               nkey = 0;
+       }

Is doing all this worth the small saving in memory? Personally, I
don't think that it is, but I defer to you.
-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Incorrect initialization of sentPtr in walsender.c
Следующее
От: Kouhei Kaigai
Дата:
Сообщение: Re: [v9.5] Custom Plan API