Re: [HACKERS] [PATCH] SortSupport for macaddr type

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: [HACKERS] [PATCH] SortSupport for macaddr type
Дата
Msg-id CAH2-WznDkX59FmxD_rRh3orzfk-5mG4AiByvsz4VvCQSyyeKjw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [PATCH] SortSupport for macaddr type  (Heikki Linnakangas <hlinnaka@iki.fi>)
Ответы Re: [HACKERS] [PATCH] SortSupport for macaddr type  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On Tue, Mar 14, 2017 at 3:25 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Looks straightforward at a quick read-through. I have just a couple of
> questions. How much of the performance gain comes from avoiding the
> FunctionCallInvoke overhead, by simply having SortSupport with a comparison
> function, and how much comes from the "abbreviation"?

I assume it's very similar to the existing performance characteristics
that UUID SortSupport has.

> Also, is it worth the cycles and code to have the hyperloglog estimator, and
> aborting the abbreviation if there are only a few distinct values. Or put
> another way, how much does the abbreviation slow things down in the worst
> case?

Postgres doesn't support passing 6 byte types by-value due to a lack
of support from macros in places like postgres.h. There is a
GET_4_BYTES(), and a GET_8_BYTES(), but no GET_6_BYTES(). It doesn't
really seem worth adding those, just like it didn't seem worth it back
when I pointed out that a TID sort is very slow in the context of
CREATE INDEX CONCURRENTLY. (We ended up coming up with an ad-hoc int8
encoding scheme there, which allowed us to use the int8 default
opclass rather than the TID default opclass, you'll recall.)

It's a bit odd that macaddr abbreviated keys are actually always
smaller (not including padding) than sizeof(Datum) on 64-bit machines,
but this guarantees that abbreviated comparisons will reliably
indicate what an authoritative comparison would indicate, since of
course all relevant information is right there in the abbreviated key.
This will be the only time where that's generally true with
abbreviated keys for any type. This seems fine to me, especially
because it lets us compare macaddr using simple 3-way unsigned int
comparisons, which isn't otherwise the case.

I doubt that you'll ever see a real need to abort abbreviation.
Arguably it shouldn't be considered, but maybe it's worth being
consistent. It might matter on 32-bit platforms, where there is only a
single NIC-specific byte to differentiate MAC addresses, but even that
seems like a stretch. With a type like macaddr, where there is no
major cost like strxfrm() to worry about when abbreviating, the only
cost of abbreviating is that tie-breakers will have an extra
index_getattr() or similar. If there was no abbreviation, or if it
aborted, then the index_getattr() can happen once per SortTuple,
up-front.

Nitpick: the patch should probably not refer to 32-bit or 64-bit
systems. Rather, it should refer to Datum size only.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: [HACKERS] free space map and visibility map
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] mat views stats