Обсуждение: Sort support for macaddr8

Поиск

Список

Период

Сортировка

Sort support for macaddr8

От

Melanie Plageman

Дата:

03 июня 2019 г., 22:23:33

Peter and I implemented this small (attached) patch to extend
abbreviated key compare sort to macaddr8 datatype (currently supported
for macaddr).

I tried checking to see if there is a performance difference using the
attached DDL based on src/test/regress/sql/macaddr8.sql. I found
that the sort function is only exercised when creating an index (not,
for example, when doing some type of aggregation).

With the patch applied to current master and using the DDL attached,
the timing for creating the index hovered around 20 ms for master and
15 ms for the patched version.

Machine and version specs: PostgreSQL 12beta1 on x86_64-pc-linux-gnu
compiled by gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, 64-bit

I think that that seems like an improvement. I was thinking of

registering this patch for the next commitfest. Is that okay?

I was just wondering what the accepted way to test and share
performance numbers is for a small patch like this. Is sharing DDL
enough? Do I need to use pg_bench?

--

Melanie Plageman

Вложения

Re: Sort support for macaddr8

От

Chapman Flack

Дата:

04 июня 2019 г., 00:03:16

On 6/3/19 3:23 PM, Melanie Plageman wrote:
> Peter and I implemented this small (attached) patch to extend
> abbreviated key compare sort to macaddr8 datatype (currently supported
> for macaddr).

Am I going cross-eyed, or would the memset be serving more of a purpose
if it were in the SIZEOF_DATUM != 8 branch?

Regards,
-Chap

Re: Sort support for macaddr8

От

Peter Geoghegan

Дата:

04 июня 2019 г., 00:39:30

On Mon, Jun 3, 2019 at 1:17 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> I tried checking to see if there is a performance difference using the
> attached DDL based on src/test/regress/sql/macaddr8.sql. I found
> that the sort function is only exercised when creating an index (not,
> for example, when doing some type of aggregation).

As you know, it's a bit weird that we're proposing adding sort support
with abbreviated keys for a type that is 8 bytes, since you'd expect
it to also be pass-by-value on most platforms, which largely defeats
the purpose of having abbreviated keys (though sort support could
still make sense, for the same reason it makes sense to have it for
int8). However, macaddr8 isn't actually pass-by-value, and it seems
too late to do anything about that now, so abbreviated keys actually
make sense.

> With the patch applied to current master and using the DDL attached,
> the timing for creating the index hovered around 20 ms for master and
> 15 ms for the patched version.

I would expect a sufficiently large sort to execute in about half the
time with abbreviation, based on previous experience. However, I think
that this patch can be justified in a relatively straightforward way.
It extends sort support for macaddr to macaddr8, since these two types
are almost identical in every other way. We should still validate the
performance out of an abundance of caution, but I would be very
surprised if there was much difference between the macaddr and
macaddr8 cases.

In short, users should not be surprised by the big gap in performance
between macaddr and macaddr8. It's worth being consistent there.

> I think that that seems like an improvement. I was thinking of
> registering this patch for the next commitfest. Is that okay?

Definitely, yes.

> I was just wondering what the accepted way to test and share
> performance numbers is for a small patch like this. Is sharing DDL
> enough? Do I need to use pg_bench?

I've always used custom microbenchmarks for stuff like this.
Repeatedly executing a particular query and taking the median
execution time as representative seems to be the best approach.

--
Peter Geoghegan

Re: Sort support for macaddr8

От

Peter Geoghegan

Дата:

04 июня 2019 г., 00:48:50

On Mon, Jun 3, 2019 at 2:03 PM Chapman Flack <chap@anastigmatix.net> wrote:
> Am I going cross-eyed, or would the memset be serving more of a purpose
> if it were in the SIZEOF_DATUM != 8 branch?

No, it wouldn't -- that's the correct place for it with the macaddr
type. However, it isn't actually necessary to memset() at the
equivalent point for macaddr8, since we cannot "run out of bytes from
the authoritative representation" that go in the Datum/abbreviated
key. I suppose that the memset() should simply be removed, since it is
superfluous here.

-- 
Peter Geoghegan

Re: Sort support for macaddr8

От

Chapman Flack

Дата:

04 июня 2019 г., 00:59:13

On 6/3/19 5:03 PM, Chapman Flack wrote:
> On 6/3/19 3:23 PM, Melanie Plageman wrote:
>> Peter and I implemented this small (attached) patch to extend
>> abbreviated key compare sort to macaddr8 datatype (currently supported
>> for macaddr).
> 
> Am I going cross-eyed, or would the memset be serving more of a purpose
> if it were in the SIZEOF_DATUM != 8 branch?

It looks like a copy-pasto coming from mac.c, where the size of
the thing to be compared isn't itself 8 bytes.

With sizeof(macaddr) being 6, that original code may have had
these cases in mind:

- SIZEOF_DATUM is something smaller than 6 (likely 4). The whole key
  doesn't fit, but that's ok, because abbreviated "equality" just means
  to recheck with the authoritative routine.
- SIZEOF_DATUM is exactly 6. Probably not a thing.
- SIZEOF_DATUM is anything larger than 6 (likely 8). Needs the memset.
  Also, in this case, abbreviated "equality" could be taken as true
  equality, never needing the authoritative fallback.

For macaddr8, the cases morph into these:

- SIZEOF_DATUM is something smaller than 8 (likely 4). Ok; it's
  just an abbreviation.
- SIZEOF_DATUM is exactly 8. Now an actual thing, even likely.
- SIZEOF_DATUM is larger than 8. Our flying cars run postgres, and
  we need the memset to make sure they don't crash.

This leaves me with a couple of questions:

1. (This one seems like a bug.) In the little-endian case, if
   SIZEOF_DATUM is smaller than the type, I'm not convinced by doing
   the DatumBigEndianToNative() after the memcpy(). Seems like that's
   too late to make sure the most-significant bytes got copied.

2. (This one seems like an API opportunity.) If it becomes common to
   add abbreviation support for smallish types such that (as here,
   when SIZEOF_DATUM >= 8), an abbreviated "equality" result is in fact
   authoritative, would it be worthwhile to have some way for the sort
   support routine to announce that fact to the caller? That could
   spare the caller the effort of re-checking with the authoritative
   routine. It could also (by making the equality case less costly)
   end up changing the weight assigned to the cardinality estimate in
   deciding whether to abbrev..

Regards,
-Chap

Re: Sort support for macaddr8

От

Chapman Flack

Дата:

04 июня 2019 г., 01:04:29

On 6/3/19 5:59 PM, Chapman Flack wrote:
> On 6/3/19 5:03 PM, Chapman Flack wrote:
> 1. (This one seems like a bug.) In the little-endian case, if
>    SIZEOF_DATUM is smaller than the type, I'm not convinced by doing
>    the DatumBigEndianToNative() after the memcpy(). Seems like that's
>    too late to make sure the most-significant bytes got copied.

Wait, I definitely was cross-eyed for that one. It's the abbreviated
copy whose endianness varies. Never mind.

Regards,
-Chap

Re: Sort support for macaddr8

От

Peter Geoghegan

Дата:

04 июня 2019 г., 01:05:55

On Mon, Jun 3, 2019 at 2:59 PM Chapman Flack <chap@anastigmatix.net> wrote:
> 1. (This one seems like a bug.) In the little-endian case, if
>    SIZEOF_DATUM is smaller than the type, I'm not convinced by doing
>    the DatumBigEndianToNative() after the memcpy(). Seems like that's
>    too late to make sure the most-significant bytes got copied.

Uh, when else would you do it? Before the memcpy()?

> 2. (This one seems like an API opportunity.) If it becomes common to
>    add abbreviation support for smallish types such that (as here,
>    when SIZEOF_DATUM >= 8), an abbreviated "equality" result is in fact
>    authoritative, would it be worthwhile to have some way for the sort
>    support routine to announce that fact to the caller? That could
>    spare the caller the effort of re-checking with the authoritative
>    routine.

It's possible that that would make sense, but I don't think that this
patch needs to do that. There is at least one pre-existing cases that
does this -- macaddr.

-- 
Peter Geoghegan

Re: Sort support for macaddr8

От

Stephen Frost

Дата:

04 июня 2019 г., 20:49:23

Greetings,

* Melanie Plageman (melanieplageman@gmail.com) wrote:
> Peter and I implemented this small (attached) patch to extend
> abbreviated key compare sort to macaddr8 datatype (currently supported
> for macaddr).

Nice.

> I think that that seems like an improvement. I was thinking of
> registering this patch for the next commitfest. Is that okay?

Sure.

> I was just wondering what the accepted way to test and share
> performance numbers is for a small patch like this. Is sharing DDL
> enough? Do I need to use pg_bench?

Detailed (enough...  doesn't need to include timing of every indivudal
query, but something like the average timing across 5 runs or similar
would be good) results posted to this list, with enough information
about how to reproduce the tests, would be the way to go.

The idea is to let others also test and make sure that they come up with
similar results to yours, and if they don't, ideally have enough
information to narrow down what's happening / what's different.

Thanks!

Stephen

Вложения

signature.asc

Re: Sort support for macaddr8

От

Melanie Plageman

Дата:

04 июня 2019 г., 21:33:18

On Mon, Jun 3, 2019 at 2:39 PM Peter Geoghegan <pg@bowt.ie> wrote:

As you know, it's a bit weird that we're proposing adding sort support
with abbreviated keys for a type that is 8 bytes, since you'd expect
it to also be pass-by-value on most platforms, which largely defeats
the purpose of having abbreviated keys (though sort support could
still make sense, for the same reason it makes sense to have it for
int8). However, macaddr8 isn't actually pass-by-value, and it seems
too late to do anything about that now, so abbreviated keys actually
make sense.

so, if what you say is true and it is either not worth it or
potentially a breaking change to make macaddr8 pass-by-value and
adding abbreviated sort support has the potential to avoid "pointer
chasing" and guarantee equivalent relative performance for macaddr8
and macaddr, then that seems worth it.

With regard to macaddr8_abbrev_convert() and memset(), I attached a patch
with the memset() removed, since it is superfluous here.

macaddr8_cmp_internal() already existed before this patch and I noticed
that it explicitly returns int32 whereas the return type of
macaddr_cmp_internal() is just specified as an int. I was wondering why.

I also noticed that the prototype for macaddr8_cmp_internal() was not
at the top of the file with the other static function prototypes. I
added it there, but I wasn't sure if there was some reason that it was
like that to begin with.

--

Melanie Plageman

Вложения

v2-0001-Implement-SortSupport-for-macaddr8-data-type.patch

Re: Sort support for macaddr8

От

Tomas Vondra

Дата:

05 июня 2019 г., 00:30:03

On Tue, Jun 04, 2019 at 01:49:23PM -0400, Stephen Frost wrote:
>Greetings,
>
>* Melanie Plageman (melanieplageman@gmail.com) wrote:
>> Peter and I implemented this small (attached) patch to extend
>> abbreviated key compare sort to macaddr8 datatype (currently supported
>> for macaddr).
>
>Nice.
>
>> I think that that seems like an improvement. I was thinking of
>> registering this patch for the next commitfest. Is that okay?
>
>Sure.
>
>> I was just wondering what the accepted way to test and share
>> performance numbers is for a small patch like this. Is sharing DDL
>> enough? Do I need to use pg_bench?
>
>Detailed (enough...  doesn't need to include timing of every indivudal
>query, but something like the average timing across 5 runs or similar
>would be good) results posted to this list, with enough information
>about how to reproduce the tests, would be the way to go.
>
>The idea is to let others also test and make sure that they come up with
>similar results to yours, and if they don't, ideally have enough
>information to narrow down what's happening / what's different.
>

Yeah, there's no "approved way" to do performance tests the contributors
would have to follow. That applies both to tooling and how detailed the
data need/should be. Ultimately, the goal is to convince others (and
yourself) that the change is an improvement. Does a simple pgbench
script achieve that? Cool, use that. Do you need something more complex?
Sure, do a shell script or something like that.

As long as others can reasonably reproduce your tests, it's fine.

For me, the most critical part of benchmarking a change is deciding what
to test - which queries, data sets, what amounts of data, config, etc.

For example, the data set you used has ~12k rows. Does the behavior
change with 10x or 100x that? It probably does not make sense to go
above available RAM (the I/O costs are likely to make everything else
mostly irrelevant), but CPU caches may matter a lot. Different work_mem
(and maintenance_work_mem) values may be useful too.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Sort support for macaddr8

От

Alvaro Herrera

Дата:

05 июня 2019 г., 00:37:35

On 2019-Jun-03, Peter Geoghegan wrote:

> As you know, it's a bit weird that we're proposing adding sort support
> with abbreviated keys for a type that is 8 bytes, since you'd expect
> it to also be pass-by-value on most platforms, which largely defeats
> the purpose of having abbreviated keys (though sort support could
> still make sense, for the same reason it makes sense to have it for
> int8). However, macaddr8 isn't actually pass-by-value, and it seems
> too late to do anything about that now, so abbreviated keys actually
> make sense.

I'm not sure I understand why you say it's too late to change now.
Surely the on-disk representation doesn't actually change, so it is not
impossible to change?  And you make it sound like doing that change is
worthwhile, performance-wise.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Sort support for macaddr8

От

Andres Freund

Дата:

05 июня 2019 г., 00:55:16

Hi,

On 2019-06-04 17:37:35 -0400, Alvaro Herrera wrote:
> On 2019-Jun-03, Peter Geoghegan wrote:
> > As you know, it's a bit weird that we're proposing adding sort support
> > with abbreviated keys for a type that is 8 bytes, since you'd expect
> > it to also be pass-by-value on most platforms, which largely defeats
> > the purpose of having abbreviated keys (though sort support could
> > still make sense, for the same reason it makes sense to have it for
> > int8). However, macaddr8 isn't actually pass-by-value, and it seems
> > too late to do anything about that now, so abbreviated keys actually
> > make sense.
> 
> I'm not sure I understand why you say it's too late to change now.
> Surely the on-disk representation doesn't actually change, so it is not
> impossible to change?  And you make it sound like doing that change is
> worthwhile, performance-wise.

Yea, I don't immediately see a problem with doing that on a major
version boundary. Obviously that'd only be possible for sizeof(Datum) ==
8 == sizeof(macaddr8) platforms, but that's the vast majority these
days. And generally, I think it's just about *always* worth to go for a
pass-by-value for the cases where that doesn't imply space wastage.

I think it might be worthwhile to optimize things so that all typlen > 0
&& typlen <= sizeof(Datum) are allowed for byval datums.

SELECT typname, typlen FROM pg_type WHERE typlen > 0 AND typlen <= 8 AND NOT typbyval;
┌──────────┬────────┐
│ typname  │ typlen │
├──────────┼────────┤
│ tid      │      6 │
│ macaddr  │      6 │
│ macaddr8 │      8 │
└──────────┴────────┘
(3 rows)

Seems like adding byval support for sizes outside of 1/2/4/8 bytes would
be worthwhile for tid alone. Not sure whether there's extensions with
signifcant use that have fixed-width types <= 8 bytes.

Greetings,

Andres Freund

Re: Sort support for macaddr8

От

Andres Freund

Дата:

05 июня 2019 г., 00:59:52

Hi,

On 2019-06-04 14:55:16 -0700, Andres Freund wrote:
> On 2019-06-04 17:37:35 -0400, Alvaro Herrera wrote:
> I think it might be worthwhile to optimize things so that all typlen > 0
> && typlen <= sizeof(Datum) are allowed for byval datums.

Maybe even just deprecate specifying byval at CREATE TYPE time, and
instead automatically infer it from the type length. We've had a number
of blunders around this, and I can't really see any reason for
specifying byval = false when we internally could treat it as a byval.

Greetings,

Andres Freund

Re: Sort support for macaddr8

От

Peter Geoghegan

Дата:

05 июня 2019 г., 01:10:07

On Tue, Jun 4, 2019 at 2:55 PM Andres Freund <andres@anarazel.de> wrote:
> Yea, I don't immediately see a problem with doing that on a major
> version boundary. Obviously that'd only be possible for sizeof(Datum) ==
> 8 == sizeof(macaddr8) platforms, but that's the vast majority these
> days. And generally, I think it's just about *always* worth to go for a
> pass-by-value for the cases where that doesn't imply space wastage.

It would be faster to do it that way, I think. You would need a more
complicated comparator than a classic abbreviated comparator (i.e. a
3-way unsigned int comparator) that way, but it would very probably be
faster on balance.

I'm glad to hear that it isn't *obviously* a problem from a
compatibility perspective -- I really wasn't sure about that, since
retrofitting a type to be pass-by-value like this is something that
may never have been attempted before now (at least not since we
started to care about pg_upgrade).

> I think it might be worthwhile to optimize things so that all typlen > 0
> && typlen <= sizeof(Datum) are allowed for byval datums.
>
> SELECT typname, typlen FROM pg_type WHERE typlen > 0 AND typlen <= 8 AND NOT typbyval;
> ┌──────────┬────────┐
> │ typname  │ typlen │
> ├──────────┼────────┤
> │ tid      │      6 │
> │ macaddr  │      6 │
> │ macaddr8 │      8 │
> └──────────┴────────┘
> (3 rows)

This is half the reason why I ended up implementing itemptr_encode()
to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
years back -- TID is 6 bytes wide, which doesn't have the necessary
macro support within postgres.h. There is no reason why that couldn't
be added for the benefit of both TID and macaddr types, though it
probably wouldn't be worth it. And, as long as we're not going to
those lengths, there may be some value in keeping the macaddr8 code in
line with macaddr code -- the two types are currently almost the same
(the glaring difference is the lack of macaddr8 sort support).

We'll need to draw the line somewhere, and that is likely to be a bit
arbitrary. This was what I meant by "weird".

--
Peter Geoghegan

Re: Sort support for macaddr8

От

Andres Freund

Дата:

05 июня 2019 г., 01:23:29

Hi,

On 2019-06-04 15:10:07 -0700, Peter Geoghegan wrote:
> On Tue, Jun 4, 2019 at 2:55 PM Andres Freund <andres@anarazel.de> wrote:
> > Yea, I don't immediately see a problem with doing that on a major
> > version boundary. Obviously that'd only be possible for sizeof(Datum) ==
> > 8 == sizeof(macaddr8) platforms, but that's the vast majority these
> > days. And generally, I think it's just about *always* worth to go for a
> > pass-by-value for the cases where that doesn't imply space wastage.
> 
> It would be faster to do it that way, I think. You would need a more
> complicated comparator than a classic abbreviated comparator (i.e. a
> 3-way unsigned int comparator) that way, but it would very probably be
> faster on balance.

I'd be surprised if it weren't.


> I'm glad to hear that it isn't *obviously* a problem from a
> compatibility perspective -- I really wasn't sure about that, since
> retrofitting a type to be pass-by-value like this is something that
> may never have been attempted before now (at least not since we
> started to care about pg_upgrade).

Obviously we have to test it, but I don't really see any compat
problems. Both have the same size on disk, after all. We couldn't make
such a change in a minor version, as DatumGetMacaddr*,
DatumGetItemPointer obviously need to change, but it ought to otherwise
be transparent.  It would, I think, be different if we still supported
v0 calling conventions, but we don't...


> This is half the reason why I ended up implementing itemptr_encode()
> to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
> years back -- TID is 6 bytes wide, which doesn't have the necessary
> macro support within postgres.h. There is no reason why that couldn't
> be added for the benefit of both TID and macaddr types, though it
> probably wouldn't be worth it.

I think we should definitely do that. It seems not unlikely that other
people want to write new fixed width types, and we shouldn't have them
deal with all this complexity unnecessarily.

Greetings,

Andres Freund

Re: Sort support for macaddr8

От

Peter Geoghegan

Дата:

05 июня 2019 г., 01:49:59

On Tue, Jun 4, 2019 at 3:23 PM Andres Freund <andres@anarazel.de> wrote:
> > This is half the reason why I ended up implementing itemptr_encode()
> > to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
> > years back -- TID is 6 bytes wide, which doesn't have the necessary
> > macro support within postgres.h. There is no reason why that couldn't
> > be added for the benefit of both TID and macaddr types, though it
> > probably wouldn't be worth it.
>
> I think we should definitely do that. It seems not unlikely that other
> people want to write new fixed width types, and we shouldn't have them
> deal with all this complexity unnecessarily.

On second thought, maybe there is something to be said for being
exhaustive here.

It seems like there is a preference for making macaddr8 pass-by-value
instead of adding abbreviated keys support to macaddr8, and possibly
doing the same with the original macaddr type.

Do you think that you'll be able to work on the project with this
expanded scope, Melanie?

-- 
Peter Geoghegan

Re: Sort support for macaddr8

От

Melanie Plageman

Дата:

05 июня 2019 г., 19:18:34

On Tue, Jun 4, 2019 at 3:50 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Jun 4, 2019 at 3:23 PM Andres Freund <andres@anarazel.de> wrote:
> > This is half the reason why I ended up implementing itemptr_encode()
> > to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
> > years back -- TID is 6 bytes wide, which doesn't have the necessary
> > macro support within postgres.h. There is no reason why that couldn't
> > be added for the benefit of both TID and macaddr types, though it
> > probably wouldn't be worth it.
>
> I think we should definitely do that. It seems not unlikely that other
> people want to write new fixed width types, and we shouldn't have them
> deal with all this complexity unnecessarily.

On second thought, maybe there is something to be said for being
exhaustive here.

It seems like there is a preference for making macaddr8 pass-by-value
instead of adding abbreviated keys support to macaddr8, and possibly
doing the same with the original macaddr type.

Do you think that you'll be able to work on the project with this
expanded scope, Melanie?

I can take on making macaddr8 pass-by-value
I tinkered a bit last night and got in/out mostly working (I think).
I'm not sure about macaddr, TID, and user-defined types.

--

Melanie Plageman

Re: Sort support for macaddr8

От

Alvaro Herrera

Дата:

05 июня 2019 г., 20:41:46

On 2019-Jun-05, Melanie Plageman wrote:

> I can take on making macaddr8 pass-by-value
> I tinkered a bit last night and got in/out mostly working (I think).
> I'm not sure about macaddr, TID, and user-defined types.

Yeah, let's see what macaddr8 looks like, and we can move from there --
I suppose adapting for macaddr would not be terribly different, but we
don't have to do both in a single commit.  I don't expect that TID would
necessarily be similar since we have lots of bespoke code for that in
lots of places; it might not affect anything (it should not!) but then
it might.  No reason not to move forward incrementally.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Sort support for macaddr8

От

Andres Freund

Дата:

05 июня 2019 г., 21:55:35

Hi,

On 2019-06-05 09:18:34 -0700, Melanie Plageman wrote:
> On Tue, Jun 4, 2019 at 3:50 PM Peter Geoghegan <pg@bowt.ie> wrote:
> 
> > On Tue, Jun 4, 2019 at 3:23 PM Andres Freund <andres@anarazel.de> wrote:
> > > > This is half the reason why I ended up implementing itemptr_encode()
> > > > to accelerate the TID sort used by CREATE INDEX CONCURRENTLY some
> > > > years back -- TID is 6 bytes wide, which doesn't have the necessary
> > > > macro support within postgres.h. There is no reason why that couldn't
> > > > be added for the benefit of both TID and macaddr types, though it
> > > > probably wouldn't be worth it.
> > >
> > > I think we should definitely do that. It seems not unlikely that other
> > > people want to write new fixed width types, and we shouldn't have them
> > > deal with all this complexity unnecessarily.
> >
> > On second thought, maybe there is something to be said for being
> > exhaustive here.
> >
> > It seems like there is a preference for making macaddr8 pass-by-value
> > instead of adding abbreviated keys support to macaddr8, and possibly
> > doing the same with the original macaddr type.
> >
> > Do you think that you'll be able to work on the project with this
> > expanded scope, Melanie?
> >
> >
> I can take on making macaddr8 pass-by-value
> I tinkered a bit last night and got in/out mostly working (I think).
> I'm not sure about macaddr, TID, and user-defined types.

I'd much rather see this tackled in a general way than fiddling with
individual datatypes. I think we should:

1) make fetch_att(), store_att_byval() etc support datums of any length
   between 1 and <= sizeof(Datum). Probably also convert them to inline
   functions. There's a few more functions to be adjusted, but not many,
   I think.

2) Remove ability to pass PASSEDBYVALUE to CREATE TYPE, but instead
   compute whether attyval is possible, solely based on INTERNALLENGTH
   (when INTERNALLENGTH > 0 obviously).

3) Fix the fallout, by fixing a few of the Datum<->type conversion
   functions affected by this change. That'll require a bit of work, but
   not too much. We should write those conversion routines in a way
   that'll keep them working for the scenarios where the type is
   actually passable by value, and not (required for > 4 byte datums).

Greetings,

Andres Freund

Re: Sort support for macaddr8

От

Alvaro Herrera

Дата:

05 июня 2019 г., 22:14:42

On 2019-Jun-05, Andres Freund wrote:

> I'd much rather see this tackled in a general way than fiddling with
> individual datatypes. I think we should:
> 
> 1) make fetch_att(), store_att_byval() etc support datums of any length
>    between 1 and <= sizeof(Datum). Probably also convert them to inline
>    functions. There's a few more functions to be adjusted, but not many,
>    I think.

Does this mean that datatypes that are >4 and <=8 bytes need to handle
both cases?  Is it possible for them to detect the current environment?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Sort support for macaddr8

От

Andres Freund

Дата:

05 июня 2019 г., 22:17:51

Hi,

On June 5, 2019 12:14:42 PM PDT, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>On 2019-Jun-05, Andres Freund wrote:
>
>> I'd much rather see this tackled in a general way than fiddling with
>> individual datatypes. I think we should:
>>
>> 1) make fetch_att(), store_att_byval() etc support datums of any
>length
>>    between 1 and <= sizeof(Datum). Probably also convert them to
>inline
>>    functions. There's a few more functions to be adjusted, but not
>many,
>>    I think.
>
>Does this mean that datatypes that are >4 and <=8 bytes need to handle
>both cases?  Is it possible for them to detect the current environment?

Well, the conversion macros need to know. You can look at float8 for an example of the difference - it's pretty
centralized.We should provide a few helper macros to abstract that away. 

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: Sort support for macaddr8

От

Tom Lane

Дата:

06 июня 2019 г., 20:39:50

Andres Freund <andres@anarazel.de> writes:
> On June 5, 2019 12:14:42 PM PDT, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> Does this mean that datatypes that are >4 and <=8 bytes need to handle
>> both cases?  Is it possible for them to detect the current environment?

> Well, the conversion macros need to know. You can look at float8 for an example of the difference - it's pretty
centralized.We should provide a few helper macros to abstract that away. 

FWIW, I disagree with Andres on this being a reasonable way to proceed.

The fact that we support both pass-by-value and pass-by-ref for float8
and int8 is because those are important data types that are worth taking
extra pains to optimize.  It's a very long way from there to insisting
that every datatype between 5 and 8 bytes long must get the same
treatment, and even further to Andres' apparent position that we should
force third-party types to do it whether they care about
micro-optimization or not.

And I *entirely* fail to get the point of adding such support for
datatypes of 5 or 7 bytes.  No such types exist, or are on the horizon
AFAIK.

Lastly, I don't think adding additional allowed widths of pass-by-value
types would be cost-free, because it would add cycles to the inner loops
of the tuple forming and deforming functions.  (No, I don't believe that
JIT makes that an ignorable concern.)

I'm not really sure that either macaddr or macaddr8 are used widely
enough to justify expending optimization effort on them.  But if they
are, let's just do that, not move the goal posts into the next stadium.

            regards, tom lane

Re: Sort support for macaddr8

От

Andres Freund

Дата:

06 июня 2019 г., 22:14:40

Hi,

On 2019-06-06 13:39:50 -0400, Tom Lane wrote:
> Lastly, I don't think adding additional allowed widths of pass-by-value
> types would be cost-free, because it would add cycles to the inner loops
> of the tuple forming and deforming functions.

I'm not sure I quite buy that.

I think that we have branches over a fixed number of lengths is largely
unnecessary. att_addlength_pointer() doesn't care - it just uses the
length. And I think we should just consider doing the same for
fetch_att(). E.g. by using memcpy().

That'd also have the advantage that we'd not be *forced* to rely
alignment of byval types. The only reason we actually need that is the
heaptuple-to-struct mapping for catalogs. Outside of that we don't have
pointers to individual byval tuples, and waste a fair bit of padding due
to that.

Additionally we'd get rid of needing separate versions for SIZEOF_DATUM
!= 8/not.

> (No, I don't believe that JIT makes that an ignorable concern.)

Obviously not.

Greetings,

Andres Freund