Обсуждение: pgsql: Multirange datatypes
Multirange datatypes Multiranges are basically sorted arrays of non-overlapping ranges with set-theoretic operations defined over them. Since v14, each range type automatically gets a corresponding multirange datatype. There are both manual and automatic mechanisms for naming multirange types. Once can specify multirange type name using multirange_type_name attribute in CREATE TYPE. Otherwise, a multirange type name is generated automatically. If the range type name contains "range" then we change that to "multirange". Otherwise, we add "_multirange" to the end. Implementation of multiranges comes with a space-efficient internal representation format, which evades extra paddings and duplicated storage of oids. Altogether this format allows fetching a particular range by its index in O(n). Statistic gathering and selectivity estimation are implemented for multiranges. For this purpose, stored multirange is approximated as union range without gaps. This field will likely need improvements in the future. Catversion is bumped. Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0 Author: Paul Jungwirth, revised by me Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston Reviewed-by: Zhihong Yu, Alexander Korotkov Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/6df7a9698bb036610c1e8c6d375e1be38cb26d5f Modified Files -------------- doc/src/sgml/catalogs.sgml | 15 +- doc/src/sgml/datatype.sgml | 23 + doc/src/sgml/extend.sgml | 59 +- doc/src/sgml/func.sgml | 656 +++++- doc/src/sgml/rangetypes.sgml | 67 +- doc/src/sgml/ref/create_type.sgml | 21 + src/backend/catalog/pg_range.c | 10 +- src/backend/catalog/pg_type.c | 118 +- src/backend/commands/typecmds.c | 357 ++- src/backend/executor/functions.c | 3 +- src/backend/parser/parse_coerce.c | 350 ++- src/backend/utils/adt/Makefile | 2 + src/backend/utils/adt/multirangetypes.c | 2679 ++++++++++++++++++++++ src/backend/utils/adt/multirangetypes_selfuncs.c | 1320 +++++++++++ src/backend/utils/adt/pg_upgrade_support.c | 22 + src/backend/utils/adt/pseudotypes.c | 38 + src/backend/utils/adt/rangetypes.c | 177 +- src/backend/utils/adt/rangetypes_typanalyze.c | 78 +- src/backend/utils/cache/lsyscache.c | 62 +- src/backend/utils/cache/syscache.c | 12 + src/backend/utils/cache/typcache.c | 98 +- src/backend/utils/fmgr/funcapi.c | 180 +- src/bin/pg_dump/pg_dump.c | 166 +- src/bin/pg_dump/pg_dump.h | 1 + src/bin/pg_dump/t/002_pg_dump.pl | 2 + src/include/access/tupmacs.h | 4 +- src/include/catalog/binary_upgrade.h | 2 + src/include/catalog/catversion.h | 2 +- src/include/catalog/pg_aggregate.dat | 11 + src/include/catalog/pg_amop.dat | 22 + src/include/catalog/pg_amproc.dat | 12 +- src/include/catalog/pg_cast.dat | 13 + src/include/catalog/pg_opclass.dat | 4 + src/include/catalog/pg_operator.dat | 169 ++ src/include/catalog/pg_opfamily.dat | 4 + src/include/catalog/pg_proc.dat | 282 +++ src/include/catalog/pg_range.dat | 15 +- src/include/catalog/pg_range.h | 9 +- src/include/catalog/pg_type.dat | 45 + src/include/catalog/pg_type.h | 10 +- src/include/commands/typecmds.h | 2 + src/include/utils/lsyscache.h | 3 + src/include/utils/multirangetypes.h | 116 + src/include/utils/rangetypes.h | 12 + src/include/utils/selfuncs.h | 3 + src/include/utils/syscache.h | 1 + src/include/utils/typcache.h | 38 +- src/pl/plpgsql/src/pl_comp.c | 6 + src/test/regress/expected/dependency.out | 1 + src/test/regress/expected/hash_func.out | 13 + src/test/regress/expected/multirangetypes.out | 2466 ++++++++++++++++++++ src/test/regress/expected/opr_sanity.out | 32 +- src/test/regress/expected/plpgsql.out | 4 +- src/test/regress/expected/polymorphism.out | 125 +- src/test/regress/expected/rangefuncs.out | 2 +- src/test/regress/expected/rangetypes.out | 42 +- src/test/regress/expected/sanity_check.out | 2 + src/test/regress/expected/type_sanity.out | 46 +- src/test/regress/parallel_schedule | 6 +- src/test/regress/serial_schedule | 1 + src/test/regress/sql/hash_func.sql | 10 + src/test/regress/sql/multirangetypes.sql | 677 ++++++ src/test/regress/sql/opr_sanity.sql | 24 +- src/test/regress/sql/polymorphism.sql | 52 + src/test/regress/sql/rangetypes.sql | 14 + src/test/regress/sql/type_sanity.sql | 16 +- src/tools/pgindent/typedefs.list | 4 + 67 files changed, 10568 insertions(+), 270 deletions(-)
Alexander Korotkov <akorotkov@postgresql.org> writes: > Multirange datatypes crake's not happy with this. Looks like you did not think through what pg_dump has to do for back-branch servers. regards, tom lane
On Sun, Dec 20, 2020 at 8:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alexander Korotkov <akorotkov@postgresql.org> writes: > > Multirange datatypes > > crake's not happy with this. Looks like you did not think through > what pg_dump has to do for back-branch servers. Thank you. Should be fixed by 8344d72ccc. ------ Regards, Alexander Korotkov
On Sun, Dec 20, 2020 at 7:21 AM Alexander Korotkov <akorotkov@postgresql.org> wrote: > Implementation of multiranges comes with a space-efficient internal > representation format, which evades extra paddings and duplicated storage of > oids. Altogether this format allows fetching a particular range by its index > in O(n). BTW, I mean O(1) here. O(n) would be hardly an achievement. ------ Regards, Alexander Korotkov
On 2020-12-20 05:21, Alexander Korotkov wrote: > Multirange datatypes > > Multiranges are basically sorted arrays of non-overlapping ranges with > set-theoretic operations defined over them. > With --enable-cassert gcc 10.1.0 gives me: typecmds.c: In function ‘DefineRange’: typecmds.c:1367:16: warning: variable ‘mltrngaddress’ set but not used [-Wunused-but-set-variable] 1367 | ObjectAddress mltrngaddress; | ^~~~~~~~~~~~~ (Build on debian.) Thanks, Eik Rijkers
On Sun, Dec 20, 2020 at 9:06 AM Erik Rijkers <er@xs4all.nl> wrote: > On 2020-12-20 05:21, Alexander Korotkov wrote: > > Multirange datatypes > > > > Multiranges are basically sorted arrays of non-overlapping ranges with > > set-theoretic operations defined over them. > > > > With --enable-cassert gcc 10.1.0 gives me: > > typecmds.c: In function ‘DefineRange’: > typecmds.c:1367:16: warning: variable ‘mltrngaddress’ set but not used > [-Wunused-but-set-variable] > 1367 | ObjectAddress mltrngaddress; > | ^~~~~~~~~~~~~ Thank you. But are you sure that happened with --enable-cassert? I've reproduced this only without --enable-cassert. And that seems reasonable, because there is an assert, which uses this variable. ------ Regards, Alexander Korotkov
On Sun, Dec 20, 2020 at 4:18 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > On Sun, Dec 20, 2020 at 9:06 AM Erik Rijkers <er@xs4all.nl> wrote: > > On 2020-12-20 05:21, Alexander Korotkov wrote: > > > Multirange datatypes > > > > > > Multiranges are basically sorted arrays of non-overlapping ranges with > > > set-theoretic operations defined over them. > > > > > > > With --enable-cassert gcc 10.1.0 gives me: > > > > typecmds.c: In function ‘DefineRange’: > > typecmds.c:1367:16: warning: variable ‘mltrngaddress’ set but not used > > [-Wunused-but-set-variable] > > 1367 | ObjectAddress mltrngaddress; > > | ^~~~~~~~~~~~~ > > Thank you. But are you sure that happened with --enable-cassert? > I've reproduced this only without --enable-cassert. And that seems > reasonable, because there is an assert, which uses this variable. Warning arising without --enable-cassert should be fixed by 11072e8693. Let me know if the problem persists for you. ------ Regards, Alexander Korotkov
On 2020-12-20 14:18, Alexander Korotkov wrote: > On Sun, Dec 20, 2020 at 9:06 AM Erik Rijkers <er@xs4all.nl> wrote: >> On 2020-12-20 05:21, Alexander Korotkov wrote: >> > Multirange datatypes >> > >> > Multiranges are basically sorted arrays of non-overlapping ranges with >> > set-theoretic operations defined over them. >> > >> >> With --enable-cassert gcc 10.1.0 gives me: >> >> typecmds.c: In function ‘DefineRange’: >> typecmds.c:1367:16: warning: variable ‘mltrngaddress’ set but not used >> [-Wunused-but-set-variable] >> 1367 | ObjectAddress mltrngaddress; >> | ^~~~~~~~~~~~~ > > Thank you. But are you sure that happened with --enable-cassert? > I've reproduced this only without --enable-cassert. And that seems > reasonable, because there is an assert, which uses this variable. Sorry, you're right, of course. I misread my output. It occurs only without --enable-cassert > > ------ > Regards, > Alexander Korotkov
On Sun, Dec 20, 2020 at 8:16 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > On Sun, Dec 20, 2020 at 8:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Alexander Korotkov <akorotkov@postgresql.org> writes: > > > Multirange datatypes > > > > crake's not happy with this. Looks like you did not think through > > what pg_dump has to do for back-branch servers. > > Thank you. Should be fixed by 8344d72ccc. There also appears to be some floating bug appearing here and there. https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2020-12-20%2007%3A40%3A25 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dory&dt=2020-12-20%2004%3A25%3A18 I'm investigating it. ------ Regards, Alexander Korotkov
Hi! On Sun, Dec 20, 2020 at 4:37 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Sun, Dec 20, 2020 at 8:16 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Sun, Dec 20, 2020 at 8:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > Alexander Korotkov <akorotkov@postgresql.org> writes: > > > > Multirange datatypes > > > > > > crake's not happy with this. Looks like you did not think through > > > what pg_dump has to do for back-branch servers. > > > > Thank you. Should be fixed by 8344d72ccc. > > There also appears to be some floating bug appearing here and there. > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2020-12-20%2007%3A40%3A25 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dory&dt=2020-12-20%2004%3A25%3A18 > I'm investigating it. It seems that my commit is reliably crashing gharial. Could I get access to gharial to investigate this issue? ------ Regards, Alexander Korotkov
On Sun, Dec 20, 2020 at 5:56 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > On Sun, Dec 20, 2020 at 4:37 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Sun, Dec 20, 2020 at 8:16 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > > On Sun, Dec 20, 2020 at 8:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Alexander Korotkov <akorotkov@postgresql.org> writes: > > > > > Multirange datatypes > > > > > > > > crake's not happy with this. Looks like you did not think through > > > > what pg_dump has to do for back-branch servers. > > > > > > Thank you. Should be fixed by 8344d72ccc. > > > > There also appears to be some floating bug appearing here and there. > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2020-12-20%2007%3A40%3A25 > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dory&dt=2020-12-20%2004%3A25%3A18 > > I'm investigating it. > > It seems that my commit is reliably crashing gharial. Could I get > access to gharial to investigate this issue? No more need for this. I've managed to reproduce this locally. It fails during selectivity estimation. Still investigating... ------ Regards, Alexander Korotkov
Hi
On Mon, Dec 21, 2020 at 1:32 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Sun, Dec 20, 2020 at 5:56 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Sun, Dec 20, 2020 at 4:37 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > On Sun, Dec 20, 2020 at 8:16 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > > On Sun, Dec 20, 2020 at 8:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > Alexander Korotkov <akorotkov@postgresql.org> writes:
> > > > > Multirange datatypes
> > > >
> > > > crake's not happy with this. Looks like you did not think through
> > > > what pg_dump has to do for back-branch servers.
> > >
> > > Thank you. Should be fixed by 8344d72ccc.
> >
> > There also appears to be some floating bug appearing here and there.
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2020-12-20%2007%3A40%3A25
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dory&dt=2020-12-20%2004%3A25%3A18
> > I'm investigating it.
>
> It seems that my commit is reliably crashing gharial. Could I get
> access to gharial to investigate this issue?
No more need for this. I've managed to reproduce this locally. It
fails during selectivity estimation. Still investigating...
Sure, if you need anything from me then do let me know.
------
Regards,
Alexander Korotkov
Sandeep Thakkar
Hi Alexander, On Sun, Dec 20, 2020 at 04:21:16AM +0000, Alexander Korotkov wrote: > Multirange datatypes > > Multiranges are basically sorted arrays of non-overlapping ranges with > set-theoretic operations defined over them. This commit is creating a compilation warning on Windows: multirangetypes.c(1033): warning C4715: 'multirange_constructor0' : not all control paths return a value See for example woodlouse: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=woodlouse&dt=2020-12-21%2001%3A42%3A28&stg=make Looking at the code, it is obvious that the compiler cannot understand that this should always return. Please find attached a suggestion of patch. Thanks, -- Michael
Вложения
Hi, Michael! On Mon, Dec 21, 2020 at 10:34 AM Michael Paquier <michael@paquier.xyz> wrote: > On Sun, Dec 20, 2020 at 04:21:16AM +0000, Alexander Korotkov wrote: > > Multirange datatypes > > > > Multiranges are basically sorted arrays of non-overlapping ranges with > > set-theoretic operations defined over them. > > This commit is creating a compilation warning on Windows: > multirangetypes.c(1033): warning C4715: 'multirange_constructor0' : > not all control paths return a value > > See for example woodlouse: > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=woodlouse&dt=2020-12-21%2001%3A42%3A28&stg=make > > Looking at the code, it is obvious that the compiler cannot understand > that this should always return. Please find attached a suggestion of > patch. Thank you for catching this! Pushed. ------ Regards, Alexander Korotkov
On Sun, Dec 20, 2020 at 11:02 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > On Sun, Dec 20, 2020 at 5:56 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > On Sun, Dec 20, 2020 at 4:37 PM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > > On Sun, Dec 20, 2020 at 8:16 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > > > There also appears to be some floating bug appearing here and there. > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2020-12-20%2007%3A40%3A25 > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dory&dt=2020-12-20%2004%3A25%3A18 > > > I'm investigating it. > > > > It seems that my commit is reliably crashing gharial. Could I get > > access to gharial to investigate this issue? > > No more need for this. I've managed to reproduce this locally. It > fails during selectivity estimation. Still investigating... Should be fixed by 4e1ee79e31. ------ Regards, Alexander Korotkov
On Mon, Dec 21, 2020 at 02:31:44PM +0300, Alexander Korotkov wrote: > Thank you for catching this! Pushed. Thanks for the commit, Alexander. -- Michael
Вложения
Alexander Korotkov <akorotkov@postgresql.org> writes: > Multirange datatypes Multiple buildfarm members are issuing similar warnings about multirangetypes.c: calliphoridae | 2020-12-29 23:10:11 | /home/andres/build/buildfarm-calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/multirangetypes.c:218:37: warning:suggest braces around empty body in an 'else' statement [-Wempty-body] calliphoridae | 2020-12-29 23:10:11 | /home/andres/build/buildfarm-calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/multirangetypes.c:236:37: warning:suggest braces around empty body in an 'else' statement [-Wempty-body] These are evidently unhappy with code like if ... ... } else /* include it in range_str */ ; So apparently preferred style is more like else { /* include it in range_str */ } I don't particularly care for the wording of this comment, either: it looks like it's describing an action that's about to be taken, only there's no action there. Could it say something more like "we already included it in range_str", or "we'll include it in range_str below the switch", or whatever's appropriate? regards, tom lane
On Wed, Dec 30, 2020 at 4:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alexander Korotkov <akorotkov@postgresql.org> writes: > > Multirange datatypes > > Multiple buildfarm members are issuing similar warnings about > multirangetypes.c: > > calliphoridae | 2020-12-29 23:10:11 | /home/andres/build/buildfarm-calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/multirangetypes.c:218:37: warning:suggest braces around empty body in an 'else' statement [-Wempty-body] > calliphoridae | 2020-12-29 23:10:11 | /home/andres/build/buildfarm-calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/multirangetypes.c:236:37: warning:suggest braces around empty body in an 'else' statement [-Wempty-body] > > These are evidently unhappy with code like > > if ... > ... > } > else > /* include it in range_str */ ; > > So apparently preferred style is more like > > else > { > /* include it in range_str */ > } > > I don't particularly care for the wording of this comment, either: > it looks like it's describing an action that's about to be taken, > only there's no action there. Could it say something more like > "we already included it in range_str", or "we'll include it in > range_str below the switch", or whatever's appropriate? Thank you for noticing. I've slightly refactored this function in 16d531a30a. ------ Regards, Alexander Korotkov
Alexander Korotkov <aekorotkov@gmail.com> writes: > On Wed, Dec 30, 2020 at 4:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Multiple buildfarm members are issuing similar warnings about >> multirangetypes.c: >> calliphoridae | 2020-12-29 23:10:11 | /home/andres/build/buildfarm-calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/multirangetypes.c:218:37: warning:suggest braces around empty body in an 'else' statement [-Wempty-body] >> calliphoridae | 2020-12-29 23:10:11 | /home/andres/build/buildfarm-calliphoridae/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/multirangetypes.c:236:37: warning:suggest braces around empty body in an 'else' statement [-Wempty-body] > Thank you for noticing. I've slightly refactored this function in 16d531a30a. Looks good, thanks! regards, tom lane