Обсуждение: DOMAIN/composite TYPE vs. base TYPE
Hello, I'm considering creating a TYPE for what may be called a "possibly imprecise date" (pidate). The most obvious use is for recording dates such as births or deaths of historical individuals, where we may know that someone died precisely on a given year-month-day, but the birth may only be known down to year-month or just the year (or perhaps we know precisely the baptism date [Adam Smith], but not the actual birth, so we want to record the former but qualified so it can be annotated on display). Another use is for publications, like magazines that are issued on a monthly basis or journals that are issued on a quarterly or seasonal basis. We currently have two instances of this kind, using a standard DATE column plus a CHAR(1) column that encodes (on a limited basis for now) the YMD, YM or Y level of precision, and a simple SQL function to return a textual representation of the pidate. It would be nice to generalize this before going further. The first option I explored was creating a composite type with the two attributes, but that doesn't allow specification of DEFAULTs, NOT NULL or CHECK expressions on the precision code attribute. It seems I'd have to create a DOMAIN first, then use DATE and that domain to create a composite TYPE, to finally use the latter in actual tables. That layering looks cumbersome. Another option, which I havent't tried, is to subvert PG by creating an empty table, since that creates a "record type", but even if possible that would be a hack. Finally there's the base TYPE. This entails writing some seven functions "in C or another low-level language" (does PG support *any* other such language?), plus installing a library with those functions in a production environment. Doable, yes, but not very friendly either. Am I overlooking something or is the practice of creating abstractions in object-relational databases mostly unchanged? Regards, Joe
> On Sep 28, 2020, at 3:14 PM, Joe Abbate <jma@freedomcircle.com> wrote: > > Hello, > > I'm considering creating a TYPE for what may be called a "possibly imprecise date" (pidate). The most obvious use is forrecording dates such as births or deaths of historical individuals, where we may know that someone died precisely on agiven year-month-day, but the birth may only be known down to year-month or just the year (or perhaps we know preciselythe baptism date [Adam Smith], but not the actual birth, so we want to record the former but qualified so it canbe annotated on display). Another use is for publications, like magazines that are issued on a monthly basis or journalsthat are issued on a quarterly or seasonal basis. > > We currently have two instances of this kind, using a standard DATE column plus a CHAR(1) column that encodes (on a limitedbasis for now) the YMD, YM or Y level of precision, and a simple SQL function to return a textual representation ofthe pidate. It would be nice to generalize this before going further. > > The first option I explored was creating a composite type with the two attributes, but that doesn't allow specificationof DEFAULTs, NOT NULL or CHECK expressions on the precision code attribute. It seems I'd have to create a DOMAINfirst, then use DATE and that domain to create a composite TYPE, to finally use the latter in actual tables. Thatlayering looks cumbersome. > > Another option, which I havent't tried, is to subvert PG by creating an empty table, since that creates a "record type",but even if possible that would be a hack. > > Finally there's the base TYPE. This entails writing some seven functions "in C or another low-level language" (does PGsupport *any* other such language?), plus installing a library with those functions in a production environment. Doable,yes, but not very friendly either. > > Am I overlooking something or is the practice of creating abstractions in object-relational databases mostly unchanged? > > Regards, > > Joe > > just record all three fields (day, month, year) with nulls and do the to-date as needed.
Joe Abbate <jma@freedomcircle.com> writes: > I'm considering creating a TYPE for what may be called a "possibly > imprecise date" (pidate). > The first option I explored was creating a composite type with the two > attributes, but that doesn't allow specification of DEFAULTs, NOT NULL > or CHECK expressions on the precision code attribute. It seems I'd have > to create a DOMAIN first, then use DATE and that domain to create a > composite TYPE, to finally use the latter in actual tables. That > layering looks cumbersome. Agreed. > Another option, which I havent't tried, is to subvert PG by creating an > empty table, since that creates a "record type", but even if possible > that would be a hack. Won't help. Even if the table has constraints, when its rowtype is used in a standalone context, it only has the features that a standalone composite type would have (ie, no constraints). > Am I overlooking something or is the practice of creating abstractions > in object-relational databases mostly unchanged? Domain-over-composite might be a slightly simpler answer than your first one. It's only available in relatively late-model PG, and I'm not sure about its performance relative to your other design, but it is an alternative to think about. Note that attaching NOT NULL constraints at the domain level is almost never a good idea, because then you find yourself with a semantically impossible situation when, say, a column of that type is on the nullable side of an outer join. We allow such constraints, but they will be nominally violated in cases like that. regards, tom lane
Hello Rob, On 28/9/20 17:17, Rob Sargent wrote: > just record all three fields (day, month, year) with nulls and do the to-date as needed. That is not sufficient. An earlier implementation had something like a CHAR(8) to record YYYYMMDD, but how can you indicate, for example, an issue date of a bimonthly magazine, say July-Aug 2020? We can store 2020-07-01 in the DATE attribute, but we need another attribute to indicate it's really two months. Also, by storing three separate columns, you loose the beauty of the PG DATE abstraction. Joe
On 9/28/20 4:31 PM, Joe Abbate wrote:
The Gramps genealogy program has figured it out; maybe it's source code can lend you some clues.
Hello Rob,
On 28/9/20 17:17, Rob Sargent wrote:just record all three fields (day, month, year) with nulls and do the to-date as needed.
That is not sufficient. An earlier implementation had something like a CHAR(8) to record YYYYMMDD, but how can you indicate, for example, an issue date of a bimonthly magazine, say July-Aug 2020? We can store 2020-07-01 in the DATE attribute, but we need another attribute to indicate it's really two months. Also, by storing three separate columns, you loose the beauty of the PG DATE abstraction.
The Gramps genealogy program has figured it out; maybe it's source code can lend you some clues.
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.
On 29 Sep 2020, at 7:31, Joe Abbate wrote: > Hello Rob, > > On 28/9/20 17:17, Rob Sargent wrote: >> just record all three fields (day, month, year) with nulls and do the >> to-date as needed. > > That is not sufficient. An earlier implementation had something like > a CHAR(8) to record YYYYMMDD, but how can you indicate, for example, > an issue date of a bimonthly magazine, say July-Aug 2020? We can > store 2020-07-01 in the DATE attribute, but we need another attribute > to indicate it's really two months. Also, by storing three separate > columns, you loose the beauty of the PG DATE abstraction. > This is only a partial “fix” and goes nowhere near solving the full wrapper/abstraction problem… Consider expressing all the component fields as a range. This allows you the ability to be a precise as you need and still have the benefits of well defined comparison functions. Regards Gavan Schneider —— Gavan Schneider, Sodwalls, NSW, Australia Explanations exist; they have existed for all time; there is always a well-known solution to every human problem — neat, plausible, and wrong. The ancients, in the case at bar, laid the blame upon the gods: sometimes they were remote and surly, and sometimes they were kind. In the Middle Ages lesser powers took a hand in the matter, and so one reads of works of art inspired by Our Lady, by the Blessed Saints, by the souls of the departed, and even by the devil. H. L. Mencken, 1920
Hello Tom, On 28/9/20 17:25, Tom Lane wrote: > Domain-over-composite might be a slightly simpler answer than your first > one. It's only available in relatively late-model PG, and I'm not sure > about its performance relative to your other design, but it is an > alternative to think about. "Domain-over-composite" meaning create a TYPE first (DATE, CHAR(1)) and then a DOMAIN based on that type? (1) How late model are we talking? The DOMAIN syntax doesn't seem changed from PG 11 to PG 13? (2) Can a CHECK constraint specify attributes of the composite? > Note that attaching NOT NULL constraints at the domain level is almost > never a good idea, because then you find yourself with a semantically > impossible situation when, say, a column of that type is on the nullable > side of an outer join. We allow such constraints, but they will be > nominally violated in cases like that. NULLs: Tony Hoare's "billion dollars of pain and damage" transported to SQL. Joe
Hello Gavan, On 28/9/20 17:52, Gavan Schneider wrote: > Consider expressing all the component fields as a range. This allows you > the ability to be a precise as you need and still have the benefits of > well defined comparison functions. I did consider that, but it's a tradeoff between 80% of the cases being a single precise date, 18% being a single date with some imprecision around a single month, and the rest with "ranges" of months or other intervals. Joe
On 9/28/20 2:58 PM, Joe Abbate wrote: > Hello Tom, > > On 28/9/20 17:25, Tom Lane wrote: >> Domain-over-composite might be a slightly simpler answer than your first >> one. It's only available in relatively late-model PG, and I'm not sure >> about its performance relative to your other design, but it is an >> alternative to think about. > > "Domain-over-composite" meaning create a TYPE first (DATE, CHAR(1)) and > then a DOMAIN based on that type? (1) How late model are we talking? > The DOMAIN syntax doesn't seem changed from PG 11 to PG 13? (2) Can a > CHECK constraint specify attributes of the composite? > >> Note that attaching NOT NULL constraints at the domain level is almost >> never a good idea, because then you find yourself with a semantically >> impossible situation when, say, a column of that type is on the nullable >> side of an outer join. We allow such constraints, but they will be >> nominally violated in cases like that. > > NULLs: Tony Hoare's "billion dollars of pain and damage" transported to > SQL. Except that the case Tom is talking about would occur due to something like: select table_a left join table_b on table_a.id = table_b.id where table_b.id is null; That has been very useful to me and I'm not sure that how anything you replace NULL with to represent 'unknown' would change the situation. > > Joe > > -- Adrian Klaver adrian.klaver@aklaver.com
Joe Abbate <jma@freedomcircle.com> writes: > On 28/9/20 17:25, Tom Lane wrote: >> Domain-over-composite might be a slightly simpler answer than your first >> one. It's only available in relatively late-model PG, and I'm not sure >> about its performance relative to your other design, but it is an >> alternative to think about. > "Domain-over-composite" meaning create a TYPE first (DATE, CHAR(1)) and > then a DOMAIN based on that type? Right. regression=# create type t1 as (d date, t char(1)); CREATE TYPE regression=# create domain dt1 as t1 check((value).t in ('a', 'b')); CREATE DOMAIN > (1) How late model are we talking? > The DOMAIN syntax doesn't seem changed from PG 11 to PG 13? Back to 11, looks like. The syntax didn't change, but v10 complains ERROR: "t1" is not a valid base type for a domain >> Note that attaching NOT NULL constraints at the domain level is almost >> never a good idea, because then you find yourself with a semantically >> impossible situation when, say, a column of that type is on the nullable >> side of an outer join. We allow such constraints, but they will be >> nominally violated in cases like that. > NULLs: Tony Hoare's "billion dollars of pain and damage" transported to SQL. I dunno, outer joins are awfully useful. It is true that the SQL committee has stuck too many not-quite-consistent meanings on NULL, but on the other hand, several different kinds of NULL might be worse. regards, tom lane