Discussion: Does Type Have = Operator?
Hackers,

pgTAP has a function that compares two values of a given type, which it uses for comparing column defaults. It looks like this:

    CREATE OR REPLACE FUNCTION _def_is( TEXT, TEXT, anyelement, TEXT )
    RETURNS TEXT AS $$
    DECLARE
        thing text;
    BEGIN
        IF $1 ~ '^[^'']+[(]' THEN
            -- It's a functional default.
            RETURN is( $1, $3, $4 );
        END IF;

        EXECUTE 'SELECT is('
            || COALESCE($1, 'NULL' || '::' || $2) || '::' || $2 || ', '
            || COALESCE(quote_literal($3), 'NULL') || '::' || $2 || ', '
            || COALESCE(quote_literal($4), 'NULL')
            || ')' INTO thing;
        RETURN thing;
    END;
    $$ LANGUAGE plpgsql;

The is() function does an IS DISTINCT FROM to compare the two values passed to it. This has been working pretty well for years, but one place it doesn’t work is with JSON values. I get:

    LINE 1: SELECT NOT $1 IS DISTINCT FROM $2
    ^
    HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
    QUERY: SELECT NOT $1 IS DISTINCT FROM $2

This makes sense, of course, and I could fix it by comparing text values instead of json values when the values are JSON. But of course the lack of a = operator is not limited to JSON. So I’m wondering if there’s an interface at the SQL level to tell me whether a type has an = operator? That way I could always use text values in those situations.

Thanks,

David
On Tuesday, May 10, 2016, David E. Wheeler <david@justatheory.com> wrote:
Hackers,
pgTAP has a function that compares two values of a given type, which it uses for comparing column defaults. It looks like this:
CREATE OR REPLACE FUNCTION _def_is( TEXT, TEXT, anyelement, TEXT )
RETURNS TEXT AS $$
DECLARE
thing text;
BEGIN
IF $1 ~ '^[^'']+[(]' THEN
-- It's a functional default.
RETURN is( $1, $3, $4 );
END IF;
EXECUTE 'SELECT is('
|| COALESCE($1, 'NULL' || '::' || $2) || '::' || $2 || ', '
|| COALESCE(quote_literal($3), 'NULL') || '::' || $2 || ', '
|| COALESCE(quote_literal($4), 'NULL')
|| ')' INTO thing;
RETURN thing;
END;
$$ LANGUAGE plpgsql;
The is() function does an IS DISTINCT FROM to compare the two values passed to it. This has been working pretty well for years, but one place it doesn’t work is with JSON values. I get:
LINE 1: SELECT NOT $1 IS DISTINCT FROM $2
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY: SELECT NOT $1 IS DISTINCT FROM $2
This makes sense, of course, and I could fix it by comparing text values instead of json values when the values are JSON. But of course the lack of a = operator is not limited to JSON. So I’m wondering if there’s an interface at the SQL level to tell me whether a type has an = operator? That way I could always use text values in those situations.
Isn't searching for the operator in the pg_operator catalog enough?
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
"David E. Wheeler" <david@justatheory.com> writes:
> pgTAP has a function that compares two values of a given type, which it uses for comparing column defaults. It looks like this:

> CREATE OR REPLACE FUNCTION _def_is( TEXT, TEXT, anyelement, TEXT )
> RETURNS TEXT AS $$

Given that you're coercing both one input value and the result to text, I don't understand why you don't just compare the text representations.

I'm also not very clear on what you mean by "comparing column defaults". A column default is an expression (in the general case anyway), not just a value of the type.

Maybe if you'd shown us the is() function, as well as a typical usage of _def_is(), this would be less opaque.

regards, tom lane
On Tuesday, May 10, 2016, David E. Wheeler <david@justatheory.com> wrote:
This makes sense, of course, and I could fix it by comparing text values instead of json values when the values are JSON. But of course the lack of a = operator is not limited to JSON. So I’m wondering if there’s an interface at the SQL level to tell me whether a type has an = operator? That way I could always use text values in those situations.
Brute force: you'd have to query pg_amop and note the absence of a row with a btree (maybe hash too...) family strategy 3 (1 for hash) [equality] where the left and right types are the same and match the type in question.
There is likely more to it - though absence is pretty much a given I'd be concerned about false negatives due to ignoring other factors like "amoppurpose".
In theory you should be able to trade off convenience for correctness by calling:
to_regoperator('=(type,type)')
But I've never tried it and it assumes that = is the equality operator and that its presence is sufficient. I'm also guessing on the text type name syntax.
This option is a young one from what I remember.
David J.
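A minimal sketch of the to_regoperator() idea above, assuming a server where that function is available (it was added in 9.4) and assuming that an operator literally named "=" with both arguments of the type is what counts. The two type names are only examples. The function returns NULL rather than raising an error when no such operator is found.

```sql
-- to_regoperator() yields NULL when no operator with exactly this name
-- and these argument types is visible on the current search_path.
SELECT to_regoperator('=(json,json)')   IS NOT NULL AS json_has_eq,
       to_regoperator('=(jsonb,jsonb)') IS NOT NULL AS jsonb_has_eq;
```

This only looks for an exact signature match, so it says nothing about whether an expression could still resolve "=" through implicit casts or a polymorphic operator.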
On 10-05-2016 21:12, David E. Wheeler wrote:
> This makes sense, of course, and I could fix it by comparing text
> values instead of json values when the values are JSON. But of course
> the lack of a = operator is not limited to JSON. So I’m wondering if
> there’s an interface at the SQL level to tell me whether a type has
> an = operator? That way I could always use text values in those
> situations.
>

Nothing in the catalogs marks an operator as being equality. You could try "SELECT oprname FROM pg_operator WHERE oprcode::text ~ 'eq'" but it is too fragile. You could also try oprname, oprrest or oprjoin but the result is worse than the former solution. You definitely need a hack.

Also, IS DISTINCT FROM is an alias for the = operator per standard IIRC.

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On Tuesday, May 10, 2016, Euler Taveira <euler@timbira.com.br> wrote:
http://www.postgresql.org/docs/9.5/interactive/functions-comparison.html
Also, IS DISTINCT FROM is an alias for = operator per standard IIRC.
Technically "is not distinct from" would be more correct. Alias implies exact while in the presence of nulls the two behave differently.
"is distinct from" ~ "<>" which is the canonical form (alias) for "!="
David J.
On 10-05-2016 22:28, David G. Johnston wrote:
> Technically "is not distinct from" would be more correct.
>

Ooops. Fat fingered the statement. Also, forgot to consider null case.

euler=# \pset null 'NULL'
Null display is "NULL".
euler=# select x.a, y.b, x.a IS NOT DISTINCT FROM y.b AS "INDF", x.a = y.b AS "=" FROM (VALUES (3), (6), (NULL)) AS x (a), (VALUES (3), (6), (NULL)) AS y (b);
  a   |  b   | INDF |  =
------+------+------+------
    3 |    3 | t    | t
    3 |    6 | f    | f
    3 | NULL | f    | NULL
    6 |    3 | f    | f
    6 |    6 | t    | t
    6 | NULL | f    | NULL
 NULL |    3 | f    | NULL
 NULL |    6 | f    | NULL
 NULL | NULL | t    | NULL
(9 rows)

--
Euler Taveira
Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On 5/10/16 9:16 PM, David G. Johnston wrote:
> Brute force: you'd have to query pg_amop and note the absence of a row
> with a btree (maybe hash too...) family strategy 3 (1 for hash)
> [equality] where the left and right types are the same and match the
> type in question.

While these are good thoughts, the implementation of DISTINCT actually looks for an operator that is literally named "=".

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
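Since the lookup is by name, a catalog check along the lines Fabrízio suggested could look roughly like the sketch below. It assumes the question is whether an operator named "=" exists whose left and right argument types are both exactly the type of interest; json is used purely as an example.

```sql
-- True when pg_operator has a binary "=" taking the given type on both sides.
SELECT EXISTS (
    SELECT 1
      FROM pg_catalog.pg_operator o
     WHERE o.oprname  = '='
       AND o.oprleft  = 'json'::regtype
       AND o.oprright = 'json'::regtype
) AS has_eq;
```

This ignores operators that would only be reached through implicit casts or polymorphic argument types such as anyarray, so a false result is a hint rather than proof that a comparison would fail.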
On May 10, 2016, at 6:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Given that you're coercing both one input value and the result to text,
> I don't understand why you don't just compare the text representations.

Because sometimes the text representations are not equal when the values cast to the type are. Consider

    'foo'::citext = 'FOO'::citext

> I'm also not very clear on what you mean by "comparing column defaults".
> A column default is an expression (in the general case anyway), not just
> a value of the type.

Yeah, the pgTAP column_default_is() function takes a string representation of an expression.

> Maybe if you'd shown us the is() function, as well as a typical usage
> of _def_is(), this would be less opaque.

Here’s is():

    CREATE OR REPLACE FUNCTION is (anyelement, anyelement, text)
    RETURNS TEXT AS $$
    DECLARE
        result BOOLEAN;
        output TEXT;
    BEGIN
        -- Would prefer $1 IS NOT DISTINCT FROM, but that's not supported by 8.1.
        result := NOT $1 IS DISTINCT FROM $2;
        output := ok( result, $3 );
        RETURN output || CASE result WHEN TRUE THEN '' ELSE E'\n' || diag(
               ' have: ' || CASE WHEN $1 IS NULL THEN 'NULL' ELSE $1::text END ||
            E'\n want: ' || CASE WHEN $2 IS NULL THEN 'NULL' ELSE $2::text END
        ) END;
    END;
    $$ LANGUAGE plpgsql;

_def_is() is called by another function, which effectively is:

    CREATE OR REPLACE FUNCTION _cdi ( NAME, NAME, NAME, anyelement, TEXT )
    RETURNS TEXT AS $$
    BEGIN
        RETURN _def_is(
            pg_catalog.pg_get_expr(d.adbin, d.adrelid),
            pg_catalog.format_type(a.atttypid, a.atttypmod),
            $4, $5
        )
          FROM pg_catalog.pg_namespace n, pg_catalog.pg_class c,
               pg_catalog.pg_attribute a, pg_catalog.pg_attrdef d
         WHERE n.oid = c.relnamespace
           AND c.oid = a.attrelid
           AND a.atthasdef
           AND a.attrelid = d.adrelid
           AND a.attnum = d.adnum
           AND n.nspname = $1
           AND c.relname = $2
           AND a.attnum > 0
           AND NOT a.attisdropped
           AND a.attname = $3;
    END;
    $$ LANGUAGE plpgsql;

That function is called like this:

    _cdi( :schema, :table, :column, :default, :description );

Best,

David
On May 10, 2016, at 5:56 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:

> Searching for the operator in pg_operator catalog isn't enough?

Seems like overkill, but will do if there’s nothing else.

Best,

David
On Tue, May 10, 2016 at 9:16 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
> Brute force: you'd have to query pg_amop and note the absence of a row with
> a btree (maybe hash too...) family strategy 3 (1 for hash) [equality] where
> the left and right types are the same and match the type in question.

The core system uses this kind of thing to find equality operators in a number of cases.

We often assume that the operator which implements equality for the type's default btree operator class is the canonical one for some purpose. Ditto for the default hash operator class.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
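The default-btree-opclass lookup described here could be sketched at the SQL level roughly as follows. This is an illustration under assumptions, not what the core code does internally: it matches on opcintype exactly, so types that rely on another type's operator class (varchar using text's, for instance) or on a polymorphic class (arrays, enums, ranges) will return no row. integer is only an example.

```sql
-- Equality member (btree strategy 3) of the type's default btree opclass.
SELECT ao.amopopr::regoperator AS equality_operator
  FROM pg_catalog.pg_opclass oc
  JOIN pg_catalog.pg_am   am ON am.oid = oc.opcmethod
  JOIN pg_catalog.pg_amop ao ON ao.amopfamily = oc.opcfamily
 WHERE am.amname        = 'btree'
   AND oc.opcdefault
   AND oc.opcintype     = 'integer'::regtype
   AND ao.amoplefttype  = oc.opcintype
   AND ao.amoprighttype = oc.opcintype
   AND ao.amopstrategy  = 3;
```

An empty result means the type has no default btree operator class of its own under these assumptions. For the hash variant, the same query with amname = 'hash' and strategy 1 should apply.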
On Tue, May 10, 2016 at 9:16 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Brute force: you'd have to query pg_amop and note the absence of a row with
> a btree (maybe hash too...) family strategy 3 (1 for hash) [equality] where
> the left and right types are the same and match the type in question.
The core system uses this kind of thing to find equality operators in
a number of cases.
We often assume that the operator which implements equality for the
type's default btree operator class is the canonical one for some
purpose. Ditto for the default hash operator class.
Yeah, the user-facing documentation covers it pretty deeply if not in one central location.
But apparently the core system also uses the fact that "=", if present, is an equality operator and, less so, that no other operator is expected to be used for equality.
I suspect that such an expectation is not enforced though - e.g., someone could define "==" to mean equality if they so choose (the lesser property). It's hard to imagine defining "=" to mean something different in logic, though, without intentionally trying to be cryptic.
David J.
On Wed, May 11, 2016 at 12:01 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:

> It's hard to imagine defining "=" to mean something different in logic,
> though, without intentionally trying to be cryptic.

As long as you don't assume too much about *what* is equal.

test=# select '(1,1)(2,2)'::box = '(-4.5,1000)(-2.5,1000.5)'::box;
 ?column?
----------
 t
(1 row)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On May 11, 2016, at 10:19 AM, Kevin Grittner <kgrittn@gmail.com> wrote:

> As long as you don't assume too much about *what* is equal.
>
> test=# select '(1,1)(2,2)'::box = '(-4.5,1000)(-2.5,1000.5)'::box;
>  ?column?
> ----------
>  t
> (1 row)

Oh, well crap. Maybe I’d be better off just comparing the plain text of the expressions as Tom suggested.

David
On Wed, May 11, 2016 at 12:23 PM, David E. Wheeler <david@justatheory.com> wrote:

> Oh, well crap. Maybe I’d be better off just comparing the plain
> text of the expressions as Tom suggested.

At the other extreme are the row comparison operators that only consider values equal if they have the same storage value. See the last paragraph of:

http://www.postgresql.org/docs/9.5/static/functions-comparisons.html#COMPOSITE-TYPE-COMPARISON

| To support matching of rows which include elements without a
| default B-tree operator class, the following operators are
| defined for composite type comparison: *=, *<>, *<, *<=, *>, and
| *>=. These operators compare the internal binary representation
| of the two rows. Two rows might have a different binary
| representation even though comparisons of the two rows with the
| equality operator is true. The ordering of rows under these
| comparison operators is deterministic but not otherwise
| meaningful. These operators are used internally for materialized
| views and might be useful for other specialized purposes such as
| replication but are not intended to be generally useful for
| writing queries.

I'm not clear enough on your intended usage to know whether these operators are a good fit, but they are sitting there waiting to be used if they do fit.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, May 11, 2016 at 2:07 AM, David E. Wheeler <david@justatheory.com> wrote:
>
> On May 10, 2016, at 5:56 PM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
>
> > Searching for the operator in pg_operator catalog isn't enough?
>
> Seems like overkill, but will do if there’s nothing else.
>
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
On May 11, 2016, at 10:34 AM, Kevin Grittner <kgrittn@gmail.com> wrote:

> I'm not clear enough on your intended usage to know whether these
> operators are a good fit, but they are sitting there waiting to be
> used if they do fit.

Huh. I haven’t had any problems with IS DISTINCT FROM for rows, except for the situation in which a failure is thrown because the types vary, say between TEXT and CITEXT. That can drive the tester crazy, since it says something like:

    Results differ beginning at row 3:
        have: (44,Anna)
        want: (44,Anna)

But overall I think that’s okay; the tester really does want to make sure the type is correct.

Thanks,

David
On May 11, 2016, at 11:01 AM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:

> I know... but you can do that only when the current behaviour fails, by catching the error with "begin...exception...", so you'll minimize the lookups in the catalog.

Yeah, I guess. Honestly 90% of this issue would go away for me if there was a `json = json` operator. I know there are a couple different ways to interpret JSON equality, though.

David
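The "begin...exception..." fallback mentioned above might look something like the following sketch. The function name is hypothetical and not part of pgTAP. It assumes the missing-operator failure surfaces as the undefined_function condition (SQLSTATE 42883), and that falling back to a comparison of the text forms is acceptable for the caller.

```sql
-- Hypothetical helper: try the type's own "=", fall back to comparing text.
CREATE OR REPLACE FUNCTION is_not_distinct_or_text(a anyelement, b anyelement)
RETURNS boolean AS $$
BEGIN
    -- Raises undefined_function when the type has no "=" operator (e.g. json).
    RETURN a IS NOT DISTINCT FROM b;
EXCEPTION WHEN undefined_function THEN
    -- No "=" for this type: compare the text representations instead.
    RETURN a::text IS NOT DISTINCT FROM b::text;
END;
$$ LANGUAGE plpgsql;

SELECT is_not_distinct_or_text('{"a": 1}'::json, '{"a": 1}'::json);
```

The exception block costs a subtransaction on every call, which is a trade-off against looking the operator up in the catalog first.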
On Wed, May 11, 2016 at 9:09 PM, David E. Wheeler <david@justatheory.com> wrote:
>
> On May 11, 2016, at 11:01 AM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:
>
> > I know... but you can do that only when the current behaviour fails, by catching the error with "begin...exception...", so you'll minimize the lookups in the catalog.
>
> Yeah, I guess. Honestly 90% of this issue would go away for me if there was a `json = json` operator. I know there are a couple different ways to interpret JSON equality, though.
>
CREATE OR REPLACE FUNCTION json_equals_to_json(first JSON, second JSON)
RETURNS boolean AS
$$
BEGIN
    -- Compares the text forms, so key order and whitespace are significant.
    RETURN first::TEXT IS NOT DISTINCT FROM second::TEXT;
END;
$$
LANGUAGE plpgsql IMMUTABLE;

CREATE OPERATOR = (
    LEFTARG = json,
    RIGHTARG = json,
    PROCEDURE = json_equals_to_json
);
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
On May 12, 2016, at 11:19 AM, Fabrízio de Royes Mello <fabriziomello@gmail.com> wrote:

> Yeah.. it's ugly but you can do something like that:

I could, but I won’t, since this is pgTAP and users of the library might have defined their own json operators.

Andrew Dunstan has done the yeoman’s work of creating such operators, BTW:

https://bitbucket.org/adunstan/jsoncmp

Some might argue that it ought to compare JSON objects, effectively be the equivalent of ::jsonb = ::jsonb, rather than ::text = ::text. But as Andrew points out to me offlist, “if that's what they want why aren't they using jsonb in the first place?”

So I think that, up to the introduction of JSONB, it was important not to side one way or the other and put a JSON = operator in core. But now that we have JSONB, perhaps it makes sense to finally take sides and introduce JSON = that does plain text comparison. Thoughts?

Best,

David
"David E. Wheeler" <david@justatheory.com> writes:
> Some might argue that it ought to compare JSON objects, effectively be the equivalent of ::jsonb = ::jsonb, rather than ::text = ::text. But as Andrew points out to me offlist, “if that's what they want why aren't they using jsonb in the first place?”

> So I think that, up to the introduction of JSONB, it was important not to side one way or the other and put a JSON = operator in core. But now what we have JSONB, perhaps it makes sense to finally take sides and intoduce JSON = that does plain text comparison. Thoughts?

Meh. Right now, if you want to compare values of type JSON, you have to either cast them to text or to jsonb, and that effectively declares which comparison semantics you want. I'm not sure that prejudging that is a good thing for us to do, especially when the argument that text semantics are what you would probably want is so weak.

Andrew mentions in the extension you pointed to that providing a default comparison operator would enable people to do UNION, DISTINCT, etc on JSON columns without thinking about it. I'm not convinced that "without thinking about it" is a good thing here. But if we were going to enable that, I'd feel better about making it default to jsonb semantics ...

regards, tom lane
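To make the two semantics concrete, here is a small illustration (the literals are made up for the example). Casting to text compares the stored characters, while casting to jsonb compares the parsed values, so two documents that differ only in key order come out differently.

```sql
-- Same object, different key order.
SELECT '{"a": 1, "b": 2}'::json::text  = '{"b": 2, "a": 1}'::json::text  AS text_equal,
       '{"a": 1, "b": 2}'::json::jsonb = '{"b": 2, "a": 1}'::json::jsonb AS jsonb_equal;
-- text_equal is false, jsonb_equal is true
```

Whichever cast the caller writes is the declaration of intent that Tom describes.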
On May 12, 2016, at 12:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Andrew mentions in the extension you pointed to that providing a default
> comparison operator would enable people to do UNION, DISTINCT, etc on JSON
> columns without thinking about it. I'm not convinced that "without
> thinking about it" is a good thing here. But if we were going to enable
> that, I'd feel better about making it default to jsonb semantics ...

If you want the JSONB semantics, why wouldn’t you use JSONB instead of JSON?

Best,

David
On 05/12/2016 03:02 PM, Tom Lane wrote:
> "David E. Wheeler" <david@justatheory.com> writes:
>> Some might argue that it ought to compare JSON objects, effectively be the equivalent of ::jsonb = ::jsonb, rather than ::text = ::text. But as Andrew points out to me offlist, “if that's what they want why aren't they using jsonb in the first place?”
>> So I think that, up to the introduction of JSONB, it was important not to side one way or the other and put a JSON = operator in core. But now what we have JSONB, perhaps it makes sense to finally take sides and intoduce JSON = that does plain text comparison. Thoughts?
> Meh. Right now, if you want to compare values of type JSON, you have to
> either cast them to text or to jsonb, and that effectively declares which
> comparison semantics you want. I'm not sure that prejudging that is a
> good thing for us to do, especially when the argument that text semantics
> are what you would probably want is so weak.
>
> Andrew mentions in the extension you pointed to that providing a default
> comparison operator would enable people to do UNION, DISTINCT, etc on JSON
> columns without thinking about it. I'm not convinced that "without
> thinking about it" is a good thing here. But if we were going to enable
> that, I'd feel better about making it default to jsonb semantics ...
>

I think you've been a little liberal with quoting the docs ;-)

The reason I made it an extension is precisely because it's not unambiguously clear what json equality should mean.

cheers

andrew
On 5/12/16 4:25 PM, David E. Wheeler wrote:
> On May 12, 2016, at 12:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> Andrew mentions in the extension you pointed to that providing a default
>> comparison operator would enable people to do UNION, DISTINCT, etc on JSON
>> columns without thinking about it. I'm not convinced that "without
>> thinking about it" is a good thing here. But if we were going to enable
>> that, I'd feel better about making it default to jsonb semantics ...
>
> If you want the JSONB semantics, why wouldn’t you use JSONB instead of JSON?

Probably in an attempt to bypass parse overhead on ingestion.

Possibly because JSONB silently eats duplicated keys while JSON doesn't (though in that case even casting to JSONB is probably not what you want).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
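A quick illustration of the duplicate-key point, using a made-up literal: json keeps the input text as given, while jsonb keeps only the last value for a repeated key.

```sql
-- The json column shows both "a" entries; the jsonb column keeps only the last.
SELECT '{"a": 1, "a": 2}'::json  AS as_json,
       '{"a": 1, "a": 2}'::jsonb AS as_jsonb;
-- as_jsonb is {"a": 2}
```

So a comparison done by casting json to jsonb would treat {"a": 1, "a": 2} and {"a": 2} as equal, which may or may not be what a test author intends.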
On 5/11/16 7:05 PM, David E. Wheeler wrote:
> On May 11, 2016, at 10:34 AM, Kevin Grittner <kgrittn@gmail.com> wrote:
>
>> I'm not clear enough on your intended usage to know whether these
>> operators are a good fit, but they are sitting there waiting to be
>> used if they do fit.
>
> Huh. I haven’t had any problems with IS DISTINCT FROM for rows, except for the situation in which a failure is thrown because the types vary, say between TEXT and CITEXT. That can drive the tester crazy, since it says something like:
>
> Results differ beginning at row 3:
>     have: (44,Anna)
>     want: (44,Anna)
>
> But overall I think that’s okay; the tester really does want to make sure the type is correct.

Speaking specifically to is(), what I'd find most useful is if it at least hinted that there might be some type shenanigans going on, because I've run across something like your example more than once and it always takes a lot to finally figure out WTF is going on.

I think it'd also be useful to be able to specify an equality operator to is(), though that means not using IS DISTINCT.

Something else to keep in mind here is that is() is defined as is(anyelement, anyelement, text), which means you've lost your original type information when you use it. I don't think you could actually do anything useful here because of that.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
On May 17, 2016, at 7:58 AM, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:

> Probably in an attempt to bypass parse overhead on ingestion.
>
> Possibly because JSONB silently eats duplicated keys while JSON doesn't (though in that case even casting to JSONB is probably not what you want).

It’s also when you’d want text equivalent semantics.

Best,

David
Sorry for the pgTAP off-topicness here, hackers. Please feel free to ignore.

On May 17, 2016, at 8:10 AM, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:

> Speaking specifically to is(), what I'd find most useful is if it at least hinted that there might be some type shenanigans going on, because I've run across something like your example more than once and it always takes a lot to finally figure out WTF is going on.

Agreed. Same for the relation testing functions. Maybe some additional diagnostics could be added in the event of failure.

> I think it'd also be useful to be able to specify an equality operator to is(), though that means not using IS DISTINCT.

You can use cmp_ok(). http://pgxn.org/dist/pgtap/doc/pgtap.html#cmp_ok.

> Something else to keep in mind here is that is() is defined as is(anyelement, anyelement, text), which means you've lost your original type information when you use it. I don't think you could actually do anything useful here because of that.

pg_typeof() will give it to you.

Best,

David
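As a small illustration of the pg_typeof() suggestion, the query below shows two values that print identically but report different types. The values and column aliases are made up for the example, and varchar stands in for citext so that no extension is required.

```sql
-- Both values display as "Anna"; pg_typeof() exposes the type difference.
SELECT 'Anna'::text                AS have,
       pg_typeof('Anna'::text)     AS have_type,
       'Anna'::varchar             AS want,
       pg_typeof('Anna'::varchar)  AS want_type;
-- have_type is text, want_type is character varying
```

Including type names like these in failure diagnostics would be one way to give the hint Jim asks for.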