Обсуждение: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

Поиск
Список
Период
Сортировка

PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Merlin Göttlinger
Дата:
Hello fellow postgres users,

I am very new to postgres and databases in general. I needed a database for a project and I chose to use PostgreSQL. I googled around until I was confident enough to do the things I wanted to do.

I have written two applications that both use the same database.

The first application was a web application using the playframework. It accessed the database using anorm (the default library for database access from playframework which uses JDBC internally) which was not too happy about datatypes like macaddr or inet. Because you more or less write plain SQL-queries with anorm I was able to just cast those types to varchar when selecting, and from varchar when inserting. But this seemed rather ugly.

With the second application I decided to try another library for database access to see if handles those types better. I chose to use slick because the scala world seems to move towards typesafe more and more. Anyway this library of course also uses JDBC. The new auto generated code feature suggested I should use String in my code as the type of the columns with macaddr and inet which I did. When I tried to select values from the database I was quite happy to see that I got my values without a complaint about wrong types like with anorm. Then I tried inserting and was very disappointed to find out that the library tried to insert the values as varchar and the database refused to accept my inserts because they had the wrong type.

With the help of Craig Ringer I was able to resolve this problem by implementing implicit casts from varchar to macadr and inet but this seems rather unintuitive. I don't know if this is just a beginner problem but at least in my opinion it is rather complicated to use the postgres specific types and features from JDBC and its wrappers.

For example if the database would check if the varchar I try to insert into the column with type macaddr is actually a valid MAC address and then take it as macaddr this would have made my life a lot easier and safed me from hours of googeling and trying around. I don't know what problems this would introduce but I would love to see this made a bit simpler to provide a simpler start for PostgreSQL newcomers like me.

Anyway keep the good work up.

Sincearely,

Merlin Göttlinger

PS: English is not my native language so please excuse spelling or grammar errors.

Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
John R Pierce
Дата:
On 1/28/2014 12:11 PM, Merlin Göttlinger wrote:
> I don't know if this is just a beginner problem but at least in my
> opinion it is rather complicated to use the postgres specific types
> and features from JDBC and its wrappers.


these issues are entirely created by the wrappers you're using. like
most such, they are undoubtedly designed for a lowest common
denominator, and won't have support for database implementation specific
data types, just common stuff like strings, numbers.



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Tom Lane
Дата:
John R Pierce <pierce@hogranch.com> writes:
> On 1/28/2014 12:11 PM, Merlin G�ttlinger wrote:
>> I don't know if this is just a beginner problem but at least in my
>> opinion it is rather complicated to use the postgres specific types
>> and features from JDBC and its wrappers.

> these issues are entirely created by the wrappers you're using. like
> most such, they are undoubtedly designed for a lowest common
> denominator, and won't have support for database implementation specific
> data types, just common stuff like strings, numbers.

I think you can fix it by explicitly casting your placeholders, eg
"?::macaddr".

            regards, tom lane


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
John R Pierce
Дата:
On 1/28/2014 1:20 PM, Tom Lane wrote:
> I think you can fix it by explicitly casting your placeholders, eg
> "?::macaddr".

that might work for a wrapper that lets you roll your own SQL, but I
thought he said one of these autogenerated SQL, taking it out of his
control.



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Mike Christensen
Дата:
I've had the same problem as well with NHibernate (On .NET) with Postgres ENUM types.  Luckily, NHibernate is incredibly powerful and you *can* get everything working flawlessly, however it takes some serious digging into the source code and reading the docs to figure it out.  The main issue is that NHibernate, out of the box, wants to map an ENUM as a number.  For example:

INSERT INTO FOO SomeEnumColumn VALUES (1);

This will cause an error, because PG is looking for a string value (Even though ENUMs are stored as numeric values under the covers).  It's pretty easy to configure NHibernate to convert ENUMs to strings (there's tons of blog posts on that)..  However, this causes NHibernate to write:

INSERT INTO FOO SomeEnumColumn VALUES ('EnumValue'::text);

Which will also cause an error.  I've found the only way around it is to configure NHibernate to treat ENUMs as "Objects" which will simply generate:

INSERT INTO FOO SomeEnumColumn VALUES ('EnumValue'); -- No casting here, yay!

This works.  However, to agree with the original poster's point, if Postgres could be a little more forgiving about values that could be interpreted as correct (like an implicit cast between numeric and enum and string and enum) then we wouldn't have these issues..

Mike


On Tue, Jan 28, 2014 at 1:37 PM, John R Pierce <pierce@hogranch.com> wrote:
On 1/28/2014 1:20 PM, Tom Lane wrote:
I think you can fix it by explicitly casting your placeholders, eg
"?::macaddr".

that might work for a wrapper that lets you roll your own SQL, but I thought he said one of these autogenerated SQL, taking it out of his control.




--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
John R Pierce
Дата:
On 1/28/2014 2:35 PM, Mike Christensen wrote:
> This works.  However, to agree with the original poster's point, if
> Postgres could be a little more forgiving about values that could be
> interpreted as correct (like an implicit cast between numeric and enum
> and string and enum) then we wouldn't have these issues..

it had more implicit casts prior to (I think) 8.3, but there were many
ambiguities where things could be interpreted to mean radically
different sorts of operations, so they tightened things up in 8.3+ (or
was it 8.4+ ?)



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Mike Christensen
Дата:
Interesting!

I'd be curious as to what types of bugs were caused by these implicit casts..

Note 8.3 was in the days back before ORMs became popular, so "just write better SQL" was a perfectly decent solution to the problem back then.  Now days, this requirement might make Postgres incompatible with certain ORMs out there, which is a bummer.  I'm wondering if these ambiguities you speak of could be solved in other ways.  Such as implicitly cast iff the intention is not ambiguous, otherwise raise some sort of "ambiguous" error or default to some behavior.

Mike


On Tue, Jan 28, 2014 at 2:46 PM, John R Pierce <pierce@hogranch.com> wrote:
On 1/28/2014 2:35 PM, Mike Christensen wrote:
This works.  However, to agree with the original poster's point, if Postgres could be a little more forgiving about values that could be interpreted as correct (like an implicit cast between numeric and enum and string and enum) then we wouldn't have these issues..

it had more implicit casts prior to (I think) 8.3, but there were many ambiguities where things could be interpreted to mean radically different sorts of operations, so they tightened things up in 8.3+ (or was it 8.4+ ?)




--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Andrew Sullivan
Дата:
On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote:

> I'd be curious as to what types of bugs were caused by these implicit
> casts..

Typically, they were cases when there was an ambiguity that the
programmer didn't understand, causing applications to blow up in
surprising and wonderful ways.

There are things you can do if you're really prepared for the gun
aimed at your feet.  Since you can create casts in Postgres, you can
actually add back many of the implicit casts yourself.

> Such as implicitly cast iff the intention is not ambiguous

I think if the developers could write code that read minds, they'd be
working on more profitable enterprises ;-)

Best,

A

--
Andrew Sullivan
ajs@crankycanuck.ca


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Mike Christensen
Дата:
How do you create casts in Postgres?


On Tue, Jan 28, 2014 at 3:24 PM, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote:

> I'd be curious as to what types of bugs were caused by these implicit
> casts..

Typically, they were cases when there was an ambiguity that the
programmer didn't understand, causing applications to blow up in
surprising and wonderful ways.

There are things you can do if you're really prepared for the gun
aimed at your feet.  Since you can create casts in Postgres, you can
actually add back many of the implicit casts yourself.

> Such as implicitly cast iff the intention is not ambiguous

I think if the developers could write code that read minds, they'd be
working on more profitable enterprises ;-)

Best,

A

--
Andrew Sullivan
ajs@crankycanuck.ca


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Mike Christensen
Дата:
Oh.  The CREATE CAST command.  Wow, I was totally unaware of this entire feature!


On Tue, Jan 28, 2014 at 3:36 PM, Mike Christensen <mike@kitchenpc.com> wrote:
How do you create casts in Postgres?


On Tue, Jan 28, 2014 at 3:24 PM, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
On Tue, Jan 28, 2014 at 02:55:03PM -0800, Mike Christensen wrote:

> I'd be curious as to what types of bugs were caused by these implicit
> casts..

Typically, they were cases when there was an ambiguity that the
programmer didn't understand, causing applications to blow up in
surprising and wonderful ways.

There are things you can do if you're really prepared for the gun
aimed at your feet.  Since you can create casts in Postgres, you can
actually add back many of the implicit casts yourself.

> Such as implicitly cast iff the intention is not ambiguous

I think if the developers could write code that read minds, they'd be
working on more profitable enterprises ;-)

Best,

A

--
Andrew Sullivan
ajs@crankycanuck.ca


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Andrew Sullivan
Дата:
On Tue, Jan 28, 2014 at 03:38:49PM -0800, Mike Christensen wrote:
> Oh.  The CREATE CAST command.  Wow, I was totally unaware of this entire
> feature!

See, this is why Postgres really is better than you ever think.  Just
when you're convinced that you have a totally impossible problem, it
turns out that one of the brilliant developers has not only thought of
it, but has fixed it while you weren't looking.  In a large-scale
generalized way that doesn't have whirling knives sticking out of it.
(Yes, I have been using some other DBMSes recently, and no, I don't
like it.)

A

--
Andrew Sullivan
ajs@crankycanuck.ca


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Craig Ringer
Дата:
On 01/29/2014 05:20 AM, Tom Lane wrote:
> John R Pierce <pierce@hogranch.com> writes:
>> On 1/28/2014 12:11 PM, Merlin Göttlinger wrote:
>>> I don't know if this is just a beginner problem but at least in my
>>> opinion it is rather complicated to use the postgres specific types
>>> and features from JDBC and its wrappers.
>
>> these issues are entirely created by the wrappers you're using. like
>> most such, they are undoubtedly designed for a lowest common
>> denominator, and won't have support for database implementation specific
>> data types, just common stuff like strings, numbers.
>
> I think you can fix it by explicitly casting your placeholders, eg
> "?::macaddr".

You're both missing the point here, because you're thinking about it
like these people are hand-writing SQL with string literals, and can
just edit their SQL to insert some casts.

They aren't, and they can't. They're using generated SQL - and I
strongly suspect the vast majority of Pg users are, with Hibernate,
ActiveRecord, etc etc etc.

They're also not using literals. They're using bind parameters, because
they're sensible or because and their tools encourage or force them to
do so. When using bind params you don't have PostgreSQL's convenient
'unknown' literal pseudo-type. This works:

    CREATE TABLE mac(addr macaddr);

    INSERT INTO mac(addr) VALUES ('00:AB:CD:EF:01:23')

but what most frameworks and client drivers will do is the
protocol-level equivalent of:

    PREPARE insert_stmt(text) AS INSERT INTO mac(addr) VALUES ($1);

    EXECUTE insert_stmt('00:AB:CD:EF:01:23');

which will bail out with:

regress=> PREPARE insert_stmt(text) AS
          INSERT INTO mac(addr) VALUES ($1);

ERROR:  column "addr" is of type macaddr but expression is of type text

This is only a problem in the first place because PostgreSQL is much,
much stricter than anything else about typing of inputs while also
having lots of handy types nothing else has. Users want to be able to
use these types, but find it difficult and painful because PostgreSQL
rejects inputs for them because it's being painfully pedantic about a
few details.

I just want us to allow, by default, implicit casts FROM text (not TO
text) using the input function for all PostgreSQL's validated
non-standard types (and XML due to limited deployment of SQL/XML support
in client drivers).

So you should be able to write:

    INSERT INTO mac(addr) VALUES ('00:AB:CD:EF:01:23'::text)

I know that's not beautiful, pure typing behaviour, and it'd be nice if
we could just complain or refuse it. But really, the _in function will
parse and reject invalid input, and if it's valid input, why are we
complaining?


We should allow implicit casts, using input functions, for at least:

* ENUMs
* xml, json, hstore
* macaddr, cidr, inet
* point, circle, ...
* uuid

as these are all types that are subject to validation on input, and
aren't going to have direct client driver support.


Here are "solutions" that *won't* work:


"Just bind string parameters as 'unknown'"
----

You can bind params as 'unknown' and let the planner figure it out:

    PREPARE insert_stmt(unknown) AS INSERT INTO mac(addr) VALUES ($1);
    EXECUTE insert_stmt('00:AB:CD:EF:01:23');

but if a driver sends all string-type data from the client language as
'unknown', that will cause overloaded functions that accept 'text' or
'varchar' to fail when they used to succeed with a concrete text-typed
bind parameter.



"Just fix the framework":
----

You *can't* "just cast the input" when you're working via pretty much
anything except directly written SQL. I prefer to do that, but I'm in
the minority - more and more people are using query generators and ORMs.
To use json, xml, macaddr, inet, etc users need to write extensions to
their tools to teach them about those types and map them to language types.

Sometimes the framework is hidden behind more layers. For example,
PgJDBC is often used by Hibernate or EclipseLink, *via the JPA2 API*,
another layer that makes it even harder to implement custom type mappings.



"Create the casts"
----

Alternately, users have to figure out how to CREATE CAST - and that it's
possible. Users can CREATE CAST a weaker cast for the type, but:

* If there's already a cast for the type in the catalogs you may have to
hack the catalogs instead, as CREATE CAST will fail; and

* It's a convoluted process requiring wrapper functions for most simple
inputs - because you can't:
    CREATE CAST (text AS json) WITH FUNCTION json_in(cstring)
Instead you need a wrapper function that takes "text", even though text
is implicitly castable to cstring.

More importantly, you need to know about CREATE CAST, and figure all
this out. Which sucks for people who just want to use "json".


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Tom Lane
Дата:
Mike Christensen <mike@kitchenpc.com> writes:
> Oh.  The CREATE CAST command.  Wow, I was totally unaware of this entire
> feature!

Before you get too excited about inserting your own implicit casts,
you really should get familiar with the reasons there aren't ones
already ;-).

As was mentioned upthread, we used to have rather more implicit casts than
we do today.  We took a bunch of them out in 7.3, and some more in 8.3,
resulting in much gnashing of teeth each time.  Some of it from people who
later admitted that the exercise of cleaning up their SQL code had exposed
previously-unrecognized bugs in their applications.

The more or less canonical example of what unrestrained implicit casting
can do to you is here:
http://www.postgresql.org/message-id/b42b73150702191339t71edd1bxa2510df0c4d75876@mail.gmail.com
in which an expression like this
    ((now() - '1 day'::interval)::timestamp - now()) < 0
behaved in a way not only nonsensical but locale-dependent.  Although it
looks sane on first glance, the left side actually produces a result of
type interval, and there is no "interval < integer" operator.  Modern
versions of PG tell you so, but what happened pre-8.3 was that the parser
would find the implicit coercions from interval to text and integer to
text, and conclude that it could legally implement this expression by
coercing both sides to text and applying the "text < text" operator.
Well, it ran, but it didn't give the results the user expected.

Variants on the theme can be found throughout our mail archives, at
least up till the pre-8.3 releases dropped out of general use.
Some fun ones I found in a quick troll:
http://www.postgresql.org/message-id/1536369C345BD4118148000629C9833D57EAB5@nifty.preston.traveltech.co.uk
http://www.postgresql.org/message-id/45D4E5A7.9060702@wykids.org
http://www.postgresql.org/message-id/E1Bg5qd-0001E8-00@ms2.city.ac.uk
http://www.postgresql.org/message-id/2793.1037034592@sss.pgh.pa.us
http://www.postgresql.org/message-id/12659.1071876784@sss.pgh.pa.us

The design rule we use now, which seems generally successful at preventing
such surprising behaviors, is to allow a cast to be implicit only when it
is a non-information-losing coercion from one type to another one in the
same basic type category.  For instance, integer to numeric is fine
(since, for instance, numeric comparison of two values is generally going
to be consistent with integer semantics).  Integer to text, not so much.

In the particular cases being mentioned here, such as enum versus text,
we'd be opening people up to this type of hurt if we added implicit casts,
because you could very well get a textual comparison where you'd expected
an enum-ordering-based comparison.  Or vice versa.

So we're not going there; we've already been there, and not liked it.
But you're free to repeat our mistakes if you insist.

            regards, tom lane


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Tom Lane
Дата:
Craig Ringer <craig@2ndquadrant.com> writes:
> I just want us to allow, by default, implicit casts FROM text (not TO
> text) using the input function for all PostgreSQL's validated
> non-standard types (and XML due to limited deployment of SQL/XML support
> in client drivers).

Sorry, that is *just* as dangerous as implicit casts to text were.
It would bite a different set of queries, but not any less painfully.

I have about zero sympathy for ORMs that aren't willing to address
this issue properly.  Yeah, it might be less than trivial, but that
doesn't mean that the right answer is to create semantic hazards
on the server side.

            regards, tom lane


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
David Johnston
Дата:
Tom Lane-2 wrote
> Craig Ringer <

> craig@

> > writes:
>> I just want us to allow, by default, implicit casts FROM text (not TO
>> text) using the input function for all PostgreSQL's validated
>> non-standard types (and XML due to limited deployment of SQL/XML support
>> in client drivers).
>
> Sorry, that is *just* as dangerous as implicit casts to text were.
> It would bite a different set of queries, but not any less painfully.
>
> I have about zero sympathy for ORMs that aren't willing to address
> this issue properly.  Yeah, it might be less than trivial, but that
> doesn't mean that the right answer is to create semantic hazards
> on the server side.

Less dangerous since an error will eventually be thrown on non-conforming
input but we've still moved what is now a parse-time error into a runtime
error.

The "correct" solution would seem to be for ORMs to "bind unknown" against
the input but only when the supplied string is meant to be a representation
of a PostgreSQL type as opposed to being actual string data. The ORM is free
to provide the necessary API to distinguish between the two and the caller
has to know then database to call the proper method (i.e., not setString if
your data intended for anything besides a text/varchar column).

I'm not sure how function overloading can be solved no matter which implicit
cast procedure methodology you choose; though I haven't ponder it much.  The
goal should be for the ORM to basically tell PostgreSQL "here's some data,
you deal with it. That is exactly what the "unknown" moniker is for.  But if
they tell us it's a text typed value we believe them.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PostgreSQL-specific-datatypes-very-confusing-for-beginners-who-use-wrappers-around-JDBC-tp5789352p5789429.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Craig Ringer
Дата:
On 01/29/2014 02:01 PM, Tom Lane wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> I just want us to allow, by default, implicit casts FROM text (not TO
>> text) using the input function for all PostgreSQL's validated
>> non-standard types (and XML due to limited deployment of SQL/XML support
>> in client drivers).
>
> Sorry, that is *just* as dangerous as implicit casts to text were.
> It would bite a different set of queries, but not any less painfully.

I'd be interested in some examples of that. It's clear that implicit
casts *to* text were unsafe, but what makes implicit casts *from* text
*to* validated types with specific structures just as bad?

How does allowing an implicit cast from 'text' to 'macaddr' create the
opportunity for an undetected error?

Or 'hstore' ?

I'm not seeing the risk here.



This is a painful issue for a significant group; you can find them on
Stack Overflow, Rails and Django and Hibernate discussion boards, Java
user groups, etc. They've been taught not to care about the DB and write
"database-agnostic" code, but they're seeing Pg features that're so
useful that they'd like to bend that and start using some Pg features.
Only to find they can't do it without throwing away everything they have.

I wish ORMs would go away sometimes too, and I recognise that there are
certain kinds of broken and stupid that it makes no sense to cater to. I
just don't think this is one of them - this problem is universal, I
can't think of an ORM that *doesn't* have it, and it's created by
PostgreSQL, not the ORMs.

I guess this comes down to whether the goal is to be like Haskell -
pure, perfect, and used by nobody for anything real - or a pragmatic
tool for real world productive use.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Craig Ringer
Дата:
On 01/29/2014 02:36 PM, David Johnston wrote:
>
> The "correct" solution would seem to be for ORMs to "bind unknown" against
> the input but only when the supplied string is meant to be a representation
> of a PostgreSQL type as opposed to being actual string data. The ORM is free
> to provide the necessary API to distinguish between the two and the caller
> has to know then database to call the proper method (i.e., not setString if
> your data intended for anything besides a text/varchar column).

I certainly agree that that's the ideal, and it's closer to achievable
than any other fix to these sorts of systems.

A challenge I've found when approaching this from the ORM side has been
getting people to care. The sentiment has tended to be along the lines
of: No other DBMS does this or requires this, why do we have to jump
through hoops just to make PostgreSQL happy?

I'm not claiming that's a good reason for inaction.

I think there's more hope of getting ORM systems to differentiate
between "unknown-typed literal" and "concrete text-typed literal" than
actually implementing proper support for Pg's numerous useful types. Not
much more hope, but some.

Look at the example that started this thread, though. The stack is:

PostgreSQL
  PgJDBC
    Java JDBC API
      EBean ORM
        Play! Framework

and *every level* needs to have a clue about this or a way to pass the
information trough transparently.

Now think about Hibernate, Sequel ORM, Django ORM, ActiveRecord,
EclipseLink, OpenJPA, DataMapper, Entity Framework, EBean, TopLink,
iBatis/MyBatis, Doctrine, Propel, CakePHP, Zend, SQLAlchemy, ....

Wouldn't it be nice if we could find a solution to this user pain point
in one place?

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Florian Weimer
Дата:
On 01/29/2014 09:07 AM, Craig Ringer wrote:

> A challenge I've found when approaching this from the ORM side has been
> getting people to care. The sentiment has tended to be along the lines
> of: No other DBMS does this or requires this, why do we have to jump
> through hoops just to make PostgreSQL happy?

Is this true?  Can you use other JDBC drivers (except SQLite) to insert
Java Strings into NUMERIC columns and Java ints into text columns?

> Look at the example that started this thread, though. The stack is:
>
> PostgreSQL
>    PgJDBC
>      Java JDBC API
>        EBean ORM
>          Play! Framework
>
> and *every level* needs to have a clue about this or a way to pass the
> information trough transparently.
>
> Now think about Hibernate, Sequel ORM, Django ORM, ActiveRecord,
> EclipseLink, OpenJPA, DataMapper, Entity Framework, EBean, TopLink,
> iBatis/MyBatis, Doctrine, Propel, CakePHP, Zend, SQLAlchemy, ....
>
> Wouldn't it be nice if we could find a solution to this user pain point
> in one place?

What about using types on the PostgreSQL side which match the
application types?

In any case, use *can* use strings everywhere if you use the
stringtype=unspecified connection parameter:

<http://jdbc.postgresql.org/documentation/92/connect.html#connection-parameters>

--
Florian Weimer / Red Hat Product Security Team


Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Merlin Moncure
Дата:
On Wed, Jan 29, 2014 at 2:07 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 01/29/2014 02:36 PM, David Johnston wrote:
>>
>> The "correct" solution would seem to be for ORMs to "bind unknown" against
>> the input but only when the supplied string is meant to be a representation
>> of a PostgreSQL type as opposed to being actual string data. The ORM is free
>> to provide the necessary API to distinguish between the two and the caller
>> has to know then database to call the proper method (i.e., not setString if
>> your data intended for anything besides a text/varchar column).
>
> I certainly agree that that's the ideal, and it's closer to achievable
> than any other fix to these sorts of systems.
>
> A challenge I've found when approaching this from the ORM side has been
> getting people to care. The sentiment has tended to be along the lines
> of: No other DBMS does this or requires this, why do we have to jump
> through hoops just to make PostgreSQL happy?
>
> I'm not claiming that's a good reason for inaction.
>
> I think there's more hope of getting ORM systems to differentiate
> between "unknown-typed literal" and "concrete text-typed literal" than
> actually implementing proper support for Pg's numerous useful types. Not
> much more hope, but some.
>
> Look at the example that started this thread, though. The stack is:
>
> PostgreSQL
>   PgJDBC
>     Java JDBC API
>       EBean ORM
>         Play! Framework
>
> and *every level* needs to have a clue about this or a way to pass the
> information trough transparently.
>
> Now think about Hibernate, Sequel ORM, Django ORM, ActiveRecord,
> EclipseLink, OpenJPA, DataMapper, Entity Framework, EBean, TopLink,
> iBatis/MyBatis, Doctrine, Propel, CakePHP, Zend, SQLAlchemy, ....
>
> Wouldn't it be nice if we could find a solution to this user pain point
> in one place?

Hi Merlin!

The solution is obvious: "fix the ORM, or stop using it".  Don't even
get me started on hibernate -- it reserves (or at least did for a very
long time) the colon character to itself in an inescapable fashion and
does lots of other stupid things that are annoying in the extreme.

If you use a library that writes your SQL for you, you're just going
to have to limit your database features to what the ORM supports.
IMSNHO Any technology that hides the SQL statement from the programmer
or hacks it up in some unpreventable way should be avoided.  It's not
the database's job to work around them.  To those of you stuck in ORM
limbo, my advice would be to stick to basic types.  I would also
advise keeping as much business logic in the database as possible to
make the inevitable porting effort into a more intelligently designed
application stack easier.

merlin


Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Adrian Klaver
Дата:
On 01/29/2014 12:07 AM, Craig Ringer wrote:
> On 01/29/2014 02:36 PM, David Johnston wrote:
>>
>> The "correct" solution would seem to be for ORMs to "bind unknown" against
>> the input but only when the supplied string is meant to be a representation
>> of a PostgreSQL type as opposed to being actual string data. The ORM is free
>> to provide the necessary API to distinguish between the two and the caller
>> has to know then database to call the proper method (i.e., not setString if
>> your data intended for anything besides a text/varchar column).
>
> I certainly agree that that's the ideal, and it's closer to achievable
> than any other fix to these sorts of systems.
>
> A challenge I've found when approaching this from the ORM side has been
> getting people to care. The sentiment has tended to be along the lines
> of: No other DBMS does this or requires this, why do we have to jump
> through hoops just to make PostgreSQL happy?
>
> I'm not claiming that's a good reason for inaction.
>
> I think there's more hope of getting ORM systems to differentiate
> between "unknown-typed literal" and "concrete text-typed literal" than
> actually implementing proper support for Pg's numerous useful types. Not
> much more hope, but some.
>
> Look at the example that started this thread, though. The stack is:
>
> PostgreSQL
>    PgJDBC
>      Java JDBC API
>        EBean ORM
>          Play! Framework
>
> and *every level* needs to have a clue about this or a way to pass the
> information trough transparently.

And therein lies the problem, in the pursuit of simplicity, application
developers have embraced complicated stacks. A change at any point in
the above stack has the potential to unravel the whole system. So it is
not obvious to me that 'fixing' one end of the stack is going to solve
the problem as a whole.

>
> Now think about Hibernate, Sequel ORM, Django ORM, ActiveRecord,
> EclipseLink, OpenJPA, DataMapper, Entity Framework, EBean, TopLink,
> iBatis/MyBatis, Doctrine, Propel, CakePHP, Zend, SQLAlchemy, ....
>
> Wouldn't it be nice if we could find a solution to this user pain point
> in one place?

See above, I do not think that is possible.
>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Andrew Sullivan
Дата:
On Wed, Jan 29, 2014 at 03:05:23PM +0800, Craig Ringer wrote:
> This is a painful issue for a significant group

Yes.

> They've been taught not to care about the DB and write
> "database-agnostic" code, but they're seeing Pg features that're so
> useful that they'd like to bend that and start using some Pg features.
> Only to find they can't do it without throwing away everything they have.

Also yes.  This is important.  The _whole point_ of ORMs is that
they're hiding the underlying details of the database implementation;
in practice, this turns out to be lowest common denominator where
"lowest" is pretty low.  "Database-agnostic" code is, by definition,
not going to use database-specific features.

The idea that you can have a database-agnostic ORM that can use all
the clever features of the underlying database system is just
confused.  You can't have this both ways, and other strongly-typed
database systems don't do a whole lot better with this (often, you
will discover that the automatically-generated schemas these ORMs
produce use type text everywhere, for exactly this reason).  People
who insist that this ought to be possible in the general case are
saying, "I want a pony."

What you might do, however, is generate a bunch of CREATE CAST
statements for the implicit casts from text you want.  This is
dangerous for all the reasons Tom noted, but it might be actually good
enough for the ORM cases you're worried about.  If you think that,
maybe the right answer is to start up a project like
"ORM-cast-footgun" or whatever and create the relevant casts in a
generalized way.

> I guess this comes down to whether the goal is to be like Haskell -
> pure, perfect, and used by nobody for anything real - or a pragmatic
> tool for real world productive use.

That's a fun false dichotomy, but Postgres is a general purpose tool
and therefore needs to ship by default with the safest general purpose
behaviour.  Unlike many other systems, however, Postgres actually
gives you the power to adjust its behaviour according to your use
case, and there's no reason not to use that when appropriate.

Best regards,

A

--
Andrew Sullivan
ajs@crankycanuck.ca


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Tom Lane
Дата:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 01/29/2014 02:01 PM, Tom Lane wrote:
>> Sorry, that is *just* as dangerous as implicit casts to text were.
>> It would bite a different set of queries, but not any less painfully.

> I'd be interested in some examples of that. It's clear that implicit
> casts *to* text were unsafe, but what makes implicit casts *from* text
> *to* validated types with specific structures just as bad?

The problem with an implicit cast is that it might or might not happen,
either way being contrary to the user's nearly-subconscious expectation.
If the cast isn't applied, then validation that might have been done
by the destination type is moot.

In the cases where current policy deems an implicit cast to be safe,
it doesn't matter a whole lot because the semantics of most operators
are about the same for either the source or destination type; even
if the user doesn't understand exactly which operator will be used for
"2 < 2.5", he's unlikely to be surprised by the results.  This happy
state of affairs doesn't hold for casts between text and some random
other type though.

Or in short, the risk factor here is that the user might write his
query assuming that an implicit cast from text would get applied,
but it doesn't and the operation proceeds using textual semantics.
This'd affect a different set of queries from the cases with
implicit casts to text, but the outcome is rather similar.

Prior to 7.3, Postgres did actually allow a bunch of implicit
coercions from text, and they caused their share of problems.

> This is a painful issue for a significant group; you can find them on
> Stack Overflow, Rails and Django and Hibernate discussion boards, Java
> user groups, etc. They've been taught not to care about the DB and write
> "database-agnostic" code, but they're seeing Pg features that're so
> useful that they'd like to bend that and start using some Pg features.
> Only to find they can't do it without throwing away everything they have.

Well, that's the end result of being "database agnostic", when it's
defined in the terms these libraries have traditionally used, which
really is "we're going to pick a tiny subset of SQL and code strictly
to that".

I think the burden is on them to figure out how their abstractions
should be generalized.  What you're suggesting is that we poke a
big hole in our type system to let them avoid doing the first round
of necessary fixes --- but most likely, there are still things they'd
need to change even if we did so, leaving us with a hole in our type
system and not much to show for it.

> I wish ORMs would go away sometimes too, and I recognise that there are
> certain kinds of broken and stupid that it makes no sense to cater to. I
> just don't think this is one of them - this problem is universal, I
> can't think of an ORM that *doesn't* have it, and it's created by
> PostgreSQL, not the ORMs.

Uh, no, it's created by ORMs that haven't heard of type extensibility.
The reason they don't have this problem with other databases is exactly
because those other databases don't have type extensibility.

            regards, tom lane


Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Adrian Klaver
Дата:
On 01/29/2014 08:29 AM, Tom Lane wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> On 01/29/2014 02:01 PM, Tom Lane wrote:

>
>> I wish ORMs would go away sometimes too, and I recognise that there are
>> certain kinds of broken and stupid that it makes no sense to cater to. I
>> just don't think this is one of them - this problem is universal, I
>> can't think of an ORM that *doesn't* have it, and it's created by
>> PostgreSQL, not the ORMs.
>
> Uh, no, it's created by ORMs that haven't heard of type extensibility.
> The reason they don't have this problem with other databases is exactly
> because those other databases don't have type extensibility.

Agreed. An ORM that has tackled this issue is SQLAlchemy. It has the
concept of database dialects and uses that in hand with the extendable
sqlalchemy.types to deal with database specific types.


>
>             regards, tom lane
>
>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Kevin Grittner
Дата:
Florian Weimer <fweimer@redhat.com> wrote:

> Can you use other JDBC drivers (except SQLite) to insert
> Java Strings into NUMERIC columns and Java ints into text
> columns?

The JDBC specification can be downloaded from here:

http://download.oracle.com/otndocs/jcp/jdbc-4_1-mrel-spec/index.html

Check out appendix B.  According to the charts there, any of these
Java types (from table B-5):

  String
  java.math.BigDecimal
  Boolean
  Byte
  Short
  Integer
  Long
  Float
  Double

should be assignable using setObject and setNull to these JDBC
target types:

  TINYINT
  SMALLINT
  INTEGER
  BIGINT
  REAL
  FLOAT
  DOUBLE
  DECIMAL
  NUMERIC  BIT
  BOOLEAN
  CHAR
  VARCHAR
  LONGVARCHAR

In addition (from the same table) String should be assignable to:

  BINARY
  VARBINARY
  LONGVARBINARY
  DATE
  TIME
  TIMESTAMP

So, it's at least partly a question of whether we want to conform
to the JDBC specification.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

От
Craig Ringer
Дата:
On 01/29/2014 08:03 PM, Florian Weimer wrote:
> On 01/29/2014 09:07 AM, Craig Ringer wrote:
>
>> A challenge I've found when approaching this from the ORM side has been
>> getting people to care. The sentiment has tended to be along the lines
>> of: No other DBMS does this or requires this, why do we have to jump
>> through hoops just to make PostgreSQL happy?
>
> Is this true?  Can you use other JDBC drivers (except SQLite) to insert
> Java Strings into NUMERIC columns

That'll require direct JDBC tests, which I don't presently have time to
set up for the DBs of interest (downloading demo versions or finding
ones I can mess with, etc).

The following tests based on simple SQL expressions, using CASTs to
force interpretation of values as character-typed, should be informative
though.

> and Java ints into text columns?

I wasn't ever suggesting that; that's why I'm being quite specific about
referring to implicit casts *FROM* text. Not tested.



In the following tests the client interface used is JDBC, but it's used
to execute statements directly, not with bind params. So it's more like
running the statement in psql or equivalent; for that reason, CASTs are
used to force explicit types.


Oracle: Behaves much like PostgreSQL for its own custom types, and
SQL/XML, but permissive for NUMERIC and TIMESTAMP (appears to fit the
JDBC spec):

http://sqlfiddle.com/#!4/cc065/5
http://sqlfiddle.com/#!4/cc065/6
http://sqlfiddle.com/#!4/1adc6/4



MS SQL server 2012: permissive

http://sqlfiddle.com/#!6/57662/3


PostgreSQL: Strict

http://sqlfiddle.com/#!15/596f78/2
http://sqlfiddle.com/#!15/596f78/3
http://sqlfiddle.com/#!15/596f78/4
http://sqlfiddle.com/#!15/596f78/6


MySQL 5.5: Permissive (surprise!)

http://sqlfiddle.com/#!2/89152



I don't really care about dredging up DB2, Firebird, etc.

So of the set tested, we're the strictest, Oracle is next-strictest and
looks like it's as strict as it can be while remaining JDBC compliant. I
was surprised to see that it won't accept character literal input for
its XmlType and UriType.

Other servers are more permissive about inputs.

> What about using types on the PostgreSQL side which match the
> application types?

So, unless your language has a native json type, or standard json
library that the PostgreSQL client driver can rely on being used, you
shouldn't be able to use json in PostgreSQL?

Not convinced.

> In any case, use *can* use strings everywhere if you use the
> stringtype=unspecified connection parameter:
>
> <http://jdbc.postgresql.org/documentation/92/connect.html#connection-parameters>

Yes, as I mentioned upthread, this is true for JDBC with the caveat that
it'll cause problems with function overload resolution.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services