Re: psycopg3 and adaptation choices

Поиск
Список
Период
Сортировка
От Adrian Klaver
Тема Re: psycopg3 and adaptation choices
Дата
Msg-id 2b9859f0-0964-2baa-b6bc-13f975ae0f67@aklaver.com
обсуждение исходный текст
Ответ на psycopg3 and adaptation choices  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
Ответы Re: psycopg3 and adaptation choices  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
Список psycopg
On 11/7/20 8:32 AM, Daniele Varrazzo wrote:
> Hello,
> 
> As you may know I'm implementing a new psycopg version using
> server-side argument binding. As a real-time test I'm writing a Django
> backend for it, and running Django test suite. This has uncovered
> unexpected behaviours caused by the different ways parameters are
> adapted by the backend as opposite as passed inline in the query.
> 
> I wrote down my findings at
> https://www.varrazzo.com/blog/2020/11/07/psycopg3-adaptation/ and
> would be happy to get some feedback. I'll copy the writing here too:
> please forget the reST.
> 
> Cheers,
> 
> -- Daniele
> 
> ---
> 
> One of the main behavioural differences between psycopg2 and 3, and a big
> enought cause of incompatibility to require a "3", is the use of server-side
> binding vs. client-side (psycopg2 predates the FE-BE protocol supporting it).
> Passing Python objects to the server as separate arguments leads sometimes to
> a behaviour of the server different to what people are used to using psql and
> psycopg2. An example: what do you think::
> 
>      cursor.execute("select %s, %s", ["hello", 10])
> 
> should fetch? I assume it would be desirable to have a record ``("hello",
> 10)`` back - consistently to what you would return if you merged argument
> client-side and queried for ``select 'hello', 10``. However keeping this
> behaviour predictable proved already tricky. When passing arguments to merge
> server-side, the client must pass their string (or binary) representation and,
> optionally, the oid of their types. The most intuitive thing to do would be to
> associate the text type oid (25) to Python strings. This works well for
> queries like the above one, but it proves too strict a type definition in
> context where a cast from text is not allowed. For instance this will fail::
> 
>      cursor.execute("create table testjson(data jsonb)")
>      cursor.execute("insert into testjson (data) values (%s)", ["{}"])
> 
> The latter will raise the same exception you would get in psql if you execute
> ``insert into testjson (data) values ('{}'::text)``. The reason why a ``values
> ('{}')`` works as expected in psql is because the untyped literal is interpreted
> as unknown, and the server can always cast from unknown to any other type. If
> we wanted to insert the data above the query should be rewritten as::
> 
>      cursor.execute("insert into testjson (data) values (%s::jsonb)", ["{}"])
> 
> About a week ago I started the task to write `a Django backend for
> psycopg3`__: running its test suite is proving a great way to observe the
> behaviour of server-side binding in a plethora of different environments. The
> problem of the non-implicit cast from text to jsonb made porting the backend
> from psycopg2 to 3 `quite a tricky matter`__.
> 
> .. __: https://github.com/dvarrazzo/django/commits/psycopg3
> .. __: https://github.com/dvarrazzo/django/commit/1ca8b71ba9bc6acfe239dd42f751037644e59e13
> 
> In order to provide an experience more similar to the use of psql and of
> psycopg2, one possibility is to not pass an oid type for the strings, leaving
> it unknown. When I tried with that approach, passing strings to jsonb fields
> (which is not necessarily what you would do, but it's what Django does it,
> probably to uniform JSON dumping across different adapter) became immediately
> much simpler. So it seems a good choice, but not an overall win: our "hello
> world" query::
> 
>      cursor.execute("select %s, %s", ["hello", 10])
> 
> fails with PostgreSQL 9.6 and older versions, returning an error such as
> *could not determine data type of parameter $1*. PostgreSQL 10 and following
> are more permissive and convert unknown to text on output.
> 
> It seems like there is a choice to make here: mine at the moment is to keep
> the latter behaviour, both because that's what allows to write the simpler
> code with the majority of the supported PostgreSQL versions. People using 9.6
> and previous version would need to adjust to::
> 
>      cursor.execute("select %s::text, %s", ["hello", 10])
> 
> but the use case of inserting data into tables seems a more important use case
> than parroting back a string after a database roundtrip.
> 
> What about the numbers? That's a problem thorny in a different way. Python
> integers are of arbitrary size, so they map better to a subset of the
> ``numeric`` type than to ``int4`` or ``int8``. However there is a similar
> problem, where seemingly familiar expression don't behave as expected. If this
> works in psql::
> 
>      select data -> 'a' -> 2 from testjson;
> 
> a similar statement with 2 as a parameter will fail with an error like
> *operator does not exist: jsonb -> numeric*. Passing unknown as oid would
> work, but now our greeting would return instead ``('hello', '10')``, with the
> number converted to string. This is probably too much of a departure from the
> expected behaviour; at least, unlike the test-to-json case, there is an
> automatic cast from ``numeric`` to integer, so an ``insert into
> integer_field`` works as expected. Only operators and functions whose
> arguments are declared ``integer`` and there is no ``numeric`` version trip
> into an error: JSON's ``->``, binary operators such as ``>>`` and ``&``, and
> few others::
> 
>      piro=# select current_date, current_date + 1;
>       current_date |  ?column?
>      --------------+------------
>       2020-11-07   | 2020-11-08
>      (1 row)
> 
>      piro=# select current_date + 1::numeric;
>      ERROR:  operator does not exist: date + numeric
>      LINE 1: select current_date + 1::numeric;
>                                  ^
>      HINT:  No operator matches the given name and argument types. You might
>      need to add explicit type casts.
> 
> Note that when Postgres says ``integer`` it means 4 bytes signed: defaulting
> the Python ``int`` to the ``bigint`` oid solves no problem - the same
> functions would fail the same way, and defaulting it to ``int4`` I feel it
> would overflow too easily in a 64 bits world.
> 
> How to fix that? as the hint suggests, the user would have to either add a
> cast to the query, which would look like::
> 
>      cur.execute("select current_date + %s::int", [offset])
> 
> or to specify an object that would get converted in Postgres to the oid of
> ``integer``: in psycopg3 I'm testing with adding subclasses of ``int`` called
> ``Int4``, ``Int8`` etc. to map more closely on the Postgres types::
> 
>      cur.execute("select current_date + %s", [Int4(offset)])
> 
> My choices so far are then:
> 
> - cast Python ``str`` to unknown by default (only alternative: ``text``)
> - cast Python ``int`` to ``numeric`` by default (alternatives: ``int4``,
>    ``int8``, ``unknown``).
> 
> Being the adaptation system flexible, people are able to override these
> choices, but I don't expect many people to do it, and doing it process-wise
> might cause interoperation problems across libraries. The behaviour
> out-of-the-box is obviously important and I would like to get the tradeoffs
> right.
> 
> What do you think?

If I'm following correctly in psycopg2 the adapter does type adaption on 
the client side and passes that to server with oid for processing. In 
psycopg3 you are proposing to let the server do more of the type 
adaption and to that end you are looking for lowest common denominator 
type to pass to server and then let it do the final casting. Is that 
about right or am I way off base?


-- 
Adrian Klaver
adrian.klaver@aklaver.com



В списке psycopg по дате отправления:

Предыдущее
От: Daniele Varrazzo
Дата:
Сообщение: psycopg3 and adaptation choices
Следующее
От: Daniele Varrazzo
Дата:
Сообщение: Re: psycopg3 and adaptation choices