Обсуждение: Executing on the connection?

Поиск
Список
Период
Сортировка

Executing on the connection?

От
Daniele Varrazzo
Дата:
Hello,

if there is a thing that people approaching psycopg find confusing is
the connection/cursor duality. The connection object as a wrapper for
the database connection is clear to understand for anyone who has used
psql. But the cursor as a means to give command is spurious: you can
give commands but all the cursors are serialised, not parallelised, on
the same connection, and it isn't useful for transactions either...

The only unique thing a cursor has is to hold a result and consume it;
for the rest it doesn't own neither the postgres session nor the
transaction. But it gives them command. Weird.

You are surely familiar with the psycopg2 usage pattern:

    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    cur.execute(query, params):
    for record in cur:
        ... # do something

The cursor() can take parameters, e.g. to create server-side "named"
cursors, but most often people will use the standard client-side
cursor, which is a lightweight object, little more than a wrapper for
a PGresult.

One little change I've made to psycopg3 cursors is to make it return
"self" on execute() (it currently returns None, so it's totally
unused). This allows chaining a fetch operation right after execute,
so the pattern above can be reduced to:

    conn = psycopg3.connect(dsn)
    cur = conn.cursor()
    record = cur.execute(query, params).fetchone()
    # or
    for record in cur.execute(query, params):
        ... # do something

And of course nastymess such as:

    conn.cursor().execute(query).fetchone()
    psycopg3.connect(dsn).cursor().execute(query).fetchone()

But, taste.

I'm toying with the idea of adding a 'connection.execute(query,
[params])' methd, which would basically just create a cursor
internally, query on it, and return it. No parameter could be passed
to the cursor() call, so it could only create the most standard,
client-side cursor (or whatever the default for the connection is, if
there is some form of cursor_factory, which hasn't been implemented in
psycopg3 yet). For anything more fancy, cursor() should be called
explicitly.

As a result people could use:

    conn = psycopg3.connect(dsn)
    record = conn.execute(query, params).fetchone()
    # or
    for record in conn.execute(query, params):
        ... # do something

No other methods bloating the connection interface: no executemany(),
copy(), callproc (actually there will be no callproc at all in
psycopg3: postgres has no fast path for function call and too much
semantics around stored procedure that a single callproc() couldn't
cover).

Being the cursor client-side, its close() doesn't actually do anythin
apart from making it unusable, so just disposing of it without calling
close() is totally safe.

Thoughts?

Cheers!

-- Daniele



Re: Executing on the connection?

От
Christophe Pettus
Дата:

> On Dec 2, 2020, at 03:20, Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote:
> One little change I've made to psycopg3 cursors is to make it return
> "self" on execute() (it currently returns None, so it's totally
> unused).

Very nice!

> As a result people could use:
>
>    conn = psycopg3.connect(dsn)
>    record = conn.execute(query, params).fetchone()
>    # or
>    for record in conn.execute(query, params):
>        ... # do something

+1.  I think it would be a handy addition.

--
-- Christophe Pettus
   xof@thebuild.com




Re: Executing on the connection?

От
Rory Campbell-Lange
Дата:
On 02/12/20, Daniele Varrazzo (daniele.varrazzo@gmail.com) wrote:
> I'm toying with the idea of adding a 'connection.execute(query,
> [params])' methd, which would basically just create a cursor
> internally, query on it, and return it. No parameter could be passed
> to the cursor() call, so it could only create the most standard,
> client-side cursor (or whatever the default for the connection is, if
> there is some form of cursor_factory, which hasn't been implemented in
> psycopg3 yet). For anything more fancy, cursor() should be called
> explicitly.
> 
> As a result people could use:
> 
>     conn = psycopg3.connect(dsn)
>     record = conn.execute(query, params).fetchone()
>     # or
>     for record in conn.execute(query, params):
>         ... # do something

The only issue I see with this sort of approach is the problem of having
to set the search path, which I believe needs to be done for each
cursor where custom search paths are used.

Maybe there can be a with context to allow explicit cursor setup and
teardown around an execute statement if needed, or to provide settings
to the execute function.

> No other methods bloating the connection interface: no executemany(),
> copy(), callproc (actually there will be no callproc at all in
> psycopg3: postgres has no fast path for function call and too much
> semantics around stored procedure that a single callproc() couldn't
> cover).

Ah. We presently use callproc a lot, together with NamedTupleConnection,
which works brilliantly for us.

Rory



Re: Executing on the connection?

От
Marco Beri
Дата:
On Wed, 2 Dec 2020 at 12:20, Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote:
One little change I've made to psycopg3 cursors is to make it return
"self" on execute() (it currently returns None, so it's totally
unused). This allows chaining a fetch operation right after execute,
so the pattern above can be reduced to:

    conn = psycopg3.connect(dsn)
    cur = conn.cursor()
    record = cur.execute(query, params).fetchone()
    # or
    for record in cur.execute(query, params):
        ... # do something 
 

I'm toying with the idea of adding a 'connection.execute(query,
[params])' methd, which would basically just create a cursor
internally, query on it, and return it. No parameter could be passed
to the cursor() call, so it could only create the most standard,
client-side cursor (or whatever the default for the connection is, if
there is some form of cursor_factory, which hasn't been implemented in
psycopg3 yet). For anything more fancy, cursor() should be called
explicitly.

As a result people could use:

    conn = psycopg3.connect(dsn)
    record = conn.execute(query, params).fetchone()
    # or
    for record in conn.execute(query, params):
        ... # do something

No other methods bloating the connection interface: no executemany(),
copy(), callproc (actually there will be no callproc at all in
psycopg3: postgres has no fast path for function call and too much
semantics around stored procedure that a single callproc() couldn't
cover).

Being the cursor client-side, its close() doesn't actually do anythin
apart from making it unusable, so just disposing of it without calling
close() is totally safe.

Thoughts?

I like it a lot!

Ciao.
Marco.

Re: Executing on the connection?

От
Daniele Varrazzo
Дата:
On Wed, 2 Dec 2020 at 12:48, Rory Campbell-Lange
<rory@campbell-lange.net> wrote:

> The only issue I see with this sort of approach is the problem of having
> to set the search path, which I believe needs to be done for each
> cursor where custom search paths are used.

No, being the search path a property of the session, it is a property
of the connection, and all the cursors on the same connection will
share the same.

The use case suggested is only for the barebone use of connection +
cursor + execute + fetch. I wouldn't advise it as the most generic
usage, only as a shortcut.


> > ...actually there will be no callproc at all in
> > psycopg3...

> Ah. We presently use callproc a lot, together with NamedTupleConnection,
> which works brilliantly for us.

The problem with callproc is that it has a confusing semantic in
itself: from the DBAPI
<https://www.python.org/dev/peps/pep-0249/#callproc>:

    """
    The result of the call is returned as modified copy of the input sequence.
    Input parameters are left untouched, output and input/output
parameters replaced
    with possibly new values.
    """

which 1) is silly unless you are writing C, 2) in postgres you don't
know if the params are I/O unless you do extra query to query the
catalog. Maybe it's modelled after databases using this precise same
interface, but postgres doesn't.

On top of that, Postgres also provides named parameters, which are
supported in psycopg2, and were in psycopg3. You can see how sprawling
it was to convert named and positional parameters from this commit,
where I finally pulled out the plug

<https://github.com/psycopg/psycopg3/commit/9369a3b3717c12d4a398fe0934994ad19cbc564a#diff-9e33206c8c2b3b18daec4b778457f6fc54cf03754747cd476833048fc374b9e6L364>.

On top of on the top of that, recent PostgreSQL added a new concept of
procedure, different from what is called stored procedure, and which
require CALL instead of SELECT:
<https://www.postgresql.org/docs/13/sql-call.html> Again, without
knowing the catalog, you cannot tell by name which is which, but
people would expect us to know:
<https://github.com/psycopg/psycopg2/issues/1155>.

If you know which parameters are positional and which are named, if
you know whether a stored procedure is a function or a new procedure,
users can just use cursor.execute(): it will be faster than composing
SQL dynamically. Using the newly available fetch*() chaining you can
get results directly from the call, which in psycopg2 needs two lines
of code:

    result = cur.execute("SELECT my_function(%s, 42, foo=%s)", ["bar",
baz]).fetchone()

This is pretty much the most efficient thing that can be done (bar
making it a prepared statement), because the libpq/postgres don't
offer any fast-path for function calls. There is this
<https://www.postgresql.org/docs/current/libpq-fastpath.html>, but
it's deprecatish, non-feature-complete, and requires a pg_catalog
lookup to find the function oid.

Hope this explains the rationale behind dropping callproc (which is an
optional method of the DBAPI so it doesn't make us less
standard-compliant by not implementing it). Feedback is welcome anyway
:)

Cheers,

-- Daniele



Re: Executing on the connection?

От
Vladimir Ryabtsev
Дата:
I support this idea. Other drivers have such shortcuts (or even lack cursor() equivalent at all).

I would even prefer a shorter shortcut, like

conn.fetchone(query, params)

But it breaks parameter consistency at some degree and bloats connection interface, so it's up to you.

Vladimir

On Wed, 2 Dec 2020 at 03:20, Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote:
Hello,

if there is a thing that people approaching psycopg find confusing is
the connection/cursor duality. The connection object as a wrapper for
the database connection is clear to understand for anyone who has used
psql. But the cursor as a means to give command is spurious: you can
give commands but all the cursors are serialised, not parallelised, on
the same connection, and it isn't useful for transactions either...

The only unique thing a cursor has is to hold a result and consume it;
for the rest it doesn't own neither the postgres session nor the
transaction. But it gives them command. Weird.

You are surely familiar with the psycopg2 usage pattern:

    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    cur.execute(query, params):
    for record in cur:
        ... # do something

The cursor() can take parameters, e.g. to create server-side "named"
cursors, but most often people will use the standard client-side
cursor, which is a lightweight object, little more than a wrapper for
a PGresult.

One little change I've made to psycopg3 cursors is to make it return
"self" on execute() (it currently returns None, so it's totally
unused). This allows chaining a fetch operation right after execute,
so the pattern above can be reduced to:

    conn = psycopg3.connect(dsn)
    cur = conn.cursor()
    record = cur.execute(query, params).fetchone()
    # or
    for record in cur.execute(query, params):
        ... # do something

And of course nastymess such as:

    conn.cursor().execute(query).fetchone()
    psycopg3.connect(dsn).cursor().execute(query).fetchone()

But, taste.

I'm toying with the idea of adding a 'connection.execute(query,
[params])' methd, which would basically just create a cursor
internally, query on it, and return it. No parameter could be passed
to the cursor() call, so it could only create the most standard,
client-side cursor (or whatever the default for the connection is, if
there is some form of cursor_factory, which hasn't been implemented in
psycopg3 yet). For anything more fancy, cursor() should be called
explicitly.

As a result people could use:

    conn = psycopg3.connect(dsn)
    record = conn.execute(query, params).fetchone()
    # or
    for record in conn.execute(query, params):
        ... # do something

No other methods bloating the connection interface: no executemany(),
copy(), callproc (actually there will be no callproc at all in
psycopg3: postgres has no fast path for function call and too much
semantics around stored procedure that a single callproc() couldn't
cover).

Being the cursor client-side, its close() doesn't actually do anythin
apart from making it unusable, so just disposing of it without calling
close() is totally safe.

Thoughts?

Cheers!

-- Daniele


Re: Executing on the connection?

От
Adrian Klaver
Дата:
On 12/2/20 3:20 AM, Daniele Varrazzo wrote:
> Hello,
> 

> 
> As a result people could use:
> 
>      conn = psycopg3.connect(dsn)
>      record = conn.execute(query, params).fetchone()
>      # or
>      for record in conn.execute(query, params):
>          ... # do something

That is what the sqlite3 module does and I find it handy.

> 
> No other methods bloating the connection interface: no executemany(),
> copy(), callproc (actually there will be no callproc at all in

Just to be clear the above(callproc excepted) or some variation of it 
will still be available off the cursor interface, correct?

> psycopg3: postgres has no fast path for function call and too much
> semantics around stored procedure that a single callproc() couldn't
> cover).
> 
> Being the cursor client-side, its close() doesn't actually do anythin
> apart from making it unusable, so just disposing of it without calling
> close() is totally safe.
> 
> Thoughts?
> 
> Cheers!
> 
> -- Daniele
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Executing on the connection?

От
Daniele Varrazzo
Дата:
On Wed, 2 Dec 2020 at 15:02, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

> Just to be clear the above(callproc excepted) or some variation of it
> will still be available off the cursor interface, correct?

Yes, that's correct. The cursor remains with the methods it is known
for. This Connection.execute() would only be the moral equivalent of:

    class Connection:
        def execute(self, query, params=None):
            cur = self.cursor()
            cur.execute(query, params)
            return cur

but nothing would be removed from the cursor interface.

-- Daniele



Re: Executing on the connection?

От
Karsten Hilbert
Дата:
> > One little change I've made to psycopg3 cursors is to make it return
> > "self" on execute() (it currently returns None, so it's totally
> > unused).

+1


> > As a result people could use:
> >
> >    conn = psycopg3.connect(dsn)
> >    record = conn.execute(query, params).fetchone()
> >    # or
> >    for record in conn.execute(query, params):
> >        ... # do something

+1

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B



Re: Executing on the connection?

От
Daniele Varrazzo
Дата:
On Wed, 2 Dec 2020 at 11:20, Daniele Varrazzo
<daniele.varrazzo@gmail.com> wrote:

> I'm toying with the idea of adding a 'connection.execute(query,
> [params])' methd, which would basically just create a cursor
> internally, query on it, and return it. No parameter could be passed
> to the cursor() call, so it could only create the most standard,
> client-side cursor (or whatever the default for the connection is, if
> there is some form of cursor_factory, which hasn't been implemented in
> psycopg3 yet). For anything more fancy, cursor() should be called
> explicitly.

This is what I've pushed earlier:

https://www.psycopg.org/psycopg3/docs/connection.html#psycopg3.Connection.execute

Feedback is welcome. Cheers!

-- Daniele



Re: Executing on the connection?

От
Denis Laxalde
Дата:
Daniele Varrazzo a écrit :
> On Wed, 2 Dec 2020 at 11:20, Daniele Varrazzo
> <daniele.varrazzo@gmail.com> wrote:
> 
>> I'm toying with the idea of adding a 'connection.execute(query,
>> [params])' methd, which would basically just create a cursor
>> internally, query on it, and return it. No parameter could be passed
>> to the cursor() call, so it could only create the most standard,
>> client-side cursor (or whatever the default for the connection is, if
>> there is some form of cursor_factory, which hasn't been implemented in
>> psycopg3 yet). For anything more fancy, cursor() should be called
>> explicitly.
> 
> This is what I've pushed earlier:
> 
> https://www.psycopg.org/psycopg3/docs/connection.html#psycopg3.Connection.execute
> 

It did not strike me earlier, but it looks a bit weird to have 
connection.execute() return a "cursor" to read results while this 
"cursor" can also be used to execute commands. So, perhaps, another 
object, with only the interface for result retrieval would be more 
appropriate?

Otherwise, that's a very nice addition, thanks!



Re: Executing on the connection?

От
Adrian Klaver
Дата:
On 12/8/20 5:51 AM, Denis Laxalde wrote:
> Daniele Varrazzo a écrit :
>> On Wed, 2 Dec 2020 at 11:20, Daniele Varrazzo
>> <daniele.varrazzo@gmail.com> wrote:
>>
>>> I'm toying with the idea of adding a 'connection.execute(query,
>>> [params])' methd, which would basically just create a cursor
>>> internally, query on it, and return it. No parameter could be passed
>>> to the cursor() call, so it could only create the most standard,
>>> client-side cursor (or whatever the default for the connection is, if
>>> there is some form of cursor_factory, which hasn't been implemented in
>>> psycopg3 yet). For anything more fancy, cursor() should be called
>>> explicitly.
>>
>> This is what I've pushed earlier:
>>
>> https://www.psycopg.org/psycopg3/docs/connection.html#psycopg3.Connection.execute 
>>
>>
> 
> It did not strike me earlier, but it looks a bit weird to have 
> connection.execute() return a "cursor" to read results while this 
> "cursor" can also be used to execute commands. So, perhaps, another 
> object, with only the interface for result retrieval would be more 
> appropriate?

Why? It is no different from now where you can reuse a cursor.

> 
> Otherwise, that's a very nice addition, thanks!
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Executing on the connection?

От
Daniele Varrazzo
Дата:
On Tue, 8 Dec 2020 at 13:51, Denis Laxalde <denis.laxalde@dalibo.com> wrote:

> It did not strike me earlier, but it looks a bit weird to have
> connection.execute() return a "cursor" to read results while this
> "cursor" can also be used to execute commands. So, perhaps, another
> object, with only the interface for result retrieval would be more
> appropriate?

It thought about that, and it would be weird if people used a cursor
for further queries after consuming it, yes. But hey, who are we to
deny them to do that? :D I wouldn't go about creating and maintaining
new objects and interfaces only to limit the possibility.

My idea is that those who are interested in using `conn.execute()` are
exactly the people who don't care about having execute() on a cursor.
I think there's no harm in leaving it there :)

-- Daniele