Re: binary protocol, again
От | Daniele Varrazzo |
---|---|
Тема | Re: binary protocol, again |
Дата | |
Msg-id | CA+mi_8aaBR5_Dik+eaGYM3cr+qnWP49ZGixVx2k8w=cnyVhxEw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: binary protocol, again ("P. Christeas" <xrg@linux.gr>) |
Ответы |
Re: binary protocol, again
("P. Christeas" <xrg@linux.gr>)
|
Список | psycopg |
On Fri, Jul 20, 2012 at 7:00 PM, P. Christeas <xrg@linux.gr> wrote: > On Friday 20 July 2012, you wrote: >> You should be hunting for semicolons in the query with knowledge of >> comments (single line, multiline, nested), strings (including >> dollar-quoted strings, unicode-escape, standard-conforming)... It >> seems a painful road. > > I follow the "better safe than sorry" approach: if the query has anything > after a semicolon, it /could/ be a multi-query case and will failover. > Usually, user input should only appear in parameters so cannot introduce a > semicolon into the query string. We have a small impedance mismatch, because we are thinking about the same new cursor, but I am doing it in terms of "a cursor using PQexecParams instead of PQexec, you are thinking about "a cursor using binary types instead of text". The two cases are somewhat orthogonal, actually though they depend each other: you can use binary stuff (as query params) only if you use PQexecParams. So, if you want to have binary params, why do you want to make your life more complex supporting PQexec too and switching from a function to the other according to the query content or the params types? Why doesn't the cursor_bin (as you call it) always use only PQexecParams and the ISQLParam protocol for adaptation? Yes, there are some missing features, but I don't think it's a big deal: people will be documented about the features and the shortcomings of either cursors and go for what they prefer. I find the idea of the failover somewhat dangerous as which code path will be taken will depend on the content of the query string. >> Great :) Sorry though, I haven't got it: have you decided introducing >> now the cursor_bin, hence you don't need parsing the query to see if >> it's eligible or were you already using the cursor_bin? In the latter >> case, why are you parsing the query and not always using PQexecParams? > > cursor_bin /tries/ to use the binary protocol, but may still fail over into > the text one, if the queries are not eligible. Yes, but in order to fall back on the text protocol it is not necessary to go back to PQexec: you can use PQexecParams with text params. If people want to use multi-statements, identifier parameters or other features supported in the more flexible ISQLQuote, they will just not use cursor_bin. It is both complex on our side, error prone on both sides and surprising on their side: I don't see any advantage justifying that. It's much more clean to just fail with whatever error the libpq passes us in case multi-statements are used instead of trying a fallback that results in entirely different code being executed (different adapters etc.) > It is just a way of apps to say "yes, I want to go binary". > If you think about it, there is cases where we may need to mix PQexecParams > and PQexec in the same transaction, using the same pythonic cursor. So, that > class must be capable of both types of call. You can support both protocols in the same transactions by using different cursors in the same connection. Trying to use the same cursor looks amazingly messy to support and don't think it's worth. What is a case in which people would want to use the same cursor for the two param styles? > The other thing is binary *results*. Using a different cursor clearly indicates > that the application agrees to get the results in binary (ie. I won't break > the world without asking), so we could set 'cursor_bin' to use that even for > PQexec calls. If we get a fairly complete set of typecasters for both text and binary we could use them for the classic cursor as well on PQexec: this seems just orthogonal wrt using PQexecParams in cursor_bin. >> Lovely (not bikeshedding about the name: the important is the >> protocols to be distinct) > > A note here: I'm planning to use the *same* adapter classes for both the > ISQLQuote and ISQLParam usages. I think it would be ugly to have two types of > BINARY, two types of CHAR etc. Here, you can say that if your parameters are > adapted to [CHAR, CHAR, INTEGER] you can go PQExecParams, while if they are > non adaptable like [CHAR, CHAR, MYFOO] it has to failover. This is dangerous and obscure. I think, if MYFOO has an adapter or is conform to PQexecParams then it can be passed as parameter to cursor_bin, otherwise you get an adapter error saying what's wrong. In other words, if you are trying to push PQexecParams for use in the classic cursors and falling back in case you are even remotely afraid that something may break, then no: no deal. It's not going to happen. We don't do magic: we do boring predictable stuff, and changing codepath based on the content of the sql string or the parameters type is just too surprising and brittle. People may have an ISQLParam adapter for their MyFoo and suddenly an application would break with an adaptation error because a ; with a comment after is added to the query and no adapter for the "legacy" ISQLQuote had been provided. About the ugliness of having two adapters sets: either this or bloat in each adapter will ensure. Declaring an adapter conform to both the protocols can be easily done, it's part of the PEP 246 (the part we implement), but ISQLParam must already have one text and one binary adaptation function, plus possibly a method to tell whether text or binary is to be used, plus one to get the type oid, and I don't think I want this stuff on the simpler (or complex for its own reason) ISQLQuote adapters. Another option would be to have a method get_data() returning a pair (bool, str) where the bool specifies whether str is binary or not. But even if text is required, it will be a different string from ISQLQuote's getquoted() result (e.g. a python date is returned as "E'2012-07-20'::date" in ISQLQuote but just as "2012-07-20" in ISQLParam in text mode). To mitigate the ugliness we should rather have a better package organization: psycopg2 has two main sub-modules: "extensions" which is "everything not strictly DBAPI" and "extras" which is (as its docstring says) "everything I don't know yet where to put". Well, after 12 something years psycopg is been around, we have a fairly precise idea of what we have: 1. different subclasses of connections and cursors and 2. adapters for extra types; that's pretty much everything. I would organize them in "connections", "cursors", "types" modules, keeping the old ones for hysterical raisins and not adding anything anymore there. Adapters for the two protocols could live in different sub-modules: psycopg2.types.quote and psycopg2.types.param for example. The case of the strict compliancy to the dbapi (hence the extensions module) is something I don't think has ever been exercised in psycopg's life and I don't even think the module still compiles without extensions, so I'd be happy to drop it (Fog, correct me if I'm wrong). Writing the dates example above, I've thought about another inconsistency between the two behaviours: ISQLParam would require more often the use of explicit cast next to the placeholders, because for many types we append a type to disambiguate. You won't do that in ISQLParam so some queries working in ISQLQuote would fail. OTOH you may decide to add another feature to ISQLParam: getting the name of the Postgres type of the adapted object (e.g. you could convert a python placeholder %s into a postgres placeholder $1::date). But then you would be much more uniform than we currently are in passing the types, so you may have the opposite scenario: queries working in ISQLParam but failing in ISQLQuote. Things may work as expected, or some query may break with a postgres error due to cast ambiguity: I prefer this happening only once and immediately when a program is being developed and not potentially lurking and exploding at runtime because of an unexpected type in a parameters set. > WDYT? IT: KISS :) Let's have the cursor class adapting on ISQLQuote, using PQexec and binding client side; and cursor_bin adapting on ISQLParam, using PQexecParams and binding server side. No automatic switch between the two: I thank you if you want to do this to improve the basic cursor performance but I don't want it as there's just no way to make it robust enough. We can design a feature to use globally a certain connection/cursor class by default so that people would decide early in their app what to use. We could also have "from psycopg2 import psycopg3" to get the cursor_bin type automatically :) (Just joking: I'd be happy to see this stuff in psycopg 2.5). Cheers! -- Daniele
В списке psycopg по дате отправления: