Обсуждение: Creating dynamically-typed tables using psycopg2's built-in formatting

Поиск
Список
Период
Сортировка

Creating dynamically-typed tables using psycopg2's built-in formatting

От
Daniel Cohen
Дата:
Hi!

I'm working on a project in Python that interacts with a PostgreSQL data warehouse, and I'm using the psycopg2 API. I am looking to create dynamically-typed tables.

For example, I would like to be able to execute the following code:

from psycopg2 import connect, sql

connection = connect(host="host", port="port", database="database", user="user", password="pw")

def create_table(tbl_name, col_name, col_type):   query = sql.SQL("CREATE TABLE {} ({} {})".format(sql.Identifier(tbl_name), sql.Identifier(col_name), sql.Identifier(column_type)))   connection.execute(query)

create_table('animals', 'name', 'VARCHAR')

and end up with a table named "animals" that contains a column "name" of type VARCHAR. However, when I attempt to run this, I get an error: 'type "VARCHAR" does not exist'. I assume psycopg2's built-in formatter is putting double quotes around the VARCHAR type when there should not be any. Normally, I would just work around this myself, but the documentation is very clear that Python string concatenation should never be used for fear of SQL injection attacks. Security is a concern for this project, so I would like to know if it's possible to create dynamically-typed tables in this fashion using pyscopg2, and if not, whether there exists another third-party API that can do so securely.

A second issue I've had is that when creating tables with a similar methodology, the sql.Identifier() function does not perform as I expect it to. When I use it to dynamically feed in table names, for example, I get varying results. See below:

CREATE TABLE tbl AS SELECT * FROM other_tbl;

in raw SQL creates a table called tbl, whereas

cursor.execute(sql.SQL("CREATE TABLE {} AS SELECT * FROM other_tbl").format(sql.Identifier(tbl))

creates a table called "tbl". The two are different, and

SELECT * FROM tbl;

returns a totally different table than

SELECT * FROM "tbl";

Please let me know if I can fix either of these problems; I want to be able to dynamically feed types into SQL queries, and I want the tables created to be of the form tbl not "tbl". Thank you!

Danny 


Re: Creating dynamically-typed tables using psycopg2's built-informatting

От
Christophe Pettus
Дата:
Hi, Daniel,

First, tbl and "tbl" aren't "totally different":

> xof=# create table tbl (i integer);
> CREATE TABLE
> xof=# create table "tbl" (i integer);
> ERROR:  relation "tbl" already exists

The difference is that putting double quotes around an SQL identifier makes the comparison type-sensitive, and allows
forcharacters not otherwise allowed in identifiers: 

> xof=# select * from Tbl;
>  i
> ---
> (0 rows)
>
> xof=# select * from "Tbl";
> ERROR:  relation "Tbl" does not exist
> LINE 1: select * from "Tbl";
>                       ^

You can use SQL.identifier, but you need to make sure you are getting the case right; in general, PostgreSQL types are
alllower-case, and it's only the lack of double quotes that makes this work: 

xof=# create table x (i VARCHAR);
CREATE TABLE
xof=# create table y (i "VARCHAR");
ERROR:  type "VARCHAR" does not exist
LINE 1: create table y (i "VARCHAR");
                          ^
xof=# create table y (i "varchar");
CREATE TABLE

> On Jun 13, 2019, at 12:28, Daniel Cohen <daniel.m.cohen@berkeley.edu> wrote:
>
> Hi!
>
> I'm working on a project in Python that interacts with a PostgreSQL data warehouse, and I'm using the psycopg2 API. I
amlooking to create dynamically-typed tables. 
>
> For example, I would like to be able to execute the following code:
>
> from psycopg2 import connect,
>  sql
>
> connection
> = connect(host="host", port="port", database="database", user="user", password="pw")
>
>
>
> def create_table(tbl_name, col_name, col_type):
>
>     query
> = sql.SQL("CREATE TABLE {} ({} {})".format(sql.Identifier(tbl_name), sql.Identifier(col_name),
sql.Identifier(column_type)))
>
>     connection
> .execute(query)
>
>
> create_table
> ('animals', 'name', 'VARCHAR')
> and end up with a table named "animals" that contains a column "name" of type VARCHAR. However, when I attempt to run
this,I get an error: 'type "VARCHAR" does not exist'. I assume psycopg2's built-in formatter is putting double quotes
aroundthe VARCHAR type when there should not be any. Normally, I would just work around this myself, but the
documentationis very clear that Python string concatenation should never be used for fear of SQL injection attacks.
Securityis a concern for this project, so I would like to know if it's possible to create dynamically-typed tables in
thisfashion using pyscopg2, and if not, whether there exists another third-party API that can do so securely.  
>
> A second issue I've had is that when creating tables with a similar methodology, the sql.Identifier() function does
notperform as I expect it to. When I use it to dynamically feed in table names, for example, I get varying results. See
below:
>
> CREATE TABLE tbl AS SELECT * FROM other_tbl;
> in raw SQL creates a table called tbl, whereas
>
> cursor.execute(sql.SQL("CREATE TABLE {} AS SELECT * FROM other_tbl").format(sql.Identifier(tbl))
> creates a table called "tbl". The two are different, and
>
> SELECT * FROM tbl;
>
> returns a totally different table than
>
> SELECT * FROM "tbl";
> Please let me know if I can fix either of these problems; I want to be able to dynamically feed types into SQL
queries,and I want the tables created to be of the form tbl not "tbl". Thank you! 
>
> Danny
>
>

--
-- Christophe Pettus
   xof@thebuild.com




RE: Creating dynamically-typed tables using psycopg2's built-informatting

От
David Raymond
Дата:
SELECT * FROM tbl;
returns a totally different table than
SELECT * FROM "tbl";

That's a little disconcerting.

For the capitalization thing basically what's going on is two things. (The way I understand it)
1) Postgres IS case sensitive.
2) When any query gets sent to the server, the server converts everything not inside quotes to lower case as step 1,
_before_trying to match names for tables, columns, functions, etc.
 


So if you send
SELECT * FROM TBL;
One of the first things the server does is change it to
select * from tbl;
at which point it will do a case-sensitive match for tbl

And if you run
SELECT * FROM "TBL";
it gets turned into
select * from "TBL";
at which point it will do a case-sensitive match for TBL

But in your case what is in quotes is the same as its lowercase version. So it "should" match up to the same thing. So
ifyou are indeed getting different results from different tables then either my understanding is way off, or something
weirdis going on.
 

Re: Creating dynamically-typed tables using psycopg2's built-informatting

От
Christophe Pettus
Дата:

> On Jun 13, 2019, at 13:29, David Raymond <David.Raymond@tomtom.com> wrote:
> For the capitalization thing basically what's going on is two things. (The way I understand it)
> 1) Postgres IS case sensitive.
> 2) When any query gets sent to the server, the server converts everything not inside quotes to lower case as step 1,
_before_trying to match names for tables, columns, functions, etc. 

It's a bit more elaborate than that, although that's not a bad summary of the perceived result.  The details are here:

    https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS\

--
-- Christophe Pettus
   xof@thebuild.com




Re: Creating dynamically-typed tables using psycopg2's built-in formatting

От
Daniel Cohen
Дата:
Hi Christophe,

Thanks so much for your response. The uppercase --> lowercase fix worked for the types, but I'm still only getting tables that can be searched by double-quotations (i.e. > SELECT * FROM tbl; returns nothing, but > SELECT * FROM "tbl"; returns the table I uploaded). I can't tell from your message what the script should say to fix this. Thank you again!

Best,

Danny

On Thu, Jun 13, 2019 at 1:00 PM Christophe Pettus <xof@thebuild.com> wrote:
Hi, Daniel,

First, tbl and "tbl" aren't "totally different":

> xof=# create table tbl (i integer);
> CREATE TABLE
> xof=# create table "tbl" (i integer);
> ERROR:  relation "tbl" already exists

The difference is that putting double quotes around an SQL identifier makes the comparison type-sensitive, and allows for characters not otherwise allowed in identifiers:

> xof=# select * from Tbl;
>  i
> ---
> (0 rows)
>
> xof=# select * from "Tbl";
> ERROR:  relation "Tbl" does not exist
> LINE 1: select * from "Tbl";
>                       ^

You can use SQL.identifier, but you need to make sure you are getting the case right; in general, PostgreSQL types are all lower-case, and it's only the lack of double quotes that makes this work:

xof=# create table x (i VARCHAR);
CREATE TABLE
xof=# create table y (i "VARCHAR");
ERROR:  type "VARCHAR" does not exist
LINE 1: create table y (i "VARCHAR");
                          ^
xof=# create table y (i "varchar");
CREATE TABLE

> On Jun 13, 2019, at 12:28, Daniel Cohen <daniel.m.cohen@berkeley.edu> wrote:
>
> Hi!
>
> I'm working on a project in Python that interacts with a PostgreSQL data warehouse, and I'm using the psycopg2 API. I am looking to create dynamically-typed tables.
>
> For example, I would like to be able to execute the following code:
>
> from psycopg2 import connect,
>  sql
>
> connection
> = connect(host="host", port="port", database="database", user="user", password="pw")
>
>
>
> def create_table(tbl_name, col_name, col_type):
>
>     query
> = sql.SQL("CREATE TABLE {} ({} {})".format(sql.Identifier(tbl_name), sql.Identifier(col_name), sql.Identifier(column_type)))
>
>     connection
> .execute(query)
>
>
> create_table
> ('animals', 'name', 'VARCHAR')
> and end up with a table named "animals" that contains a column "name" of type VARCHAR. However, when I attempt to run this, I get an error: 'type "VARCHAR" does not exist'. I assume psycopg2's built-in formatter is putting double quotes around the VARCHAR type when there should not be any. Normally, I would just work around this myself, but the documentation is very clear that Python string concatenation should never be used for fear of SQL injection attacks. Security is a concern for this project, so I would like to know if it's possible to create dynamically-typed tables in this fashion using pyscopg2, and if not, whether there exists another third-party API that can do so securely.
>
> A second issue I've had is that when creating tables with a similar methodology, the sql.Identifier() function does not perform as I expect it to. When I use it to dynamically feed in table names, for example, I get varying results. See below:
>
> CREATE TABLE tbl AS SELECT * FROM other_tbl;
> in raw SQL creates a table called tbl, whereas
>
> cursor.execute(sql.SQL("CREATE TABLE {} AS SELECT * FROM other_tbl").format(sql.Identifier(tbl))
> creates a table called "tbl". The two are different, and
>
> SELECT * FROM tbl;
>
> returns a totally different table than
>
> SELECT * FROM "tbl";
> Please let me know if I can fix either of these problems; I want to be able to dynamically feed types into SQL queries, and I want the tables created to be of the form tbl not "tbl". Thank you!
>
> Danny
>
>

--
-- Christophe Pettus
   xof@thebuild.com

RE: Creating dynamically-typed tables using psycopg2's built-informatting

От
David Raymond
Дата:
Could you let us know what version the server is, and also show a table listing? (Connect with psql and do a \d for
example)


From: Daniel Cohen <daniel.m.cohen@berkeley.edu> 
Sent: Thursday, June 13, 2019 4:55 PM
To: Christophe Pettus <xof@thebuild.com>
Cc: psycopg@postgresql.org
Subject: Re: Creating dynamically-typed tables using psycopg2's built-in formatting

Hi Christophe,

Thanks so much for your response. The uppercase --> lowercase fix worked for the types, but I'm still only getting
tablesthat can be searched by double-quotations (i.e. > SELECT * FROM tbl; returns nothing, but > SELECT * FROM "tbl";
returnsthe table I uploaded). I can't tell from your message what the script should say to fix this. Thank you again!
 

Best,

Danny 


Re: Creating dynamically-typed tables using psycopg2's built-informatting

От
Federico Di Gregorio
Дата:
On 6/13/19 10:54 PM, Daniel Cohen wrote:
> Thanks so much for your response. The uppercase --> lowercase fix worked 
> for the types, but I'm still only getting tables that can be searched by 
> double-quotations (i.e. > SELECT * FROM tbl; returns nothing, but > 
> SELECT * FROM "tbl"; returns the table I uploaded). I can't tell from 
> your message what the script should say to fix this. Thank you again!

 From psql use the "\d" command to list all tables and check their 
names. Send the output here so that we can help you understand why your 
qury doesn't do what you expect.

federico

-- 
Federico Di Gregorio                         federico.digregorio@dndg.it
DNDG srl                                                  http://dndg.it
                       The number of the beast: vi vi vi. -- Delexa Jones



Re: Creating dynamically-typed tables using psycopg2's built-informatting

От
Adrian Klaver
Дата:
On 6/13/19 12:28 PM, Daniel Cohen wrote:
> Hi!
> 
> I'm working on a project in Python that interacts with a PostgreSQL data 
> warehouse, and I'm using the psycopg2 API. I am looking to create 
> dynamically-typed tables.
> 
> For example, I would like to be able to execute the following code:
> 
> |frompsycopg2 importconnect,sql connection 
>
=connect(host="host",port="port",database="database",user="user",password="pw")defcreate_table(tbl_name,col_name,col_type):query

> =sql.SQL("CREATE TABLE {} ({} 
>
{})".format(sql.Identifier(tbl_name),sql.Identifier(col_name),sql.Identifier(column_type)))connection.execute(query)create_table('animals','name','VARCHAR')|
> 
> and end up with a table named "animals" that contains a column "name" of 
> type VARCHAR. However, when I attempt to run this, I get an error: 
> *'type "VARCHAR" does not exist'*. I assume psycopg2's built-in 
> formatter is putting double quotes around the VARCHAR type when there 
> should not be any. Normally, I would just work around this myself, but 
> the documentation is/very/clear that Python string concatenation should 
> never be used for fear of SQL injection attacks. Security is a concern 
> for this project, so I would like to know if it's possible to create 
> dynamically-typed tables in this fashion using pyscopg2, and if not, 
> whether there exists another third-party API that can do so securely.
> 
> A second issue I've had is that when creating tables with a similar 
> methodology, the sql.Identifier() function does not perform as I expect 
> it to. When I use it to dynamically feed in table names, for example, I 
> get varying results. See below:|
> |
> 
> |CREATE TABLE tbl AS SELECT * FROM other_tbl;|
> 
> in raw SQL creates a table called tbl, whereas
> 
> |cursor.execute(sql.SQL("CREATE TABLE {} AS SELECT * FROM 
> other_tbl").format(sql.Identifier(tbl))|
> 
> creates a table called "tbl". The two are different, and

I'm not seeing it:

cursor.execute(sql.SQL("CREATE TABLE {} AS SELECT * FROM 
t1").format(sql.Identifier("tbl")))

test_(aklaver)> \d
...
public | t1                     | table    | aklaver
public | tbl                    | table    | aklaver
...

The question then becomes how is the variable tbl in your script being 
assigned to?

> 
> |SELECT * FROM tbl;|
> 
> ||
> 
> returns a totally different table than
> 
> SELECT * FROM "tbl";
> 
> Please let me know if I can fix either of these problems; I want to be 
> able to dynamically feed types into SQL queries, and I want the tables 
> created to be of the form tbl not "tbl". Thank you!
> 
> Danny
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Creating dynamically-typed tables using psycopg2's built-informatting

От
Christophe Pettus
Дата:

> On Jun 13, 2019, at 13:54, Daniel Cohen <daniel.m.cohen@berkeley.edu> wrote:
>
> Thanks so much for your response. The uppercase --> lowercase fix worked for the types, but I'm still only getting
tablesthat can be searched by double-quotations (i.e. > SELECT * FROM tbl; returns nothing, but > SELECT * FROM "tbl";
returnsthe table I uploaded).  

By "returns nothing," do you mean you get an error, or that you get zero rows?

I'd connect to the database using psql and use \d to see what tables actually exist.
--
-- Christophe Pettus
   xof@thebuild.com




Re: Creating dynamically-typed tables using psycopg2's built-in formatting

От
Daniel Cohen
Дата:
Hi again,

I realize the error was that I specified the name as "schema.tbl", and I think because there was a period in the name, it converted to double-quote. I tried again as just "tbl" and it worked perfectly. Thanks so much for all your help.

Best,

Danny

On Fri, Jun 14, 2019 at 9:12 AM Christophe Pettus <xof@thebuild.com> wrote:


> On Jun 13, 2019, at 13:54, Daniel Cohen <daniel.m.cohen@berkeley.edu> wrote:
>
> Thanks so much for your response. The uppercase --> lowercase fix worked for the types, but I'm still only getting tables that can be searched by double-quotations (i.e. > SELECT * FROM tbl; returns nothing, but > SELECT * FROM "tbl"; returns the table I uploaded).

By "returns nothing," do you mean you get an error, or that you get zero rows?

I'd connect to the database using psql and use \d to see what tables actually exist. 
--
-- Christophe Pettus
   xof@thebuild.com

Re: Creating dynamically-typed tables using psycopg2's built-in formatting

От
Sebastiaan Mannem
Дата:
Yeah,
"schema.tbl" will search all schemas in search path for a table explicitly called "schema.tbl".
So that will never find a table called "tbl", and most probably never find a table at all.

And without quotes, it will search for a table called "tbl" in a schema called "schema".
So your issue makes perfect sense now.

FWIW, if you where to query "schema"."tbl", it would look for a table (or view) called "tbl" in a schema called "schema",
Which seems to be what you want.

 
Sebastiaan Alexander Mannem
Senior Consultant
Anthony Fokkerweg 1
1059 CM Amsterdam, The Netherlands
T: +31 6 82521560
www.edbpostgres.com


On 14 Jun 2019, at 18:41, Daniel Cohen <daniel.m.cohen@berkeley.edu> wrote:

Hi again,

I realize the error was that I specified the name as "schema.tbl", and I think because there was a period in the name, it converted to double-quote. I tried again as just "tbl" and it worked perfectly. Thanks so much for all your help.

Best,

Danny

On Fri, Jun 14, 2019 at 9:12 AM Christophe Pettus <xof@thebuild.com> wrote:


> On Jun 13, 2019, at 13:54, Daniel Cohen <daniel.m.cohen@berkeley.edu> wrote:
>
> Thanks so much for your response. The uppercase --> lowercase fix worked for the types, but I'm still only getting tables that can be searched by double-quotations (i.e. > SELECT * FROM tbl; returns nothing, but > SELECT * FROM "tbl"; returns the table I uploaded).

By "returns nothing," do you mean you get an error, or that you get zero rows?

I'd connect to the database using psql and use \d to see what tables actually exist. 
--
-- Christophe Pettus
   xof@thebuild.com


Вложения

Re: Creating dynamically-typed tables using psycopg2's built-in formatting

От
Daniele Varrazzo
Дата:
On Fri, Jun 14, 2019 at 6:44 PM Sebastiaan Mannem <sebastiaan.mannem@enterprisedb.com> wrote:
Yeah,
"schema.tbl" will search all schemas in search path for a table explicitly called "schema.tbl".
So that will never find a table called "tbl", and most probably never find a table at all.

Also note that from psycopg 2.8 you can use Identifier("schema", "tbl") to represent a dot-separated sequence of identifiers.


-- Daniele
Вложения

Re: Creating dynamically-typed tables using psycopg2's built-in formatting

От
Daniel Cohen
Дата:
Awesome, thanks!

On Fri, Jun 14, 2019 at 12:13 PM Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote:
On Fri, Jun 14, 2019 at 6:44 PM Sebastiaan Mannem <sebastiaan.mannem@enterprisedb.com> wrote:
Yeah,
"schema.tbl" will search all schemas in search path for a table explicitly called "schema.tbl".
So that will never find a table called "tbl", and most probably never find a table at all.

Also note that from psycopg 2.8 you can use Identifier("schema", "tbl") to represent a dot-separated sequence of identifiers.



-- Daniele