Обсуждение: Bytea and perl
I would like to use a bytea column to store arbitrary files. I am trying to do this in perl and am wondering how to go about it. The docs for postgresql suggest some pretty elaborate quoting, etc. If I use perl DBI, the docs suggest that I need to do some explicit column datatype binding (namely DBD::Pg::PG_BYTEA). What is the accepted way of inserting a binary string into a bytea column in perl? Thanks, Sean
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > What is the accepted way of inserting a binary string into > a bytea column in perl? It's pretty much as you described. Here's an example, modified from the test suite: use DBD::Pg qw(:pg_types); ... my $sth = $dbh->prepare(qq{INSERT INTO dbd_pg_test (id,bytetest) VALUES (?,?)}); $sth->bind_param(2, undef, { pg_type => DBD::Pg::PG_BYTEA }); $sth->execute(400, 'aa\\bb\\cc\\\0dd\\'); Other options include storing just a filename, or base-64 encoding everything and storing it as a text. I tend to prefer the latter more often than not, as the encode/decode goes very quickly on most modern computers. - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200603222207 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iD8DBQFEIhEBvJuQZxSWSsgRAvAuAKDXEg6i+aykLuDeYpPCYCF+5XosNACfd/hZ PR8cSm0/9NCJVInn+yEBpsU= =jfTz -----END PGP SIGNATURE-----
On 3/22/06 10:08 PM, "Greg Sabino Mullane" <greg@turnstep.com> wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > >> What is the accepted way of inserting a binary string into >> a bytea column in perl? > > It's pretty much as you described. Here's an example, modified > from the test suite: > > use DBD::Pg qw(:pg_types); > ... > my $sth = $dbh->prepare(qq{INSERT INTO dbd_pg_test (id,bytetest) VALUES > (?,?)}); > > $sth->bind_param(2, undef, { pg_type => DBD::Pg::PG_BYTEA }); Greg, Thanks for the answer. OK. Here is my follow-up question. Why is this explicit parameter binding necessary? When would I want to have pg_type be something other than PG_BYTEA when inserting into a bytea column? The reason this is important is that many (read this as ALL, as far as I know) modules built on top of DBI do not use explicit paramater binding and rely on the sth->execute(...) quoting to do the right thing, which it does with all column types except bytea, it seems. This renders these higher-level interfaces not so useful for bytea storage unless one base64 encodes (which increases storage needs by 25% and does add some nontrivial overhead for very large files). I haven't looked at the DBD::Pg code on this issue, so I don't know any intricacies, so I'm more curious than anything else. > $sth->execute(400, 'aa\\bb\\cc\\\0dd\\'); > > Other options include storing just a filename, or base-64 encoding > everything and storing it as a text. I tend to prefer the latter > more often than not, as the encode/decode goes very quickly on > most modern computers. I guess a third option is the large object interface, which I am trying to avoid. Thanks again, Sean
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > OK. Here is my follow-up question. Why is this explicit parameter binding > necessary? When would I want to have pg_type be something other than > PG_BYTEA when inserting into a bytea column? You wouldn't, but the trick is getting all the pieces to know that the column is bytea. DBD::Pg has no inherent way to find out for iteslf. Nor does libpq. The planner has an idea, but that information is not transmitted back to DBD:Pg. The difference then becomes that the low-level calls that DDB::Pg makes to PostgreSQL via PQexecParams and PQexecPrepared are different if any of the values are binary. If they are, we can't simply pass a string, but have to pass a separate array of string lengths, as we can't use \0 to indicae the end of the data anymore. > The reason this is important is that many (read this as ALL, as far as I > know) modules built on top of DBI do not use explicit paramater binding and > rely on the sth->execute(...) quoting to do the right thing, which it does > with all column types except bytea, it seems. Well, there are other column type cases where it will fail, but they are not as common as bytea. Unfortunately, there is no easy solution. Hopefully these high-level interface modules left some hooks and knobs to handle this sort of situation. If they don't, drop them a line, because they should. :) > I guess a third option is the large object interface, which I am trying to > avoid. I suspect that this is even less supported by the other modules, so you might as well go with the binding at that point. Good luck: hopefully one of the four options will work out for you. - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200603232139 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iD8DBQFEI15nvJuQZxSWSsgRAnDGAJ9CW2gb0qE53isrOfLjoALuQYetKQCgwCLQ A1EKVpnIhjPHiqT0HTAfwjY= =LmM9 -----END PGP SIGNATURE-----
On 3/23/06 9:50 PM, "Greg Sabino Mullane" <greg@turnstep.com> wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > >> OK. Here is my follow-up question. Why is this explicit parameter binding >> necessary? When would I want to have pg_type be something other than >> PG_BYTEA when inserting into a bytea column? > > You wouldn't, but the trick is getting all the pieces to know that the column > is bytea. DBD::Pg has no inherent way to find out for iteslf. Nor does libpq. > The planner has an idea, but that information is not transmitted back to > DBD:Pg. > The difference then becomes that the low-level calls that DDB::Pg makes to > PostgreSQL via PQexecParams and PQexecPrepared are different if any of the > values > are binary. If they are, we can't simply pass a string, but have to pass a > separate array of string lengths, as we can't use \0 to indicae the end of the > data anymore. Ahhh. Now things start to make sense. >> The reason this is important is that many (read this as ALL, as far as I >> know) modules built on top of DBI do not use explicit paramater binding and >> rely on the sth->execute(...) quoting to do the right thing, which it does >> with all column types except bytea, it seems. > > Well, there are other column type cases where it will fail, but they are not > as common as bytea. Unfortunately, there is no easy solution. Hopefully these > high-level interface modules left some hooks and knobs to handle this sort > of situation. If they don't, drop them a line, because they should. :) I will. >> I guess a third option is the large object interface, which I am trying to >> avoid. > > I suspect that this is even less supported by the other modules, so you might > as well go with the binding at that point. Good luck: hopefully one of the > four options will work out for you. Oh, it really isn't that big a deal. Right now, I am using base64 encoding, which is fine. Thanks for elaborating. Sean
On Mar 23, 2006, at 9:50 PM, Greg Sabino Mullane wrote: >> OK. Here is my follow-up question. Why is this explicit >> parameter binding >> necessary? When would I want to have pg_type be something other than >> PG_BYTEA when inserting into a bytea column? > > You wouldn't, but the trick is getting all the pieces to know that > the column > is bytea. DBD::Pg has no inherent way to find out for iteslf. Nor > does libpq. > The planner has an idea, but that information is not transmitted > back to DBD:Pg. I have not looked at libpq in any detail, but it should have access to the type of all the parameters in the prepared statement. The Describe (F) statement in the frontend/backend protocol identifies the type of each parameter. I'm using this in Lisp to convert parameters as needed for prepared statements. http://www.postgresql.org/docs/8.1/static/protocol-message-formats.html John DeSoi, Ph.D. http://pgedit.com/ Power Tools for PostgreSQL
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > I have not looked at libpq in any detail, but it should have access to the > type of all the parameters in the prepared statement. The Describe (F) > statement in the frontend/backend protocol identifies the type of each > parameter. I'm using this in Lisp to convert parameters as needed for > prepared statements. It should, but it does not yet. Known limitation. It's in the libpq docs but not sure if it is on the official TODO list...yep, it's in there. Pardon if this has been addressed already, my mail server crashed and I'm slowly catching up. - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200603282138 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iD8DBQFEKfMtvJuQZxSWSsgRAsJeAKCC08Isc+1Oip1MO/EPNNXZwBdgXQCgprLU qFnAhuyoRKyV4CMQ6QFHL7Q= =F4b8 -----END PGP SIGNATURE-----