Обсуждение: BYTEA
Hi, people! I've used PostgreSQL for few months and it fully satisfies my needs. Now I'm starting to explore BYTEA type (for storing small 1-5 Kb png images) and found it very cumbersome. I'd like to hear background explanation of why strings used as bytea literals pass two phases of parsing? it seems very odd to me to escape backslash two times (like that: '\\\\' ) just to insert it into string.. It adds more complexity to apps, forcing it encode/decode those literals.. I think it would be enough to quote it like any other string, just for parser to understand it correctly, and insert into table. Bytea is binary data by it's nature, why to quote it when I perform selects? I'm not going to show it to humans, it is just data for my application. And if bytea really is displayed to a human then translating value into readable form (quote non-printable characters etc.) is a task of the application (psql, my app ..) but not PostgreSQL backend's. Any comments? Timur.
The easiest thing that I've found to do is: INSERT ... decode('string','base64')::bytea .... and SELECT ... encode(attr1,'base64')::text ... where attr1 is the BYTEA attribute, and 'string' is a string of base64-encoded data. You can also use 'hex' instead of base64. This works well because most languages have a good way to do either a binary->hex->binary conversion or a binary->base64->binary conversion. I don't really see a way postgres could do it any better, since the queries and results are both strings, so there have to be rules about what a string might include. Base64 works nicely because it's fairly compact, and all "safe" text. Regards, Jeff On Friday 20 September 2002 12:57 am, Timur V. Irmatov wrote: > Hi, people! > > I've used PostgreSQL for few months and it fully satisfies my needs. > > Now I'm starting to explore BYTEA type (for storing small 1-5 Kb png > images) and found it very cumbersome. > > I'd like to hear background explanation of why strings used as bytea > literals pass two phases of parsing? it seems very odd to me to escape > backslash two times (like that: '\\\\' ) just to insert it into > string.. It adds more complexity to apps, forcing it encode/decode > those literals.. > > I think it would be enough to quote it like any other string, just for > parser to understand it correctly, and insert into table. Bytea is > binary data by it's nature, why to quote it when I perform selects? > I'm not going to show it to humans, it is just data for my > application. > > And if bytea really is displayed to a human then translating value into > readable form (quote non-printable characters etc.) is a task of the > application (psql, my app ..) but not PostgreSQL backend's. > > Any comments? > > Timur. > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org
Jeff! Friday, September 20, 2002, 1:24:28 PM, You wrote: JD> I don't really see a way postgres could do it any better, since the queries JD> and results are both strings, so there have to be rules about what a string JD> might include. Base64 works nicely because it's fairly compact, and all JD> "safe" text. Hm.. If i have a varchar field then if I want to insert a string some'tex\t into it, I quote it like that: 'some\'tex\\t' .. but if I want to insert the same into bytea field I must quote it more: 'some\\'tex\\\\' .. why ??? JD> On Friday 20 September 2002 12:57 am, Timur V. Irmatov wrote: >> Hi, people! >> >> I've used PostgreSQL for few months and it fully satisfies my needs. >> >> Now I'm starting to explore BYTEA type (for storing small 1-5 Kb png >> images) and found it very cumbersome. >> >> I'd like to hear background explanation of why strings used as bytea >> literals pass two phases of parsing? it seems very odd to me to escape >> backslash two times (like that: '\\\\' ) just to insert it into >> string.. It adds more complexity to apps, forcing it encode/decode >> those literals.. >> >> I think it would be enough to quote it like any other string, just for >> parser to understand it correctly, and insert into table. Bytea is >> binary data by it's nature, why to quote it when I perform selects? >> I'm not going to show it to humans, it is just data for my >> application. >> >> And if bytea really is displayed to a human then translating value into >> readable form (quote non-printable characters etc.) is a task of the >> application (psql, my app ..) but not PostgreSQL backend's. >> >> Any comments? >> >> Timur. >> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 6: Have you searched our list archives? >> >> http://archives.postgresql.org JD> ---------------------------(end of broadcast)--------------------------- JD> TIP 6: Have you searched our list archives? JD> http://archives.postgresql.org Sincerely Yours, Timur mailto:itvthor@sdf.lonestar.org
On Friday 20 Sep 2002 9:46 am, Timur V. Irmatov wrote: > Jeff! > > Friday, September 20, 2002, 1:24:28 PM, You wrote: > > JD> I don't really see a way postgres could do it any better, since the > queries JD> and results are both strings, so there have to be rules about > what a string JD> might include. Base64 works nicely because it's fairly > compact, and all JD> "safe" text. > > Hm.. If i have a varchar field then if I want to insert a string > some'tex\t into it, I quote it like that: 'some\'tex\\t' .. but if I > want to insert the same into bytea field I must quote it more: > 'some\\'tex\\\\' .. why ??? Because it *is* parsed twice IIRC (bear in mind I'm not a developer and I haven't looked at the code yet). The first parse gets it into the system *as a chunk of text*, so for varchar there's nothing more to be done. If you're storing a bytea (or other) then it needs to be converted to that form, i.e. parsed again. - Richard Huxton
Like the other reply said, it does get parsed an additional time for the bytea-specific encodings. base64 encoding/decoding might help you out with that. If you're really inserting a lot of binary data, that seems like the way to go to me. If you find it easier to mess with text types or escape the strings before you pass to bytea, then go for it, but I like the idea of base64 encoding it when I have a lot of unsafe characters. Oh, one other note: you still have to escape the following characters, for anything passed in a query: single-quote backslash NULL Regards, Jeff On Friday 20 September 2002 01:46 am, Timur V. Irmatov wrote: > Jeff! > > Friday, September 20, 2002, 1:24:28 PM, You wrote: > > JD> I don't really see a way postgres could do it any better, since the > queries JD> and results are both strings, so there have to be rules about > what a string JD> might include. Base64 works nicely because it's fairly > compact, and all JD> "safe" text. > > Hm.. If i have a varchar field then if I want to insert a string > some'tex\t into it, I quote it like that: 'some\'tex\\t' .. but if I > want to insert the same into bytea field I must quote it more: > 'some\\'tex\\\\' .. why ??? > > JD> On Friday 20 September 2002 12:57 am, Timur V. Irmatov wrote: > >> Hi, people! > >> > >> I've used PostgreSQL for few months and it fully satisfies my needs. > >> > >> Now I'm starting to explore BYTEA type (for storing small 1-5 Kb png > >> images) and found it very cumbersome. > >> > >> I'd like to hear background explanation of why strings used as bytea > >> literals pass two phases of parsing? it seems very odd to me to escape > >> backslash two times (like that: '\\\\' ) just to insert it into > >> string.. It adds more complexity to apps, forcing it encode/decode > >> those literals.. > >> > >> I think it would be enough to quote it like any other string, just for > >> parser to understand it correctly, and insert into table. Bytea is > >> binary data by it's nature, why to quote it when I perform selects? > >> I'm not going to show it to humans, it is just data for my > >> application. > >> > >> And if bytea really is displayed to a human then translating value into > >> readable form (quote non-printable characters etc.) is a task of the > >> application (psql, my app ..) but not PostgreSQL backend's. > >> > >> Any comments? > >> > >> Timur. > >> > >> > >> ---------------------------(end of broadcast)--------------------------- > >> TIP 6: Have you searched our list archives? > >> > >> http://archives.postgresql.org > > JD> ---------------------------(end of > broadcast)--------------------------- JD> TIP 6: Have you searched our list > archives? > > JD> http://archives.postgresql.org > > > > Sincerely Yours, > Timur > mailto:itvthor@sdf.lonestar.org > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
"Timur V. Irmatov" <itvthor@sdf.lonestar.org> writes: > I'd like to hear background explanation of why strings used as bytea > literals pass two phases of parsing? Because there is no other alternative: the parser is not aware when it's scanning a SQL command whether a string literal will end up being taken as a BYTEA value or not. One possibility for inserting BYTEA without quite so much quoting is to insert via COPY IN instead of INSERT. This is not a complete solution however, since you still have to beware of COPY's escaping rules for newlines and tabs. If you really don't want to be bothered, you could think about defining a function create function myinsert (..., bytea, ...) as ' INSERT INTO mytable VALUES($1,$2,...)' language SQL; and then invoking this function via the "fastpath" interface. The fastpath stuff is pretty ugly but it would let you send raw binary data. regards, tom lane