Re: libpq, ecpg and the bytea data type
От | Michael Fuhr |
---|---|
Тема | Re: libpq, ecpg and the bytea data type |
Дата | |
Msg-id | 20050915090814.GA68915@winnie.fuhr.org обсуждение исходный текст |
Ответ на | libpq, ecpg and the bytea data type (Mark Richardson <markmapo@yahoo.com>) |
Список | pgsql-interfaces |
On Wed, Sep 14, 2005 at 04:20:13PM -0700, Mark Richardson wrote: > If I have a table defined... > CREATE TABLE myTable (myByteaField bytea NOT NULL); > > then populate it... > COPY myTable (myByteaField) FROM stdin; > \000\000\000\000\000\000\000\000 > \011\101\206\155\136\035\071\135 > \012\000\162\047\105\223\322\121 > \. Bytea values should be escaped with two backslashes, not one. See "Binary Data Types" in the documentation: http://www.postgresql.org/docs/7.4/interactive/datatype-binary.html > SELECT length(myByteaField) FROM stdin; Please post the actual command you ran; the above fails with 'relation "stdin" does not exist.' Presumably you really queried myTable (this might seem like nitpicking, but sometimes little differences matter). > Length > -------- > 0 > 8 > 1 > > I'm assuming that this is because of the length function (seeing > the '\0' causes strlen to be shorter than the actual data). Actually it's because the data wasn't fully loaded due to the incorrectly-escaped \000 characters: SELECT '\000\001'::bytea;bytea ------- (1 row) SELECT '\\000\\001'::bytea; bytea ----------\000\001 (1 row) The length function doesn't have a problem with \000 characters: SELECT length('\\000\\001'::bytea);length -------- 2 (1 row) If you look at the implementation of the length(bytea) function (byteaoctetlen() in src/backend/utils/adt/varlena.c) you'll see that it doesn't read the data at all; it simply reads a size attribute that's stored with the data. > So I try to use the libpq interface, and when I execute the code > PQgetLength(pgresult, 0, 0) > the result is 0 (for record 0, field ) Again, please post what you actually did; the above should have failed to link with an error like "undefined reference to `PQgetLength'" (the actual function name is PQgetlength). > PQgetLength(pgresult, 2, 0) > The result is 1 (for record 2, field 0). So this leads me to > believe that the values are stored in the database as a string, and > they are being terminated during the copy (I also tried inserts, > with the same results). The values are stored as a sequence of bytes, and they are indeed being truncated during the COPY because the data was incorrectly escaped. > On a last ditch effort, I used ECPG with the following code... > EXEC SQL BEGIN DECLARE SECTION; > char myByteaField_[3][8]; > EXEC SQL END DECLARE SECTION; > > EXEC SQL SELECT myByteaField INTO :myByteaField_ FROM myTable; > for (int i=0; i<3; i++) > for (int j=0; j<8; j++) > std::cout << myByteaField_[i][j] << "(" << int(myByteaField_[i][j] << ")."; > std::cout << std::endl; > > I get the following.... > (0).(0).(0).(0).(0).(0).(0).(0). <- this is expected The myByteaField_ array might have been zeroed to begin with, and you're probably seeing that data, not data filled in from the query. What happens if you fill the array with a character like '*' before doing the SELECT? If I do that with the data input as in your example, I see the first character as \0 and the rest of the data as the garbage in the array: (0).*(42).*(42).*(42).*(42).*(42).*(42).*(42). > \(92).0(48).1(49).1(49).A(65).\(92).2(50).0(48) <- why escape sequence? Non-printable characters are escaped unless you request the results in binary; see the documentation link posted earlier. > \(92).0(48).1(49).2(50).(0).(0).(0).(0) <- dropped the last 6 bytes > > So I'm pretty sure that the data is stored as a string, and it > is being terminated at the '\0'. Bytea and other variable-length types are stored as a length attribute and then the data itself. See the discussion of TOAST in the documentation: http://www.postgresql.org/docs/8.0/interactive/storage-toast.html > ECPG not only has the same behavior, but the conversion of the > data to a char array seems incorrect (maybe the ECPG pre-compiler > is thinking that I want to_char() of the value of the byte array). Unless you request binary results, queries return text representations of the data; for bytea that means that non-printable characters will be escaped. > So my question is, how would you be able to store any data in a > bytea if it contained any bytes with the value of 0 (I'm sure there > are images stored as bytea and they have 0 byte values). I don't > need to store an image, just these 8 bytes. Escape the data correctly with two backslashes; the "Binary Data Types" documentation explains why this is necessary. -- Michael Fuhr
В списке pgsql-interfaces по дате отправления: