Обсуждение: copy ... from stdin csv; and bytea
My application is adding a bytea column to a table into which data is dumped in approximately 4k row batches, one batch approximately every 10 seconds. To this point, those dumps have used copy from stdin; however, I'm having some difficulty getting bytea encodings to work with it. Specifically, I can't seem to get the parser to recognize that what I'm handing it is an escaped string that needs to be parsed back into individual bytes rather than stored as-is. The encoding is very straightforward for INSERT, of course, but the COPY ... FROM STDIN CSV doesn't seem to want to work no matter what I've tried: \\000 "\\000" "E'\\000'" etc. Is there a trick to this that I just didn't see in the documentation, or is this some limitation of CSV copy-in? If the latter, are there suggestions for workarounds other than to fallback on the inserts? Using 8.3.3, and this is specifically via libpq, if that makes a difference. Thanks much. -- - David T. Wilson david.t.wilson@gmail.com
David Wilson wrote: > My application is adding a bytea column to a table into which data is > dumped in approximately 4k row batches, one batch approximately every > 10 seconds. To this point, those dumps have used copy from stdin; > however, I'm having some difficulty getting bytea encodings to work > with it. Specifically, I can't seem to get the parser to recognize > that what I'm handing it is an escaped string that needs to be parsed > back into individual bytes rather than stored as-is. The encoding is > very straightforward for INSERT, of course, but the COPY ... FROM > STDIN CSV doesn't seem to want to work no matter what I've tried: > > \\000 > "\\000" > "E'\\000'" > > etc. > > Is there a trick to this that I just didn't see in the documentation, > or is this some limitation of CSV copy-in? If the latter, are there > suggestions for workarounds other than to fallback on the inserts? > Try just a single \ e.g. "ge.xls","application/vnd.ms-excel",71168,"\320\317\021\340\241[snip] klint. -- Klint Gore Database Manager Sheep CRC A.G.B.U. University of New England Armidale NSW 2350 Ph: 02 6773 3789 Fax: 02 6773 3266 EMail: kgore4@une.edu.au
On Mon, Jul 28, 2008 at 1:24 AM, Klint Gore <kgore4@une.edu.au> wrote: > Try just a single \ > > e.g. > "ge.xls","application/vnd.ms-excel",71168,"\320\317\021\340\241[snip] Thanks- I did try that, and it at least gave the expected output from select, but is there a way to verify that it's actually handling it correctly rather than simply storing the sequence of characters? I'm not certain how to check the actual byte width of a column within a row, and I'd *really* rather not be storing 4 bytes for every 1 in the binary if I can avoid it- this column is already going to be doubling field width; quadrupling it would give me space headaches I really don't want to deal with. :) -- - David T. Wilson david.t.wilson@gmail.com
David Wilson wrote: > On Mon, Jul 28, 2008 at 1:24 AM, Klint Gore <kgore4@une.edu.au> wrote: > > Try just a single \ > > > > e.g. > > "ge.xls","application/vnd.ms-excel",71168,"\320\317\021\340\241[snip] > > Thanks- I did try that, and it at least gave the expected output from > select, but is there a way to verify that it's actually handling it > correctly rather than simply storing the sequence of characters? I'm > not certain how to check the actual byte width of a column within a > row, and I'd *really* rather not be storing 4 bytes for every 1 in the > binary if I can avoid it- this column is already going to be doubling > field width; quadrupling it would give me space headaches I really > don't want to deal with. :) > > select length(bytea_field) from table You could use ||pg_relation_size|(|text|)| or ||pg_total_relation_size|(|text|) |to see how much disk space it takes up. You can play with the storage settings for the column if you want to try and handle the space better. see alter table set storage. klint. -- Klint Gore Database Manager Sheep CRC A.G.B.U. University of New England Armidale NSW 2350 Ph: 02 6773 3789 Fax: 02 6773 3266 EMail: kgore4@une.edu.au
Klint Gore <kgore4@une.edu.au> writes: > David Wilson wrote: >> I'm not certain how to check the actual byte width of a column within a >> row, > select length(bytea_field) from table If you want the actual on-disk footprint, use pg_column_size() regards, tom lane
Tom Lane wrote: > Klint Gore <kgore4@une.edu.au> writes: > > David Wilson wrote: > >> I'm not certain how to check the actual byte width of a column within a > >> row, > > > select length(bytea_field) from table > > If you want the actual on-disk footprint, use pg_column_size() > > Size on disk would have the compression from the default storage = extended wouldn't it? I verified it for myself manually anyway. copy (select * from original limit 5) to stdout with csv; create table foo (like original); alter table foo alter column bytea_field set storage external; copy foo from stdin with csv; select |reltoastrelid from pg_class where relanem = 'original' found the file for it and looked at it with a hex viewer. | klint. -- Klint Gore Database Manager Sheep CRC A.G.B.U. University of New England Armidale NSW 2350 Ph: 02 6773 3789 Fax: 02 6773 3266 EMail: kgore4@une.edu.au