Обсуждение: copy ... from stdin csv; and bytea

Поиск
Список
Период
Сортировка

copy ... from stdin csv; and bytea

От
"David Wilson"
Дата:
My application is adding a bytea column to a table into which data is
dumped in approximately 4k row batches, one batch approximately every
10 seconds. To this point, those dumps have used copy from stdin;
however, I'm having some difficulty getting bytea encodings to work
with it. Specifically, I can't seem to get the parser to recognize
that what I'm handing it is an escaped string that needs to be parsed
back into individual bytes rather than stored as-is. The encoding is
very straightforward for INSERT, of course, but the COPY ... FROM
STDIN CSV doesn't seem to want to work no matter what I've tried:

\\000
"\\000"
"E'\\000'"

etc.

Is there a trick to this that I just didn't see in the documentation,
or is this some limitation of CSV copy-in? If the latter, are there
suggestions for workarounds other than to fallback on the inserts?

Using 8.3.3, and this is specifically via libpq, if that makes a difference.

Thanks much.
--
- David T. Wilson
david.t.wilson@gmail.com

Re: copy ... from stdin csv; and bytea

От
Klint Gore
Дата:
David Wilson wrote:
> My application is adding a bytea column to a table into which data is
> dumped in approximately 4k row batches, one batch approximately every
> 10 seconds. To this point, those dumps have used copy from stdin;
> however, I'm having some difficulty getting bytea encodings to work
> with it. Specifically, I can't seem to get the parser to recognize
> that what I'm handing it is an escaped string that needs to be parsed
> back into individual bytes rather than stored as-is. The encoding is
> very straightforward for INSERT, of course, but the COPY ... FROM
> STDIN CSV doesn't seem to want to work no matter what I've tried:
>
> \\000
> "\\000"
> "E'\\000'"
>
> etc.
>
> Is there a trick to this that I just didn't see in the documentation,
> or is this some limitation of CSV copy-in? If the latter, are there
> suggestions for workarounds other than to fallback on the inserts?
>
Try just a single \

e.g.
"ge.xls","application/vnd.ms-excel",71168,"\320\317\021\340\241[snip]

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au


Re: copy ... from stdin csv; and bytea

От
"David Wilson"
Дата:
On Mon, Jul 28, 2008 at 1:24 AM, Klint Gore <kgore4@une.edu.au> wrote:
> Try just a single \
>
> e.g.
> "ge.xls","application/vnd.ms-excel",71168,"\320\317\021\340\241[snip]

Thanks- I did try that, and it at least gave the expected output from
select, but is there a way to verify that it's actually handling it
correctly rather than simply storing the sequence of characters? I'm
not certain how to check the actual byte width of a column within a
row, and I'd *really* rather not be storing 4 bytes for every 1 in the
binary if I can avoid it- this column is already going to be doubling
field width; quadrupling it would give me space headaches I really
don't want to deal with. :)

--
- David T. Wilson
david.t.wilson@gmail.com

Re: copy ... from stdin csv; and bytea

От
Klint Gore
Дата:
David Wilson wrote:
> On Mon, Jul 28, 2008 at 1:24 AM, Klint Gore <kgore4@une.edu.au> wrote:
> > Try just a single \
> >
> > e.g.
> > "ge.xls","application/vnd.ms-excel",71168,"\320\317\021\340\241[snip]
>
> Thanks- I did try that, and it at least gave the expected output from
> select, but is there a way to verify that it's actually handling it
> correctly rather than simply storing the sequence of characters? I'm
> not certain how to check the actual byte width of a column within a
> row, and I'd *really* rather not be storing 4 bytes for every 1 in the
> binary if I can avoid it- this column is already going to be doubling
> field width; quadrupling it would give me space headaches I really
> don't want to deal with. :)
>
>
select length(bytea_field) from table

You could use ||pg_relation_size|(|text|)| or
||pg_total_relation_size|(|text|) |to see how much disk space it takes up.

You can play with the storage settings for the column if you want to try
and handle the space better. see alter table set storage.

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au


Re: copy ... from stdin csv; and bytea

От
Tom Lane
Дата:
Klint Gore <kgore4@une.edu.au> writes:
> David Wilson wrote:
>> I'm not certain how to check the actual byte width of a column within a
>> row,

> select length(bytea_field) from table

If you want the actual on-disk footprint, use pg_column_size()

            regards, tom lane

Re: copy ... from stdin csv; and bytea

От
Klint Gore
Дата:
Tom Lane wrote:
> Klint Gore <kgore4@une.edu.au> writes:
> > David Wilson wrote:
> >> I'm not certain how to check the actual byte width of a column within a
> >> row,
>
> > select length(bytea_field) from table
>
> If you want the actual on-disk footprint, use pg_column_size()
>
>
Size on disk would have the compression from the default storage =
extended wouldn't it?

I verified it for myself manually anyway.

copy (select * from original limit 5) to stdout with csv;

create table foo (like original);
alter table foo alter column bytea_field set storage external;
copy foo from stdin with csv;

select |reltoastrelid from pg_class where relanem = 'original'

found the file for it and looked at it with a hex viewer.
|
klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au