Re: raw output from copy

From: Pavel Stehule
Subject: Re: raw output from copy
Msg-id: CAFj8pRAx0p3X9T=VB0vpnbG7byx+jv8GrQL6zvmX_My+dq4xnw@mail.gmail.com
In reply to: Re: raw output from copy ("Dickson S. Guedes" <listas@guedesoft.net>)
List: pgsql-hackers


2015-07-23 22:05 GMT+02:00 Dickson S. Guedes <listas@guedesoft.net>:
2015-07-07 3:32 GMT-03:00 Pavel Stehule <pavel.stehule@gmail.com>:
>
> Hi
>
> previous patch was broken, and buggy
>
> Here is new version with fixed upload and more tests
>
> The interesting part is that I did not have to modify the interface or the client, so it should work with any current driver that supports protocol version >= 3.

Hi
 

Hi Pavel,

Here are some thoughts:

1) from docs: "only row data in network byte order are exported or imported."

Should it be "only raw data"?

I don't quite understand. It uses PostgreSQL's built-in "send" functions, and the result of these functions is defined as "in network byte order".
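For readers unfamiliar with the term, here is a minimal sketch of what "network byte order" means for a 4-byte integer: the most significant byte comes first, regardless of the host CPU. The helper names `put_int32_network` and `get_int32_network` are hypothetical and are not PostgreSQL functions; the sketch only assumes a POSIX system that provides `htonl`/`ntohl`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>          /* htonl, ntohl (POSIX) */

/*
 * Hypothetical helper: store a 32-bit integer in network byte order
 * (big-endian), most significant byte first.
 */
static void
put_int32_network(int32_t value, unsigned char out[4])
{
    uint32_t    be;

    assert(out != NULL);
    be = htonl((uint32_t) value);   /* host order -> network order */
    memcpy(out, &be, sizeof be);
}

/* Hypothetical helper: read the 4 bytes back into a host-order integer. */
static int32_t
get_int32_network(const unsigned char in[4])
{
    uint32_t    be;

    assert(in != NULL);
    memcpy(&be, in, sizeof be);
    return (int32_t) ntohl(be);     /* network order -> host order */
}
```

With this layout the value 1 is always stored as the bytes 00 00 00 01, on both little-endian and big-endian hosts.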

 

2) from docs "Because this format doesn't support any delimiter, only
one value can be exported or imported. NULL values are not allowed."

That "only one value can be exported or imported" is a little sad for
someone with a table with more than one column that accepts bytea. The
implemented feature doesn't cover the use case where a table 'image'
has columns: id integer, image bytea, thumbnail bytea, and I want to
import binary data into that. We could put here the cases where we have
NOT NULL columns. Since these are expected and the error messages
complain about that, couldn't they be covered in the docs more explicitly?

This mode is not meant to replace the current COPY binary mode. RAW binary output for multiple fields is a terribly complex task: you can use a fixed length, you can use some special separator, etc. I remember terribly complex bulk loads on Oracle or MSSQL, and I would like to design this differently. I prefer to keep the COPY statement as simple as possible. If you need to import/export all fields in a record, then you can:

1. use the new LO API (for import): load the binary files as LOs, INSERT, and drop the used LOs,
2. call several COPY statements and join the exported files with operating system tools (for export),
3. write a specialized application that supports the COPY API and exports/imports data in your preferred format.

The same complexity applies to input, and I would not want to write a generic binary file parser.
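To make the framing problem concrete, here is a minimal sketch of what a specialized application (option 3 above) might do on its own side. It is not part of the patch and not something proposed for COPY. It shows that a separator byte can occur inside binary data, and one possible alternative: prefixing each field with a 4-byte big-endian length. All names (`contains_byte`, `put_field`, `get_field`) and the format itself are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the payload contains the candidate separator byte. */
static int
contains_byte(const unsigned char *data, size_t len, unsigned char delim)
{
    assert(data != NULL);
    return memchr(data, delim, len) != NULL;
}

/*
 * Append one field as a 4-byte big-endian length followed by the payload.
 * Returns the number of bytes written, or 0 if the field does not fit.
 */
static size_t
put_field(unsigned char *out, size_t cap,
          const unsigned char *data, size_t len)
{
    assert(out != NULL && data != NULL);
    if (cap < 4 || len > cap - 4 ||
        (unsigned long long) len > 0xFFFFFFFFULL)
        return 0;
    out[0] = (unsigned char) ((len >> 24) & 0xFF);
    out[1] = (unsigned char) ((len >> 16) & 0xFF);
    out[2] = (unsigned char) ((len >> 8) & 0xFF);
    out[3] = (unsigned char) (len & 0xFF);
    memcpy(out + 4, data, len);
    return 4 + len;
}

/*
 * Read one field. On success sets *data and *len and returns the number
 * of bytes consumed; returns 0 if the input is truncated.
 */
static size_t
get_field(const unsigned char *in, size_t avail,
          const unsigned char **data, size_t *len)
{
    size_t      n;

    assert(in != NULL && data != NULL && len != NULL);
    if (avail < 4)
        return 0;
    n = ((size_t) in[0] << 24) | ((size_t) in[1] << 16) |
        ((size_t) in[2] << 8) | (size_t) in[3];
    if (n > avail - 4)
        return 0;
    *data = in + 4;
    *len = n;
    return 4 + n;
}
```

Even this small format already needs bounds checks and truncation handling on the reading side, which is the kind of parser complexity the reply above wants to keep out of COPY.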

 

3) from code: "bool row_processed; /* true, when first row was processed */"

In this mode there is only one row, so first_row_processed sounds a little bit strange.
 

Maybe rename the variable to something like `first_row_processed` and
remove the comment?

4) from code:

if (cstate->raw)
    format = 2;
else if (cstate->binary)
    format = 1;
else
    format = 0;

Maybe create a constant for code readability?

Good idea.
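One way the suggested constants could look is sketched below. The enum and its member names are hypothetical, as is the `CopyFlags` struct, which only stands in for the two flags used in the quoted snippet and is not the real COPY state structure. The numeric values mirror the literals 0, 1 and 2 from the quoted code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the values match the literals in the quoted snippet. */
typedef enum CopyWireFormat
{
    COPY_WIRE_FORMAT_TEXT = 0,
    COPY_WIRE_FORMAT_BINARY = 1,
    COPY_WIRE_FORMAT_RAW = 2
} CopyWireFormat;

/* Stand-in for the two flags tested in the quoted code. */
typedef struct CopyFlags
{
    bool        raw;
    bool        binary;
} CopyFlags;

/* Same decision order as the quoted if/else chain: raw wins over binary. */
static CopyWireFormat
copy_wire_format(const CopyFlags *cstate)
{
    assert(cstate != NULL);
    if (cstate->raw)
        return COPY_WIRE_FORMAT_RAW;
    if (cstate->binary)
        return COPY_WIRE_FORMAT_BINARY;
    return COPY_WIRE_FORMAT_TEXT;
}
```

A caller would then write `format = copy_wire_format(cstate);` and compare against the named constants instead of bare numbers.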


If on the one hand this feature does not cover a more generalized case,
on the other it is a nice start, IMHO.

That is exactly what I don't want: the complexity of usage can go sky-high with generic binary-format file processing.

Regards

Pavel
 

--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://github.com/guedes - http://guedesoft.net
http://www.postgresql.org.br
