Обсуждение: Import: I need help

Поиск
Список
Период
Сортировка

Import: I need help

От
Quan Zongliang
Дата:
Hi, all

I haven't any experience with libpq. So there are a lot of questions.

When import to db using INSERT statement. Like this:
   res = PQexec(m_conn->connection(), "BEGIN");
   LOOP: res = PQexec(m_conn->connection(), "INSERT INTO ...");
   res = PQexec(m_conn->connection(), "COMMIT");
If any INSERT statement failed in the middle, the whole transaction is aborted.
In my plan, the end user can select a number of rows to commit their work.
(Option: commit every xxx rows.)
Is there a approach to ingnore error statement and leave current transaction
in normal status? If it can be did, the program only need log the error and
go on to deal with left statements.
Or when encounter a error, stop the deal?

If the end user decide to import with INSERT statement,
these statements will be sent to db with asynchronous mode:
LOOP:
   1: submit statement asynchronous
   2: generate next statement
   3: wait for last one to be completed
   4: log
   goto 1
Is it OK?
Or we should use synchronous mode?

First, import function with INSERT statement will be implemented.
The questions about COPY command will come soon. ^-^

A discussion:
In schemas, the pgDatatype had been defined. but it use type-name to recognise type.
I noticed system types had been defined in server/catalog/pg_type.h:
#define BOOLOID            16
#define BYTEAOID        17
and so on.
Why don't use these defination in the pgAdmin?
The judgement with (type==BOOLOID) will get more effective than (name == wxT("boolean")).

Thanks.

-----------------------------------------------
Quan Zongliang
quanzongliang@gmail.com
CIT Japan:  http://www.cit.co.jp
CIT China:  http://www.citbj.com.cn


Re: Import: I need help

От
Dave Page
Дата:
Hi,

On Fri, Feb 13, 2009 at 3:08 AM, Quan Zongliang <quanzongliang@gmail.com> wrote:
> Hi, all
>
> I haven't any experience with libpq. So there are a lot of questions.
>
> When import to db using INSERT statement. Like this:
>   res = PQexec(m_conn->connection(), "BEGIN");
>   LOOP: res = PQexec(m_conn->connection(), "INSERT INTO ...");
>   res = PQexec(m_conn->connection(), "COMMIT");
> If any INSERT statement failed in the middle, the whole transaction is aborted.
> In my plan, the end user can select a number of rows to commit their work.
> (Option: commit every xxx rows.)
> Is there a approach to ingnore error statement and leave current transaction
> in normal status? If it can be did, the program only need log the error and
> go on to deal with left statements.
> Or when encounter a error, stop the deal?

You can use savepoints:
http://www.postgresql.org/docs/8.3/interactive/sql-savepoint.html

> If the end user decide to import with INSERT statement,
> these statements will be sent to db with asynchronous mode:
> LOOP:
>   1: submit statement asynchronous
>   2: generate next statement
>   3: wait for last one to be completed
>   4: log
>   goto 1
> Is it OK?
> Or we should use synchronous mode?

Sounds OK. Async is naturally more complex than sync mode, but it does
allow for a more responsive UI.

> First, import function with INSERT statement will be implemented.
> The questions about COPY command will come soon. ^-^

That's the important mode - it'll be much faster than explicit
imports, but will lack the ability to use savepoints or async
processing.

> A discussion:
> In schemas, the pgDatatype had been defined. but it use type-name to recognise type.
> I noticed system types had been defined in server/catalog/pg_type.h:
> #define BOOLOID                 16
> #define BYTEAOID                17
> and so on.
> Why don't use these defination in the pgAdmin?
> The judgement with (type==BOOLOID) will get more effective than (name == wxT("boolean")).

Those values may, in theory, change between versions of the server,
and also won't exist for user defined types.

--
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com

Re: Import: I need help

От
Guillaume Lelarge
Дата:
Hi,

Quan Zongliang a écrit :
> [...]
> I haven't any experience with libpq. So there are a lot of questions.
>
> When import to db using INSERT statement. Like this:
>    res = PQexec(m_conn->connection(), "BEGIN");
>    LOOP: res = PQexec(m_conn->connection(), "INSERT INTO ...");
>    res = PQexec(m_conn->connection(), "COMMIT");
> If any INSERT statement failed in the middle, the whole transaction is aborted.
> In my plan, the end user can select a number of rows to commit their work.
> (Option: commit every xxx rows.)
> Is there a approach to ingnore error statement and leave current transaction
> in normal status?

No. If you have an error, the whole transaction is dead. You can deal
with this in two ways. The first one involves doing multiple
COMMIT;BEGIN; statements:

res = PQexec(m_conn->connection(), "BEGIN");
LOOP:
  res = PQexec(m_conn->connection(), "INSERT INTO ...");
  if more than x INSERTs
    res = PQexec(m_conn->connection(), "COMMIT");
    res = PQexec(m_conn->connection(), "BEGIN");
ENDLOOP
res = PQexec(m_conn->connection(), "COMMIT");

The second way is to use SAVEPOINT.

> [...]
> First, import function with INSERT statement will be implemented.
> The questions about COPY command will come soon. ^-^
>

I think you should take a look at pgloader:
  http://pgfoundry.org/projects/pgloader/

It's written in Python but Dimitri had to deal with the same issues than
you. pgloader is able to understand which INSERT fails, and can put them
in a file. It can also use COPY statements.


--
Guillaume.
 http://www.postgresqlfr.org
 http://dalibo.com

Re: Import: I need help

От
Quan Zongliang
Дата:
> The second way is to use SAVEPOINT.

> You can use savepoints:
> http://www.postgresql.org/docs/8.3/interactive/sql-savepoint.html

This a good idea. I forgot it.

> I think you should take a look at pgloader:
>   http://pgfoundry.org/projects/pgloader/
Ok, before the work go on, I study from them.


And more one question:
I think the import work should get a new PGconn instance before it starts.
in document: http://www.postgresql.org/docs/8.3/static/libpq-copy.html said:
It is not possible to execute other SQL commands using the same connection
while the COPY operation is in progress.

How about the INSERT command?
When them start work within a new transaction. It seems to fall into same situation.

-----------------------------------------------
Quan Zongliang
quanzongliang@gmail.com
CIT Japan:  http://www.cit.co.jp
CIT China:  http://www.citbj.com.cn


Re: Import: I need help

От
Quan Zongliang
Дата:
> Those values may, in theory, change between versions of the server,
> and also won't exist for user defined types.
In fact, another type constants had been defined in utils\pgDefs.h
and used in the dlgTypeProperty class.

When I validated the bug report about bit varying from Mr. Hiromichi Nakajima.
New bug be found: can't set length of bit varying type column.
I am trying to fix it and found these definitions.

-----------------------------------------------
Quan Zongliang
quanzongliang@gmail.com
CIT Japan:  http://www.cit.co.jp
CIT China:  http://www.citbj.com.cn


Re: Import: I need help

От
Dave Page
Дата:
On Fri, Feb 13, 2009 at 9:21 AM, Quan Zongliang <quanzongliang@gmail.com> wrote:

> And more one question:
> I think the import work should get a new PGconn instance before it starts.
> in document: http://www.postgresql.org/docs/8.3/static/libpq-copy.html said:
> It is not possible to execute other SQL commands using the same connection
> while the COPY operation is in progress.
>
> How about the INSERT command?
> When them start work within a new transaction. It seems to fall into same situation.

Yes, - you should certainly use a new connection to avoid any conflict
with the main browser.  What would be really cool (later, not now)
would be to divide the workload amongst a user-defined number of
parallel connections.


--
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com