Re: Duplicated entries are not ignored even if a "do instead nothing" rule is added.

From	Alban Hertroys
Subject	Re: Duplicated entries are not ignored even if a "do instead nothing" rule is added.
Msg-id	7ED04668-3FD2-423D-93EE-F2F1D4450171@gmail.com
In reply to	Duplicated entries are not ignored even if a "do instead nothing" rule is added.  (邓尧 <torshie@gmail.com>)
Responses	Re: Duplicated entries are not ignored even if a "do instead nothing" rule is added.  (Merlin Moncure <mmoncure@gmail.com>)
List	pgsql-general
On 3 Jan 2012, at 5:20, 邓尧 wrote:

> Hi,
>
> I'm new to pgsql, I need to do something like the "INSERT IGNORE" in mysql. After some searching I got a solution,
> which is adding a "do instead nothing" rule to the corresponding table, but it fails sometimes.

Yeah, if a concurrent transaction tries to create the same record, one of the transactions is going to find that it
already exists on transaction commit. An INSERT rule is not going to protect you against that.
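
For reference, such a rule typically looks something like this (a hypothetical sketch; the rule definition wasn't
included in the original post):

-- Hypothetical "INSERT IGNORE"-style rule on the account table.
-- The EXISTS check only sees rows that are already committed, so two
-- sessions inserting the same home value at the same time can both pass
-- the check, and one of them will still fail on the unique index.
CREATE RULE account_ignore_duplicates AS
    ON INSERT TO account
    WHERE EXISTS (SELECT 1 FROM account WHERE home = NEW.home)
    DO INSTEAD NOTHING;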

> The table and the rule are created with the following SQL statements:
>
> create sequence ACCOUNT_ID_SEQ;
> create table ACCOUNT (
>     ID bigint primary key default nextval('ACCOUNT_ID_SEQ'),
>     HOME char(255) not null,
>     NAME char(255)
> );
> create unique index on ACCOUNT(HOME);
> create index on ACCOUNT(NAME);

It seems to me that account(home) is actually the PK - do you really need the artificial id column? That is a matter of
personal preference; wars are waged over artificial vs natural keys. People here will usually tell you to use whatever
fits the problem best; both sides have benefits and drawbacks ;)

Another problem you'll have is that char columns are padded up to their full size with spaces - you'll end up trimming
every value in your client applications. You probably want varchar(255) or, perhaps better, text. The latter also rids
you of that 255 length limit from mysql.
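
To illustrate both points, an alternative definition could look roughly like this (just a sketch; keep the surrogate
id column if you prefer, and adjust the types to your data):

create table ACCOUNT (
    HOME text primary key,  -- natural key: replaces the separate unique index on HOME
    NAME text               -- text avoids the blank-padding of char and the 255 limit
);
create index on ACCOUNT(NAME);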

> There are about 20 clients doing the following insertion (no UPDATE, some of them might DELETE):
> begin transaction:
> insert into ACCOUNT(HOME) values (v1);
> insert into ACCOUNT(HOME) values (v2);
> ...
> commit;
>
> Sometimes I get an error saying the unique constraint "account_home_idx" is violated.
>
> Any suggestions?

I assume you're talking about parallel inserts from a multi-process tool for importing this data? If that's the case,
then there are a number of solutions commonly used.

First of all, especially if you're inserting a lot of data like this, see if you can use COPY FROM STDIN instead. That
loads the whole transaction contents in one go, which is a lot more efficient than thousands of sequential inserts. As
it's a single statement that way, you don't even need to wrap it in a transaction anymore - you'll get an implicit
transaction per statement, which in this case is exactly what you want for this single COPY statement.
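
A rough sketch of what that looks like from psql (the values are just placeholders):

-- Single COPY statement instead of thousands of INSERTs; note that a
-- duplicate HOME value would still raise a unique violation here, which
-- is where the staging table below comes in.
COPY ACCOUNT (HOME) FROM STDIN;
value1
value2
value3
\.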

The other thing people usually do is insert the data into a staging table without UNIQUE constraints. After that they
issue:

INSERT INTO account (home)
SELECT DISTINCT home FROM staging_table
WHERE NOT EXISTS (SELECT 1 FROM account WHERE account.home = staging_table.home);
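
Put together, a load could look roughly like this (a sketch only; the temporary staging table and its name are
assumptions on my part):

-- Staging table without the unique constraint.
CREATE TEMP TABLE staging_table (home text);

-- Bulk-load the raw data, duplicates and all.
COPY staging_table (home) FROM STDIN;
value1
value1
value2
\.

-- Copy over only the values that aren't in account yet.
-- Two sessions running this concurrently can still race on the same value,
-- so serialize the loads (or lock account) if that can happen.
INSERT INTO account (home)
SELECT DISTINCT home FROM staging_table
WHERE NOT EXISTS (SELECT 1 FROM account WHERE account.home = staging_table.home);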

Other options are to use external tools written for batch inserting large amounts of data. I seem to recall pgfouine is
such an application, but I've never used it.

Alban Hertroys

--
The scale of a problem often equals the size of an ego.


