Re: Copy Bulk Ignore Duplicated

From: Adrian Klaver
Subject: Re: Copy Bulk Ignore Duplicated
Date:
Msg-id: 46160c63-4e2f-ad70-5c33-260eef49557d@aklaver.com
In reply to: Copy Bulk Ignore Duplicated (Leandro Guimarães <leo.guimaraes@gmail.com>)
List: pgsql-general
On 6/14/19 2:04 PM, Leandro Guimarães wrote:
> Hi,
>     I have a scenario with a large table that I'm trying to load via a 
> COPY command from a CSV file.
> 
>     Everything works, but sometimes my source .csv file contains rows that 
> already exist in the previously filled table. If I add a check constraint and 
> try to run the COPY command, I get an error that stops the whole insertion.
> 
>    I've tried to put the data in a temp table and fill the main table using 
> DISTINCT this way (the fields and names are just examples):
> 
> INSERT INTO final_table (name, document)
>     SELECT DISTINCT name, document
>     FROM tmp_TABLE t1
>     WHERE NOT EXISTS (
>         SELECT 1 FROM final_table t2
>         WHERE (t2.name, t2.document)
>         IS NOT DISTINCT FROM (t1.name, t1.document))

Hit enter too soon on previous post.
Alternative to the above query (rough code):

DELETE FROM tmp_TABLE WHERE <check constraint parameters>;
INSERT INTO final_table SELECT * FROM tmp_TABLE;
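
Fleshed out, the whole flow might look something like this. A rough sketch 
only; the staging table name tmp_table, the (name, document) key, and the 
CSV path are assumptions carried over from the examples above:

BEGIN;

-- Hypothetical staging table with the same layout as final_table.
CREATE TEMP TABLE tmp_table (LIKE final_table INCLUDING DEFAULTS);

-- Bulk load the CSV; duplicates are tolerated here.
COPY tmp_table FROM '/path/to/data.csv' WITH (FORMAT csv);

-- Drop staging rows that already exist in final_table.
-- IS NOT DISTINCT FROM treats NULLs as equal, matching the
-- NOT EXISTS query quoted above.
DELETE FROM tmp_table t1
USING final_table t2
WHERE (t2.name, t2.document) IS NOT DISTINCT FROM (t1.name, t1.document);

-- Drop duplicates inside the CSV itself, keeping one row per key.
DELETE FROM tmp_table a
USING tmp_table b
WHERE a.ctid > b.ctid
  AND (a.name, a.document) IS NOT DISTINCT FROM (b.name, b.document);

-- Whatever is left is new and unique.
INSERT INTO final_table SELECT * FROM tmp_table;

COMMIT;

Note that COPY FROM a file path reads on the server and needs superuser; 
psql's \copy does the same thing from the client side.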

> 
> The problem is that my final_table is a large (and partitioned) table 
> and this query is taking a long time to execute.
> 
> Does anyone have any idea (really guys, anything would be great) how to 
> solve this situation? I need duplicates to be ignored instead of getting an error.
> 
> I'm using *PostgreSQL 9.4*, so I can't use "ON CONFLICT", and upgrading is 
> not an option.
> 
> Thanks and Kind Regards!
> 
> 
> Leandro Guimarães
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


