Re: Bulkloading using COPY - ignore duplicates?

From: Peter Eisentraut
Subject: Re: Bulkloading using COPY - ignore duplicates?
Date:
Msg-id: Pine.LNX.4.30.0112131714310.647-100000@peter.localdomain
In reply to: Bulkloading using COPY - ignore duplicates?  (Lee Kindness <lkindness@csl.co.uk>)
List: pgsql-hackers
Lee Kindness writes:

>  1. Performance enhancements when doing bulk inserts - pre- or
> post-processing the data to remove duplicates is very time
> consuming. Likewise, the best tool should always be used for the job
> at hand, and for searching/removing things it's a database.

Arguably, a better tool for this is sort(1).  For instance, if you have a
typical copy input file with tab-separated fields and the primary key is
in columns 1 and 2, you can remove duplicates with

sort -k 1,2 -u INFILE > OUTFILE

To get a record of what duplicates were removed, use diff.
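The recipe above can be sketched end to end. This is a minimal illustration, not taken from the original mail: the sample rows and the file names SORTED and REMOVED are invented here; only INFILE/OUTFILE and the sort invocation come from the message.

```shell
# Invented sample data: tab-separated rows, primary key in columns 1 and 2.
# The second row duplicates the key of the first.
printf 'a\t1\tfirst\na\t1\tdup\nb\t2\tsecond\n' > INFILE

# A sorted copy with duplicates still present, for comparison below.
sort -k 1,2 INFILE > SORTED

# -u keeps one line per key (columns 1-2) and discards the rest.
sort -k 1,2 -u INFILE > OUTFILE

# diff exits non-zero when files differ, hence the trailing "|| true".
# Lines prefixed with '<' are the duplicates that were dropped.
diff SORTED OUTFILE > REMOVED || true
cat REMOVED
```

Note that which line of a duplicate group survives is an implementation detail of sort -u, so this approach is only safe when any copy of a duplicated key is acceptable.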

-- 
Peter Eisentraut   peter_e@gmx.net


