Re: COPY from .csv File and Remove Duplicates

Поиск
Список
Период
Сортировка
От David Johnston
Тема Re: COPY from .csv File and Remove Duplicates
Дата
Msg-id 6A1FF13C-7526-4296-A132-B4F4988F5E17@yahoo.com
обсуждение исходный текст
Ответ на COPY from .csv File and Remove Duplicates  (Rich Shepard <rshepard@appl-ecosys.com>)
Ответы Re: COPY from .csv File and Remove Duplicates  (Rich Shepard <rshepard@appl-ecosys.com>)
Re: COPY from .csv File and Remove Duplicates  ("David Johnston" <polobo@yahoo.com>)
Re: COPY from .csv File and Remove Duplicates  (Rich Shepard <rshepard@appl-ecosys.com>)
Список pgsql-general
On Aug 11, 2011, at 19:13, Rich Shepard <rshepard@appl-ecosys.com> wrote:

>  A table has a sequence to generate a primary key for inserted records with
> NULLs in that column.
>
>  I have a .csv file of approximately 10k rows to copy into this table. My
> two questions which have not been answered by reference to my postgres
> reference book or Google searches are:
>
>  1) Will the sequence automatically add the nextval() to each new record as
> the copy command runs?
>
>  2) Many of these rows almost certainly are already in the table. I would
> like to remove duplicates either during the COPY command or immediately
> after. I'm considering copying the new data into a clone of the table then
> running a SELECT to add only those rows in the new cloned table to the
> existing table.
>
>  Suggestions on getting these data in without duplicating existing rows
> will be very helpful.
>

If you have duplicates with matching real keys inserting into a staging table and then moving new records to the final
tableis your best option (in general it is better to do a two-step with a staging table since you can readily use
Postgresqlto perform any intermediate translations)  As for the import itself, I believe if the column with the
sequenceis present in the csv the sequence will not be used and, if no value is present, a null will be stored for that
column- causing any not-null constraint to throw an error.  In this case I would just import the data to a staging
tablewithout any kind of artificial key, just the true key, and then during the merge with the live table you simply
omitthe pk field from the insert statement and the sequence will kick in at that point. 

David J.


В списке pgsql-general по дате отправления:

Предыдущее
От: David Johnston
Дата:
Сообщение: Re: Regex Query Index question
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Regex Query Index question