Re: copy from questions

From Steve Crawford
Subject Re: copy from questions
Date
Msg-id 50D1EFE0.8010205@pinpointresearch.com
In response to copy from questions  (Kirk Wythers <kirk.wythers@gmail.com>)
Responses Re: copy from questions
List pgsql-general
On 12/19/2012 08:13 AM, Kirk Wythers wrote:
> I am using version 9.1 and have a large number of files to insert. I am
> trying to use a simple COPY FROM command but have a couple questions.
>
> 1. There are a small number of instances where there are duplicate records
> that are being caught by the primary key (as it is supposed to do). However,
> the duplicate records are artifacts of a recording system and can be ignored.
> I would like to be able to simply skip the duplicate or UPDATE the table with
> the duplicate… Anything that allows the COPY FROM to proceed while adding
> only one of the duplicate records to the table.
>
> 2. Since I have several hundred files to perform a COPY FROM on, I'd like to
> automate the import in some way… sort of a, grab all files in the directory
> approach:
>
> COPY newtable FROM '/directory_of_files/*' WITH CSV HEADER DELIMITER AS ',' NULL AS 'NA';
>
>
I suppose you could use a trigger to check each record before inserting,
but that is likely to be inefficient for bulk loads. A quick bash loop
is probably your best bet. Something along the lines of:

for inputfile in /infiledirectory/*.csv
do
     cat "$inputfile" | psql [connection-params] -c '\copy rawinput from stdin csv header...'
done

This imports everything into a "staging" table (I called it rawinput).
From there you can create your final table with SELECT DISTINCT...

For speed, make sure that you create your staging table as "unlogged".
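
Putting those pieces together, a rough sketch of the whole workflow might look like the following. The names here are assumptions rather than anything fixed: newtable is taken from the original question, rawinput is the staging table, load_directory is a made-up helper name, and psql is assumed to pick up its connection settings from the usual PG* environment variables. Note that SELECT DISTINCT * only collapses rows that are identical in every column; duplicates that share a key but differ elsewhere would need something like DISTINCT ON instead.

```shell
#!/usr/bin/env bash
# Sketch of the staging-table workflow. Assumptions: the target table is
# "newtable", the staging table is "rawinput", and psql can connect using
# the standard PG* environment variables.
shopt -s nullglob            # an empty directory yields zero loop iterations

PSQL=${PSQL:-psql}           # overridable to point at another psql binary

load_directory() {
    local dir=$1 f

    # LIKE copies the column definitions but not the primary key, so
    # duplicate rows can land in the staging table without an error.
    "$PSQL" -c 'CREATE UNLOGGED TABLE rawinput (LIKE newtable)' || return 1

    for f in "$dir"/*.csv; do
        # \copy runs client-side; each file is fed to psql on stdin.
        "$PSQL" -c '\copy rawinput from stdin csv header' < "$f" || return 1
    done

    # Keep a single copy of each fully identical row.
    "$PSQL" -c 'INSERT INTO newtable SELECT DISTINCT * FROM rawinput'
}
```

It would be called as, for example, load_directory /infiledirectory. Any extra COPY options (delimiter, null string) would go into the \copy line.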

Cheers,
Steve
