Re: Bulk Insert and Index use

Поиск
Список
Период
Сортировка
От Jim J
Тема Re: Bulk Insert and Index use
Дата
Msg-id 41198514.1020103@gvtc.com
обсуждение исходный текст
Ответ на Bulk Insert and Index use  (Rudi Starcevic <tech@wildcash.com>)
Ответы Re: Bulk Insert and Index use
Список pgsql-performance
If the bulk load has the possibility of duplicating data, then you need
to change methods.  Try bulk loading into a temp table,  index it like
the original, eliminate the dups and merge the tables.

It is also possible to do an insert from the temp table into the final
table like:
insert into original (x,x,x)  (select temp.1, temp.2, etc from temp left
join original on temp.street=original.street where original.street is null)

Good Luck
Jim

Rudi Starcevic wrote:

> Hi,
>
> I have a question on bulk checking, inserting into a table and
> how best to use an index for performance.
>
> The data I have to work with is a monthly CD Rom csv data dump of
> 300,000 property owners from one area/shire.
>
> So every CD has 300,000 odd lines, each line of data which fills the
> 'property' table.
>
> Beginning with the first CD each line should require one SELECT and
> one INSERT as it will be the first property with this address.
>
> The SELECT uses fields like 'street' and 'suburb', to check for an
> existing property,
> so I have built an index on those fields.
>
> My question is does each INSERT rebuild the index on the 'street' and
> 'suburb' fields?
> I believe it does but I'm asking to be sure.
>
> If this is the case I guess performance will suffer when I have, say,
> 200,000
> rows in the table.
>
> Would it be like:
>
> a) Use index to search on 'street' and 'suburb'
> b) No result? Insert new record
> c) Rebuild index on 'street' and 'suburb'
>
> for each row?
> Would this mean that after 200,000 rows each INSERT will require
> the index of 000's of rows to be re-indexed?
>
> So far I believe my only options are to use either and index
> or sequential scan and see which is faster.
>
> A minute for your thoughts and/or suggestions would be great.
>
> Thanks.
> Regards,
> Rudi.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>
>


В списке pgsql-performance по дате отправления:

Предыдущее
От: Litao Wu
Дата:
Сообщение: Re: Slow select, insert, update
Следующее
От: "Gregory S. Williamson"
Дата:
Сообщение: Re: Bulk Insert and Index use