Re: COPY FROM STDIN instead of INSERT

Поиск
Список
Период
Сортировка
От Casey Duncan
Тема Re: COPY FROM STDIN instead of INSERT
Дата
Msg-id 86D1B80D-CDAA-4F0A-BC71-AFA4A8C07629@pandora.com
обсуждение исходный текст
Ответ на Re: COPY FROM STDIN instead of INSERT  ("Ilja Golshtein" <ilejn@yandex.ru>)
Ответы Re: COPY FROM STDIN instead of INSERT
Список pgsql-general
On Oct 18, 2006, at 5:20 AM, Ilja Golshtein wrote:

>> When starting a database from scratch it is much faster to import the
>> data and then create the indexes. The time to create index on a full
>> table is less than the extra time from each index update from the
>> inserts. The more indexes to update the more time updating indexes
>> takes.
>>
>> The problem with a live database is removing the indexes slows down
>> current users and if you are adding 2,000 rows to a table that
>> already
>> has 5,000,000 rows in it then you will loose the benefit.
>
> I am 100% agree with you. What you are describing is a very good
> and useful technique for some maintenance operations.
>
> My current goal is to increase performance in normal [almost ;)]
> OLTP mode of my application, so removing indexes for some time is
> not an option here.
>
> And my question remains.
> Is it Ok to use COPY BINARY FROM STDIN instead of multiple INSERTs?

I don't think I would use BINARY, it seems likely to be susceptible
to changes in the underlying data type storage. From the docs:

"To determine the appropriate binary format for the actual tuple data
you should consult the PostgreSQL source, in particular the *send and
*recv functions for each column's data type (typically these
functions are found in the src/backend/utils/adt/ directory of the
source distribution)."

Regular text COPY is quite a big performance win over INSERTs, and
doesn't require your application to know the vagaries of the data
storage. Also, if you have many indices, time to update them will
probably dominate anyhow, making the difference between binary and
text copy negligible.

A was mentioned, COPY can only insert static data, and does not
support rules (that you might use to support constraint exclusion,
etc). AFIAK, the locking semantics are the same as INSERT, i.e., it
does not lock the entire table or anything like that. Since it would
allow transactions that insert data to finish faster, it should
actually work better under high concurrency.

-Casey

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Stats Collector Won't Start
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] query log corrupted-looking entries