Re: Bulk DML performance

Поиск
Список
Период
Сортировка
От Álvaro Herrera
Тема Re: Bulk DML performance
Дата
Msg-id 202503171053.nlf3ey2vm2vu@alvherre.pgsql
обсуждение исходный текст
Ответ на RE: Bulk DML performance  (<bill.poole@ymail.com>)
Список pgsql-performance
On 2025-Mar-13, bill.poole@ymail.com wrote:

> I need to perform a merge (INSERT ... ON CONFLICT ... DO UPDATE) on
> the data, so sadly I cannot use COPY.
> 
> I have discovered that for some reason, performing the original insert
> without the ON CONFLICT statement is twice as fast as performing the
> original insert with an ON CONFLICT ... DO UPDATE clause, completing
> in 4 seconds instead of 8. That seems strange to me because I wouldn't
> have thought it would be doing any additional work since a unique
> constraint is on the primary key, so each inserted value would need to
> be checked in either case, and there is no extra work to be done in
> either case.

As I recall, INSERT .. ON CONFLICT UPDATE requires to _insert_ a value
in the index prior to inserting the heap tuple, to guarantee uniqueness
in face of potentially concurrent inserters of the same value.  Maybe
have a look at the WAL produced by the operation with "pg_waldump -z" to
get some idea of the volume of each type of record.


Maybe you could try to use MERGE rather than INSERT .. ON CONFLICT
UPDATE.  The concurrency modelling there is different, and it will
probably have lower overhead.  But you may need to lock the table
explicitly to prevent concurrency problems.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)



В списке pgsql-performance по дате отправления: