Re: Duplicate deletion optimizations

From	Jochen Erwied
Subject	Re: Duplicate deletion optimizations
Date	January 9, 2012, 03:42:24
Msg-id	1723406361.20120107151837@erwied.eu
In reply to	Re: Duplicate deletion optimizations ("Marc Mamin" <M.Mamin@intershop.de>)
List	pgsql-performance

Saturday, January 7, 2012, 1:21:02 PM you wrote:

>   where t_imp.id is null and test.id=t_imp.id;
>   =>
>   where t_imp.id is not null and test.id=t_imp.id;

You're right, I overlooked that one. But the increase in the time needed to
execute the query is (maybe not entirely surprisingly) minimal.

Because the query updating the id-column of t_imp fetches all rows from
test to be updated, they are already cached, and the second query is run
completely from cache. I suppose you will get a severe performance hit when
the table cannot be cached...
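
For illustration only, here is a minimal sketch of the flow discussed above, written with Python's sqlite3 module as a stand-in for PostgreSQL so that it runs on its own. Only the table names test and t_imp and the columns id, t_value, t_record and output_id come from this thread; the cnt column, the sample rows and the exact statements are assumptions, not the queries actually benchmarked. It shows why the original predicate matches nothing (a NULL id never compares equal) and how the corrected one splits the import into updates and inserts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema: the cnt column and the sample data are assumptions.
cur.executescript("""
    CREATE TABLE test  (id INTEGER PRIMARY KEY, t_value INTEGER,
                        t_record INTEGER, output_id INTEGER, cnt INTEGER);
    CREATE TABLE t_imp (id INTEGER, t_value INTEGER,
                        t_record INTEGER, output_id INTEGER, cnt INTEGER);
    INSERT INTO test  VALUES (1, 10, 1, 1, 5), (2, 20, 1, 1, 7);
    INSERT INTO t_imp VALUES (NULL, 10, 1, 1, 3), (NULL, 30, 1, 1, 4);
""")

# Step 1: copy the id of every already existing row into t_imp.
# Rows without a match keep id = NULL.
cur.execute("""
    UPDATE t_imp SET id = (
        SELECT test.id FROM test
        WHERE test.t_value   = t_imp.t_value
          AND test.t_record  = t_imp.t_record
          AND test.output_id = t_imp.output_id)
""")

# The original predicate can never match: NULL = x is never true.
broken = cur.execute("""
    SELECT count(*) FROM test JOIN t_imp
      ON t_imp.id IS NULL AND test.id = t_imp.id
""").fetchone()[0]

# Step 2: update existing rows, using the corrected predicate.
cur.execute("""
    UPDATE test SET cnt = cnt + (
        SELECT t_imp.cnt FROM t_imp WHERE t_imp.id = test.id)
    WHERE EXISTS (SELECT 1 FROM t_imp
                  WHERE t_imp.id IS NOT NULL AND t_imp.id = test.id)
""")
updated = cur.rowcount

# Step 3: insert the rows that found no match in step 1.
cur.execute("""
    INSERT INTO test (t_value, t_record, output_id, cnt)
    SELECT t_value, t_record, output_id, cnt FROM t_imp WHERE id IS NULL
""")
inserted = cur.rowcount
conn.commit()

rows = cur.execute("SELECT id, t_value, cnt FROM test ORDER BY id").fetchall()
print(broken, updated, inserted, rows)
```

In PostgreSQL the same steps would more likely be written with UPDATE ... FROM joins; the correlated subqueries here are only a portable way to express the same logic.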

I ran the loop again. After 30 minutes I'm at about 3-5 seconds per loop,
as long as the server isn't doing something else. Under load it's at about
10-20 seconds, with a ratio of 40% updates to 60% inserts.

> and a partial index on matching rows might help (should be tested):

>  (after the first update)
>  create index t_imp_ix on t_imp(t_value,t_record,output_id) where t_imp.id is not null.

I don't think this will help much since t_imp is scanned sequentially
anyway, so creating an index is just unneeded overhead.

--
Jochen Erwied     |   home: jochen@erwied.eu     +49-208-38800-18, FAX: -19
Sauerbruchstr. 17 |   work: joe@mbs-software.de  +49-2151-7294-24, FAX: -50
D-45470 Muelheim  | mobile: jochen.erwied@vodafone.de       +49-173-5404164
