Re: how to avoid deadlock on masive update with multiples delete

From: Merlin Moncure
Subject: Re: how to avoid deadlock on masive update with multiples delete
Date:
Msg-id: CAHyXU0yZ+qvuE_erWx86GSzTE6KTH3rtc-KEyDgJaTYM5Smx3g@mail.gmail.com
In response to: Re: how to avoid deadlock on masive update with multiples delete  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: how to avoid deadlock on masive update with multiples delete  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-performance
On Fri, Oct 5, 2012 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
>> On Friday, October 05, 2012 05:31:43 PM Tom Lane wrote:
>>> There's no guarantee that the planner won't re-sort the rows coming from
>>> the sub-select, unfortunately.
>
>> More often than not you can prevent the planner from doing that by putting a
>> OFFSET 0 in the query. Not 100% but better than nothing.
>
> No, that will accomplish exactly nothing.  The ORDER BY is already an
> optimization fence.  The problem is that of the several ways the planner
> might choose to join the subquery output to the original table, not all
> will produce the join rows in the same order as the subquery's result
> is.  For instance, when I tried his example I initially got
>
>  Delete on test  (cost=400.88..692.85 rows=18818 width=34)
>    ->  Merge Join  (cost=400.88..692.85 rows=18818 width=34)
>          Merge Cond: (test.g = x.g)
>          ->  Sort  (cost=135.34..140.19 rows=1940 width=10)
>                Sort Key: test.g
>                ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=10)
>          ->  Sort  (cost=265.53..270.38 rows=1940 width=32)
>                Sort Key: x.g
>                ->  Subquery Scan on x  (cost=135.34..159.59 rows=1940 width=32)
>                      ->  Sort  (cost=135.34..140.19 rows=1940 width=10)
>                            Sort Key: test_1.ctid
>                            ->  Seq Scan on test test_1  (cost=0.00..29.40 rows=1940 width=10)
>
> which is going to do the deletes in "g" order, not ctid order;
> and then after an ANALYZE I got
>
>  Delete on test  (cost=90.83..120.58 rows=1000 width=34)
>    ->  Hash Join  (cost=90.83..120.58 rows=1000 width=34)
>          Hash Cond: (test.g = x.g)
>          ->  Seq Scan on test  (cost=0.00..16.00 rows=1000 width=10)
>          ->  Hash  (cost=78.33..78.33 rows=1000 width=32)
>                ->  Subquery Scan on x  (cost=65.83..78.33 rows=1000 width=32)
>                      ->  Sort  (cost=65.83..68.33 rows=1000 width=10)
>                            Sort Key: test_1.ctid
>                            ->  Seq Scan on test test_1  (cost=0.00..16.00 rows=1000 width=10)
>
> which is going to do the deletes in ctid order, but that's an artifact
> of using a seqscan on the test table; the order of the subquery's output
> is irrelevant, since it got hashed.
>
>> We really need ORDER BY for DML.
>
> Meh.  That's outside the SQL standard (not only outside the letter of
> the standard, but foreign to its very conceptual model) and I don't
> think the problem really comes up that often.  Personally, if I had to
> deal with this I'd use a plpgsql function (or DO command) that does

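[Editor's note: Tom's quoted message is truncated above; a sketch of the plpgsql DO-command approach he alludes to might look like the following. The filter condition `g % 2 = 0` is a hypothetical placeholder for whatever predicate selects the rows to delete.]

```sql
-- Sketch (assumed, not Tom's actual code): iterate over ctids in
-- order and delete row by row, so every session acquires row locks
-- in the same physical order and cannot deadlock against another
-- session doing the same.
DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT ctid FROM test WHERE g % 2 = 0 ORDER BY ctid LOOP
        DELETE FROM test WHERE ctid = r.ctid;
    END LOOP;
END
$$;
```

Row-at-a-time deletion is slower than a single DELETE, but the locking order is fixed by construction rather than left to the planner.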
Can't it be forced like this (assuming it is in fact a vanilla ORDER
BY problem)?

EXPLAIN DELETE FROM test
USING (SELECT g FROM test ORDER BY ctid FOR UPDATE) x
WHERE x.g = test.g;

(emphasis on 'for update')
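[Editor's note: the idea behind the FOR UPDATE emphasis, sketched out with an assumed setup matching the table and column names in the quoted plans. In PostgreSQL, the subquery's LockRows node sits above its Sort, so rows are locked in ctid order as the subquery is scanned, before the outer join/delete runs.]

```sql
-- Hypothetical setup matching the quoted EXPLAIN output: a table
-- "test" with an integer column "g" containing duplicates.
CREATE TABLE test (g int);
INSERT INTO test SELECT i % 100 FROM generate_series(1, 1000) i;
ANALYZE test;

-- With FOR UPDATE, the subquery locks each row in ORDER BY ctid
-- order before the outer DELETE touches it, so concurrent sessions
-- take row locks in a consistent order regardless of whether the
-- planner joins back via merge join, hash join, or anything else.
DELETE FROM test
USING (SELECT g FROM test ORDER BY ctid FOR UPDATE) x
WHERE x.g = test.g;
```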

merlin

