Re: Making joins involving ctid work for the benefit of UPSERT

From: Robert Haas
Subject: Re: Making joins involving ctid work for the benefit of UPSERT
Date:
Msg-id: CA+TgmoZdW5EfRVRgkdJzXG2qun2FnSf_c1eNunO0gS+svhp6+Q@mail.gmail.com
In reply to: Re: Making joins involving ctid work for the benefit of UPSERT  (Peter Geoghegan <pg@heroku.com>)
Responses: Re: Making joins involving ctid work for the benefit of UPSERT  (Peter Geoghegan <pg@heroku.com>)
Re: Making joins involving ctid work for the benefit of UPSERT  (Bruce Momjian <bruce@momjian.us>)
List: pgsql-hackers
On Wed, Jul 23, 2014 at 7:35 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> It's certainly arguable whether you should INSERT and then turn failures
>> into an update or try to UPDATE and then turn failures into an INSERT;
>> we might even want to have both options available, though that smells
>> a little like airing too much of our dirty laundry.  But I think I
>> generally favor optimizing for the UPDATE case for more or less the
>> same reasons Kevin articulated.
>
> I don't see the connection between this and Kevin's remarks. And FWIW,
> I don't see a reason to favor inserts or updates. Fortunately, what I
> have balances both cases very well, and doesn't cause bloat. The work
> of descending the index to lock it isn't wasted if an update is
> required. My implementation decides to either insert or update at
> literally the latest possible moment.

AFAIUI, this is because your implementation uses lwlocks in a way that
Andres and I both find unacceptable.  My suspicion is that any version
of this that ends up getting committed is going to involve a risk of
bloat in cases involving retries, and I think it will be easier to
minimize bloat in an update-driven implementation.  But I suppose
that's speculative.

>> Here you seem to be suggesting that I intended to propose your existing
>> design rather than something else, which I didn't.  In this design,
>> you find the conflict (at most one) by scanning for the tuple to be
>> updated.
>
> Yes, but what if you don't see a conflict because it isn't visible to
> your snapshot, and then you insert, and only then (step 5), presumably
> with a dirty snapshot, you find a conflict? How does the loop
> terminate if that brings you back to step 1 with the same MVCC
> snapshot feeding the update?
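The non-termination scenario Peter describes above can be illustrated with a toy model (plain Python, not PostgreSQL internals; the function and variable names are purely illustrative). The UPDATE step scans with an MVCC snapshot that cannot see a concurrently committed conflicting row; the INSERT step then fails on a unique violation that only a dirty snapshot sees; looping back with the same MVCC snapshot reproduces the same outcome indefinitely:

```python
# Toy model of one iteration of an update-then-insert upsert loop.
# mvcc_visible: keys visible to our (unchanging) MVCC snapshot.
# all_committed: keys present in the unique index, as a dirty snapshot
# would see them. Both are hypothetical simplifications.

def upsert_attempt(mvcc_visible, all_committed, key):
    """Return 'updated', 'inserted', or 'conflict' for one iteration."""
    if key in mvcc_visible:      # UPDATE step: scan finds a matching row
        return "updated"
    if key in all_committed:     # INSERT step: unique violation detected
        return "conflict"        # ...which sends us back to the UPDATE step
    return "inserted"

# Another transaction committed a conflicting row after our snapshot was
# taken: the MVCC snapshot misses it, but the index (dirty snapshot) sees it.
mvcc_visible = set()
all_committed = {("k1",)}

# Every retry with the unchanged snapshot yields the same result: livelock.
outcomes = [upsert_attempt(mvcc_visible, all_committed, ("k1",))
            for _ in range(3)]
```
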

Good point.  Maybe the syntax should be something like:

UPSERT table (keycol [, keycol] ...)
    { VALUES (val [, val] ...) [, ...] | select_query }

That would address both the concern about being able to pipe multiple
tuples through it and the point you just raised.  We look for a row
that matches each given tuple on the key columns; if one is found, we
update it; if none is found, we insert.
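The per-tuple semantics sketched above (match on key columns, update on a hit, insert on a miss) can be simulated in a few lines of Python; the table-as-list-of-dicts representation is an assumption for illustration only, not anything proposed for the server:

```python
# Sketch of the proposed UPSERT semantics: for each incoming tuple, probe
# for an existing row matching on the key columns; update it if found,
# otherwise insert it. Concurrency is deliberately out of scope here.

def upsert(table, key_cols, rows):
    """table: list of dicts; key_cols: tuple of column names;
    rows: incoming tuples as dicts. Returns (updated, inserted) counts."""
    updated = inserted = 0
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        for existing in table:
            if tuple(existing[c] for c in key_cols) == key:
                existing.update(row)       # match on key columns: update
                updated += 1
                break
        else:
            table.append(dict(row))        # no match: insert
            inserted += 1
    return updated, inserted
```

This also shows how multiple tuples pipe through one command naturally: each VALUES row (or select_query row) is processed by the same probe-then-act step.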

> I agree that you want to uniquely identify each tuple. What I meant
> was, why should we not be able to upsert multiple rows in a single
> command? What's wrong with that?

Nothing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



In the pgsql-hackers list, by send date:

Previous
From: Kevin Grittner
Date:
Subject: Re: delta relations in AFTER triggers
Next
From: Robert Haas
Date:
Subject: Re: gaussian distribution pgbench -- splits v4