Re: ask for review of MERGE

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: ask for review of MERGE
Дата
Msg-id AANLkTikdP6c_Mw_1mZUtSzidXyWt2aT_5HL5scU=KHCq@mail.gmail.com
обсуждение исходный текст
Ответ на Re: ask for review of MERGE  (Greg Stark <gsstark@mit.edu>)
Ответы Re: ask for review of MERGE  (Simon Riggs <simon@2ndQuadrant.com>)
Список pgsql-hackers
On Mon, Oct 25, 2010 at 4:10 PM, Greg Stark <gsstark@mit.edu> wrote:
> On Mon, Oct 25, 2010 at 12:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Now, as Greg says, that might be what some people want, but it's
>> certainly monumentally unserializable.
>
> To be clear when I said it's what people want what I meant was that in
> the common cases it's doing exactly what people want. As opposed to
> getting closer to what people want in general but not quite hitting
> the mark in the common cases.
>
> Just as an example I think it's important that in the simplest case,
> upsert of a single record, it be 100% guaranteed to do the naive
> upsert. If two users are doing the merge of a single key at the same
> time one of them had better insert and one of them had better update
> or else users are going to be monumentally surprised.

Hmm, so let's think about that case.

The first merge comes along and finds no match so it fires the NOT
MATCHED rule, which inserts a tuple.  The second merge comes along and
finds no match, so it also fires the NOT MATCHED rule and tries to
insert a tuple.  But upon consulting the PRIMARY KEY index it finds
that an in-doubt tuple exists so it goes to sleep waiting for the
first transaction to commit or abort.  If the first transaction
commits it then decides that the jig is up and fails.  We could
(maybe) fix this by doing something similar to what EPQ does for
updates: when the first transaction commits, instead of redoing the
insert, we back up and recheck whether the new tuple would have
matched the join clause and, if so, we instead fire the MATCHED action
on the updated tuple.  If not, we fire NOT MATCHED anyway.  I'm not
sure how hard that would be, or whether it would introduce any other
nasty anomalies in more complex cases.

Alternatively, we could introduce an UPSERT or REPLACE statement
intended to handle exactly this case and leave MERGE for more complex
situations.  It's pretty easy to imagine what the coding of that
should look like: if we encounter an in-doubt tuple in we wait on its
xmin.  If the transaction aborts, we insert.  If it commits, and we're
in READ COMMITTED mode, we update it; but if we're in REPEATABLE READ
or SERIALIZABLE mode, we abort with a serialization error.  That's a
lot simpler to understand and reason about than MERGE in its full
generality.

I think it's pretty much hopeless to think that MERGE is going to work
in complex concurrent scenarios without creating serialization
anomalies, or at least rollbacks.  I think that's baked into the
nature of what the statement does.  To simulate MERGE, you need to
read from the target table and then do writes that depend on what you
read.  If you do that with the commands that are available today,
you're going to get serialization anomalies and/or rollbacks under
concurrency.  The mere fact of that logic being inside the database
rather than outside isn't going to make that go away.  Now sometimes,
as with exclusion constraints, you can play games with dirty snapshots
to get the semantics you want, but whether that's possible in a
particular case depends on the details of the operation being
performed, and here I think it can't be done.  Some operations are
*fundamentally* unserializable.

A very simple example of this is a sequence that is guaranteed not to
have gaps (a feature we've occasionally been requested to provide).
If N processes request a sequence number simultaneously, you have to
hand out a value to the first guy and wait and see whether he commits
or aborts before deciding what number to give the second guy.  That
sucks, so usually we just design our applications not to require that
sequences be gap-free.  Similarly here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


В списке pgsql-hackers по дате отправления:

Предыдущее
От: James Cloos
Дата:
Сообщение: Re: Floating-point timestamps versus Range Types
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Floating-point timestamps versus Range Types