Re: MERGE vs REPLACE

From: Bruce Momjian
Subject: Re: MERGE vs REPLACE
Msg-id: 200511201726.jAKHQI007897@candle.pha.pa.us
In reply to: Re: MERGE vs REPLACE  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Oh, good point.  I was thinking just about concurrent MERGEs.  However,
> > it is more complicated than that.  By definition you cannot see
> > changes from other transactions while your statement is being run (even
> > if you increment CommandCounter), so to be atomic, you would still see
> > the row even though some other transaction had deleted it.
> 
> We would have to use the same semantics we use now for read-committed
> UPDATE, that is look at the latest version of the row even though this
> would not normally be visible to the transaction's snapshot.
> 
> In the case of a serializable transaction, no doubt we should fail if
> any concurrent change actually happens.

I have some pseudocode to explain what we want for this feature,
whatever syntax we choose:
    Start
        Check unique index
        Found
            lock row for update
            if zero rows, return to start
            if more than one row, fail
            update row
        Notfound
            create savepoint
            insert row into heap
            lock index page
            if conflicting index entry, abort savepoint, return to start
            add index entry
            unlock index page
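
In rough C terms the loop might look like this sketch (all of the
helper names, merge_find_match() and friends, are invented for
illustration; they are not existing backend functions):

    for (;;)
    {
        /* probe the unique index for an existing match */
        int     nmatch = merge_find_match(rel, key, &oldtup);

        if (nmatch > 1)
            elog(ERROR, "unique lookup matched more than one row");

        if (nmatch == 1)
        {
            /* Found: lock it; if it was deleted meanwhile, retry */
            if (!merge_lock_row(rel, &oldtup))
                continue;
            merge_update_row(rel, &oldtup, newtup);
            break;
        }

        /*
         * Notfound: attempt the insert; a unique conflict means
         * someone beat us to it, so go back and try the update.
         */
        if (merge_try_insert(rel, newtup))
            break;
    }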
 

While the "notfound" case might look strange, we actually use this exact
method for inserts now, see ExecInsert() and _bt_doinsert(). 
Particularly see this comment in the second function:
    /*
     * If we're not allowing duplicates, make sure the key isn't already in
     * the index.
     *
     * NOTE: obviously, _bt_check_unique can only detect keys that are
     * already in the index; so it cannot defend against concurrent
     * insertions of the same key.  We protect against that by means of
     * holding a write lock on the target page.  Any other would-be
     * inserter of the same key must acquire a write lock on the same
     * target page, so only one would-be inserter can be making the check
     * at one time.  Furthermore, once we are past the check we hold write
     * locks continuously until we have performed our insertion, so no
     * later inserter can fail to see our insertion.  (This requires some
     * care in _bt_insertonpg.)
     *
     * If we must wait for another xact, we release the lock while
     * waiting, and then must start over completely.
     */
 

Here is the unique check error from _bt_check_unique():
    ereport(ERROR,
            (errcode(ERRCODE_UNIQUE_VIOLATION),
             errmsg("duplicate key violates unique constraint \"%s\"",
                    RelationGetRelationName(rel))));
 

I think the problem here is that it is going to longjmp() back to
postgres.c (and out of your code loop).  While we have savepoints, I
think they only work when coming from client applications, not from
inside our own code.  Ideally you would like to be able to say:

    savepoint();
    func();
    rollback_to_savepoint();

but you can't, so I think you are going to have to factor out that
unique-error callback and return a failure code to the caller.  I
suppose some boolean flag needs to be added to _bt_doinsert(), but that
is called via a function pointer for the index type, so you are going to
have to update the insert function signatures for all access methods.
The good news is that only btree supports unique indexes, according to
the documentation ("Only B-tree currently supports unique indexes"), so
for the other access methods the extra parameter is just ignored.
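
A minimal sketch of that factoring, with an invented errorOK parameter
(the real _bt_doinsert() signature differs, and bt_has_duplicate() is
likewise invented; this just shows the shape):

    /*
     * Sketch: return false instead of throwing the unique-violation
     * error when the caller (MERGE) wants to retry.
     */
    static bool
    bt_doinsert_sketch(Relation rel, IndexTuple itup,
                       bool index_is_unique, Relation heapRel,
                       bool errorOK)
    {
        if (index_is_unique && bt_has_duplicate(rel, itup, heapRel))
        {
            if (!errorOK)
                ereport(ERROR,
                        (errcode(ERRCODE_UNIQUE_VIOLATION),
                         errmsg("duplicate key violates unique constraint \"%s\"",
                                RelationGetRelationName(rel))));
            return false;       /* tell MERGE to restart its loop */
        }

        /* ... perform the actual index insertion ... */
        return true;
    }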

Another issue is multiple unique indexes.  What if the first unique
index matches one row, but a different row matches the second uniquely
indexed column?  Do we fail because the unique checks do not identify
exactly one row?

Or the _new_ value for the second indexed column might conflict with the
second unique index.  In that case the MERGE/REPLACE should fail.  The
UPDATE block will handle this on its own, but the INSERT block will need
to check for that and really error out, rather than return to the
caller, so the loop in ExecInsertIndexTuples() has to restart on a
unique failure _only_ for the first index check, not the subsequent
ones.
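
Roughly, the index loop would need to distinguish the lookup index from
the rest.  A sketch (index_insert_noerror() and the lookup-index test
are invented, not existing executor code):

    for (i = 0; i < numIndices; i++)
    {
        /* only the index used for the MERGE lookup may fail softly */
        bool    errorOK = (i == merge_lookup_index);

        if (!index_insert_noerror(relationDescs[i], values, isnull,
                                  tupleid, heapRelation, errorOK))
            goto restart;       /* unique conflict on the lookup index */
    }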

One simplification would be to allow MERGE/REPLACE only on a table that
has a single unique index.
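
Enforcing that restriction would be easy; something like this sketch at
MERGE/REPLACE startup (the function itself is invented, and the
index_open()/index_close() lock-mode arguments follow the current
backend API):

    static void
    require_single_unique_index(Relation rel)
    {
        List       *indexoidlist = RelationGetIndexList(rel);
        ListCell   *lc;
        int         nunique = 0;

        foreach(lc, indexoidlist)
        {
            Relation    ind = index_open(lfirst_oid(lc), AccessShareLock);

            if (ind->rd_index->indisunique)
                nunique++;
            index_close(ind, AccessShareLock);
        }
        list_free(indexoidlist);

        if (nunique != 1)
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("MERGE requires exactly one unique index on \"%s\"",
                            RelationGetRelationName(rel))));
    }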

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

