Обсуждение: Duplicate-key-detection failure case found in btree

Поиск
Список
Период
Сортировка

Duplicate-key-detection failure case found in btree

От
Tom Lane
Дата:
I have found a case in nbtree where it is possible for two concurrent
transactions to insert the same key value in a unique index :-(

The reason is that _bt_check_unique can (of course) only detect keys
that are already in the index.  To prevent concurrent insertion of
duplicate keys, we have to rely on locking.  Two would-be inserters
of the same key must lock the same target page in _bt_doinsert, so
they cannot run the check at the same time.  The second to obtain
the lock will see the first one's already-inserted key.

However, suppose that the index contains many existing instances
of the same key (presumably pointing at deleted-but-not-yet-vacuumed
tuples).  If the existing instances span several index pages, it is
likely that _bt_insertonpg will decide to insert the new key on one
of the pages to the right of the original target key.  As presently
coded, _bt_insertonpg drops the write lock it has before acquiring
write lock on the next page to the right.  This creates a window
wherein another process could perform _bt_check_unique, fail to
find any non-dead instances of the key, and decide that it's okay
to insert the same key.

The fix is to acquire the next page's write lock before we drop the
current page's.  This is deadlock-free since no writer ever tries
to chain write-locks to the left (in fact the same thing is done in
the page split logic).

I am not convinced that this explains any of the field reports of
duplicate rows that we've seen, but it's clearly a bug.  Will commit
the fix shortly.
        regards, tom lane


Re: Duplicate-key-detection failure case found in btree

От
Bruce Momjian
Дата:
> The fix is to acquire the next page's write lock before we drop the
> current page's.  This is deadlock-free since no writer ever tries
> to chain write-locks to the left (in fact the same thing is done in
> the page split logic).
> 
> I am not convinced that this explains any of the field reports of
> duplicate rows that we've seen, but it's clearly a bug.  Will commit
> the fix shortly.

Sounds good to me.  Good find.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Duplicate-key-detection failure case found in btree

От
Jan Wieck
Дата:
Bruce Momjian wrote:
> > The fix is to acquire the next page's write lock before we drop the
> > current page's.  This is deadlock-free since no writer ever tries
> > to chain write-locks to the left (in fact the same thing is done in
> > the page split logic).
> >
> > I am not convinced that this explains any of the field reports of
> > duplicate rows that we've seen, but it's clearly a bug.  Will commit
> > the fix shortly.
>
> Sounds good to me.  Good find.
   It  always scares me that bugs like this can live untriggered   for multiple releases.  What  else  is  lingering
under the   surface ...
 
   And  I  absolutely  agree  with  Tom,  there is no connection   between this bug and the duplicate rows reported.
Good catch   anyway.
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com



Re: Duplicate-key-detection failure case found in btree

От
Tom Lane
Дата:
Jan Wieck <janwieck@yahoo.com> writes:
>     It  always scares me that bugs like this can live untriggered
>     for multiple releases.  What  else  is  lingering  under  the
>     surface ...

FWIW, I don't think that this bug did survive for multiple releases
(though it very nearly made it into 7.2).  The present form of
bt_check_unique and bt_insertonpg was new code in 7.1, replacing
the BTP_CHAIN method of handling identical keys.  That code had a
lot of bugs of its own, but I don't think it had this one.  I will
take the blame for introducing this bug into 7.1 ...
        regards, tom lane