Re: Pathological performance when inserting many NULLs into a unique index

Поиск

Список

Период

Сортировка

От	Peter Geoghegan
Тема	Re: Pathological performance when inserting many NULLs into a unique index
Дата	19 апреля 2019 г. 03:33:55
Msg-id	CAH2-Wz==MsfGnLb4m_JLrPSwuL1x0K=vB1_n0msT92cWOq51zQ@mail.gmail.com обсуждение исходный текст
Ответ на	Pathological performance when inserting many NULLs into a unique index (Peter Geoghegan <pg@bowt.ie>)
Список	pgsql-hackers

Дерево обсуждения

On Wed, Apr 17, 2019 at 7:37 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Tentatively, I think that the fix here is to not "itup_key->scantid =
> NULL" when a NULL value is involved (i.e. don't "itup_key->scantid =
> NULL" when we see IndexTupleHasNulls(itup) within _bt_doinsert()). We
> may also want to avoid calling _bt_check_unique() entirely in this
> situation.

> I'll create an open item for this, and begin work on a patch tomorrow.

I came up with the attached patch, which effectively treats a unique
index insertion as if the index was not unique at all in the case
where new tuple is IndexTupleHasNulls() within _bt_doinsert(). This is
correct because inserting a new tuple with a NULL value can neither
contribute to somebody else's duplicate violation, nor have a
duplicate violation error of its own. I need to test this some more,
though I am fairly confident that I have the right idea.

We didn't have this problem before my v12 work due to the "getting
tired" strategy, but that's not the full story. We also didn't (and
don't) move right within _bt_check_unique() when
IndexTupleHasNulls(itup), because _bt_isequal() treats NULL values as
not equal to any value, including even NULL -- this meant that there
was no useless search to the right within _bt_check_unique().
Actually, the overall effect of how _bt_isequal() is used is that
_bt_check_unique() does *no* useful work at all with a NULL scankey.
It's far simpler to remove responsibility for new tuples/scankeys with
NULL values from _bt_check_unique() entirely, which is what I propose
to do with this patch.

The patch actually ends up removing quite a lot more code than it
adds, because it removes _bt_isequal() outright. I always thought that
nbtinsert.c dealt with NULL values in unique indexes at the wrong
level, and I'm glad to see _bt_isequal() go. Vadim's accompanying 1997
comment about "not using _bt_compare in real comparison" seems
confused to me. The new _bt_check_unique() may still need to compare
the scankey to some existing, adjacent tuple with a NULL value, but
_bt_compare() is perfectly capable of recognizing that that adjacent
tuple shouldn't be considered equal. IOW, _bt_compare() is merely
incapable of behaving as if "NULL != NULL", which is a bad reason for
keeping _bt_isequal() around.

--
Peter Geoghegan

Вложения

0001-Prevent-O-N-2-unique-index-insertion-edge-case.patch

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Andres Freund
Дата: 19 апреля 2019 г., 03:30:20
Сообщение: Comments for lossy ORDER BY are lacking

Следующее

От: Andres Freund
Дата: 19 апреля 2019 г., 03:37:00
Сообщение: Re: Comments for lossy ORDER BY are lacking

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Pathological performance when inserting many NULLs into a unique index

Вложения

Предыдущее

Следующее