Re: Unexpected sequential scan on an indexed column

From	Eddy Escardo-Raffo
Subject	Re: Unexpected sequential scan on an indexed column
Date	15 November 2009, 22:59:45
Msg-id	4eaa4a5e0911151559y1cc43ee8md2d7fe14dfa6f2b1@mail.gmail.com
In reply to	Re: Unexpected sequential scan on an indexed column (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Unexpected sequential scan on an indexed column
List	pgsql-performance

Thanks, Tom. I had already ruled out a data type mismatch, which was your first guess, but I was wondering whether the lopsided distribution of location values would lead the planner to make a decision that is good on average but bad for this particular query, as you point out in your second guess.

I'll try populating the test users with a more evenly distributed location field, which will be more realistic anyway, and see if that works out better.

BTW, the -1 is not really a dummy value; it is just the value we have been using in tests as the "fake test location ID". I have only just started measuring performance for my application. So far I had measured with every user in the same default location, and things seemed to be going well. When I switched a couple of users to a different location to see what would happen, performance dropped significantly.

(Even more detail: my queries also limit results to approximately 10 rows, so the DB quickly found 10 rows matching location -1, but it took a while to discover that there were no more than 2 rows with the other value.)
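The LIMIT effect described above can be illustrated with a small Python sketch. It is a toy model of a sequential scan that stops once enough matches are found, not PostgreSQL's executor. The function name and the positions of the two 76543 rows are assumptions made up for illustration.

```python
def rows_examined_by_seqscan(table, wanted, limit):
    """Walk the rows in order and stop as soon as `limit` matches are found.

    Returns (rows examined, matches found).
    """
    examined = 0
    found = 0
    for value in table:
        examined += 1
        if value == wanted:
            found += 1
            if found == limit:
                break
    return examined, found


# 1 million rows, all with location -1 except two (positions are hypothetical).
N = 1_000_000
table = [-1] * N
table[123_456] = 76543
table[987_654] = 76543

common = rows_examined_by_seqscan(table, -1, 10)
rare = rows_examined_by_seqscan(table, 76543, 10)
print("location -1:   ", common)
print("location 76543:", rare)
```

With the common value the scan can stop after a handful of rows. With the rare value the limit is never reached, so every row has to be read before the scan can conclude there are only 2 matches.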

Thanks!

Eddy

On Sun, Nov 15, 2009 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Eddy Escardo-Raffo <eescardo@kikini.com> writes:
> The table used in this query is called "users", and it has columns "userid"
> (primary key) and "location".
> The "location" column is indexed.
> The users table has 1 million rows, and all rows have integer typed value
> '-1' for "location" column, except for 2 rows that have the integer value
> '76543'.

Oh, after poking at it a bit more, I realize the problem: the planner
doesn't want to use an indexscan because it assumes there's a
significant probability that the search will be for -1 (in which case
the indexscan would be slower than a seqscan, as indeed your results
prove). Even though it could know in this particular case that the
comparison value isn't -1, I doubt that teaching it that would help your
real queries where it will probably be impossible to determine the
comparison values in advance.

I would suggest considering using NULL rather than inventing a dummy
value for unknown locations. The estimation heuristics will play a
lot nicer with that choice.

regards, tom lane
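A rough way to see why NULL helps is the Python sketch below. It is a deliberately simplified model of estimating `location = <value unknown at plan time>`, loosely based on the per-column statistics PostgreSQL exposes in pg_stats (null_frac, n_distinct). The function name and the even-spread formula are illustrative assumptions; the real planner's estimation is more involved.

```python
def estimated_rows_unknown_value(total_rows, null_frac, n_distinct):
    """Estimate rows matching `col = ?` when the value is not known in advance.

    Simplifying assumption: the non-NULL rows are spread evenly over the
    distinct values, and NULLs never match an equality comparison.
    """
    return total_rows * (1.0 - null_frac) / max(n_distinct, 1)


TOTAL = 1_000_000

# Case 1: -1 stored as a real value, so there are 2 distinct values, no NULLs.
with_dummy = estimated_rows_unknown_value(TOTAL, null_frac=0.0, n_distinct=2)

# Case 2: unknown locations stored as NULL, leaving 1 distinct non-NULL value.
with_null = estimated_rows_unknown_value(
    TOTAL, null_frac=(TOTAL - 2) / TOTAL, n_distinct=1
)

print(f"dummy value -1: ~{with_dummy:.0f} rows expected")
print(f"NULL instead:   ~{with_null:.0f} rows expected")
```

Under this model the dummy value makes an arbitrary lookup appear to match about half the table, which favors a seqscan. With NULLs, almost the whole table is excluded from any equality match, so the estimate drops to a couple of rows and an indexscan looks cheap.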

