Re: [SQL] "SELECT IN" Still Broken in 7.4b

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [SQL] "SELECT IN" Still Broken in 7.4b
Дата
Msg-id 25062.1061498540@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: [SQL] "SELECT IN" Still Broken in 7.4b  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
Re: [SQL] "SELECT IN" Still Broken in 7.4b  (Manfred Koizar <mkoi-pg@aon.at>)
Re: [SQL] "SELECT IN" Still Broken in 7.4b  (Josh Berkus <josh@agliodbs.com>)
Re: [SQL] "SELECT IN" Still Broken in 7.4b  (Joe Conway <mail@joeconway.com>)
Re: [SQL] "SELECT IN" Still Broken in 7.4b  ("Shridhar Daithankar" <shridhar_daithankar@persistent.co.in>)
Список pgsql-hackers
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> Within the scope of the new hashed IN stuff I believe so in at least some
> cases.  I have a few million row table of integers where searching for
> values IN (~10000 values) takes longer than creating the temp table,
> copying into it and doing the in subquery.

I did some profiling and soon realized that the main problem is the
executor's trick for not returning the same row multiple times in a
multiple-indexscan plan node.  The point is that givenWHERE a = 1 OR b = 1
you could create a plan that first indexscans on a, then indexscans on
b --- but you mustn't return any tuples in the second scan that you
already returned in the first.  IndexNext solves this by evaluating the
prior-scan index conditions to see if they are true.  Which is okay if
there aren't too many of them.  But when you have an N-element IN list
this means you are going to do O(N^2) index expression evaluations.
In the 10000-element IN-list test case, ExecQual gets called almost
50 million times :-(

I'm toying with the notion of ripping out that logic and instead
building an in-memory hashtable of already-returned TIDs.  This could
theoretically run out of memory if the multiple indexscan returns enough
tuples, but I think in practice that wouldn't happen because the planner
wouldn't choose an indexscan when very large numbers of tuples are being
selected.

Comments?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ian Barwick
Дата:
Сообщение: Re: [GENERAL] [pgsql-advocacy] Need concrete "Why Postgres not MySQL" bullet list
Следующее
От: Manfred Koizar
Дата:
Сообщение: Decent VACUUM (was: Buglist)