Re: why "WHERE uid NOT IN" is so slow, and EXCEPT in the same situtation is fast?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: why "WHERE uid NOT IN" is so slow, and EXCEPT in the same situtation is fast?
Дата
Msg-id 28448.1217476507@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: why "WHERE uid NOT IN" is so slow, and EXCEPT in the same situtation is fast?  (Miernik <public@public.miernik.name>)
Список pgsql-performance
Miernik <public@public.miernik.name> writes:
> On Wed, Jul 30, 2008 at 11:08:06PM -0400, Tom Lane wrote:
>> Hmm, what have you got work_mem set to?  The first one would likely
>> have been a lot faster if it had hashed the subplan; which I'd have
>> thought would happen with only 80K rows in the subplan result,
>> except it didn't.

> work_mem = 1024kB

Try increasing that ... I don't recall the exact per-row overhead
but I'm quite sure it's more than 8 bytes.  Ten times that would
likely get you to a hash subquery plan.

> The machine has 48 MB total RAM and is a Xen host.

48MB is really not a sane amount of memory to run a modern database
in.  Maybe you could make it go with sqlite or some other tiny-footprint
DBMS, but Postgres isn't focused on that case.

>> The queries are in fact not exactly equivalent, because EXCEPT
>> involves some duplicate-elimination behavior that won't happen
>> in the NOT IN formulation.  So I don't apologize for your having
>> gotten different plans.

> But if use EXCEPT ALL?

Fraid not, EXCEPT ALL has yet other rules for how it deals with
duplicates.

>> Another issue is that the NOT IN will probably not do what you
>> expected if the subquery yields any NULLs.

> In this specific query I think it is not possible for the subquery to
> have NULLs,

Okay, just wanted to point out a common gotcha.

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: Miernik
Дата:
Сообщение: Re: why "WHERE uid NOT IN" is so slow, and EXCEPT in the same situtation is fast?
Следующее
От: Miernik
Дата:
Сообщение: Re: how to fix problem then when two queries run at the same time, it takes longer to complete then if run in sequence