Re: -HEAD planner issue wrt hash_joins on dbt3 ?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: -HEAD planner issue wrt hash_joins on dbt3 ?
Дата
Msg-id 570.1159307608@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: -HEAD planner issue wrt hash_joins on dbt3 ?  (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>)
Ответы Re: -HEAD planner issue wrt hash_joins on dbt3 ?  (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>)
Список pgsql-hackers
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> Tom Lane wrote:
>> It evidently thinks that most of the rows in the join of part and
>> partsupp won't have any matching rows in lineitem, whereas on average
>> there are about 7 matching rows apiece.  So that's totally wacko, and
>> it's not immediately obvious why.  Could we see the pg_stats entries for
>> part.p_partkey, partsupp.ps_partkey, partsupp.ps_suppkey,
>> lineitem.l_partkey, lineitem.l_suppkey?

> http://www.kaltenbrunner.cc/files/dbt3_beta1_statistics.txt

OK, so we have 2 million parts and 100000 suppliers, and ANALYZE doesn't
seem to have been too far off at estimating either of those numbers.
I think the problem is that there are not very many suppliers for any
particular part, and thus the condition "part match AND supplier match"
is really not much more selective than "part match" alone.  The planner
is supposing that their selectivities are independent, which they
aren't.

Offhand I don't see any good way to fix this without multi-column
statistics, which is something that's certainly not happening for 8.2 :-(
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: Constant changes (Re-Build)
Следующее
От: Tom Lane
Дата:
Сообщение: Isn't strdup.h useless code?