Re: Query optimizer 8.0.1 (and 8.0)

Поиск
Список
Период
Сортировка
От Ron Mayer
Тема Re: Query optimizer 8.0.1 (and 8.0)
Дата
Msg-id 42082EE6.3070401@cheapcomplexdevices.com
обсуждение исходный текст
Ответ на Re: Query optimizer 8.0.1 (and 8.0)  (pgsql@mohawksoft.com)
Ответы Re: Query optimizer 8.0.1 (and 8.0)  (pgsql@mohawksoft.com)
Список pgsql-hackers
pgsql@mohawksoft.com wrote:
> 
> In this case, the behavior observed could be changed by altering the
> sample size for a table. I submit that an arbitrary fixed sample size is
> not a good base for the analyzer, but that the sample size should be based
> on the size of the table or some calculation of its deviation.
> 
  Mark,

Do you have any evidence that the Sample Size had anything to do
with the performance problem you're seeing?

I also do a lot with the complete Census/TIGER database.

Every problem I have with the optimizer comes down to the
fact that the data is loaded (and ordered on disk) by
State/County FIPS codes, and then queried by zip-code
or by city name.  Like this:
    Alabama    36101 [hundreds of pages with zip's in 36***]    Alaska     99686 [hundreds of pages with zip's in
9****]   Arizona    85701 [hundreds of pages with zip's in 855**]
 

Note that the zip codes are *NOT* sequential.

The "correlation" statistic sees that the Zip codes are not
sequential; so it makes the *HORRIBLE* assumption that they
are scattered randomly across the disk.

In reality, even though there's no total ordering of the
zip codes; any given zip code only exists on a couple
disk pages; so index scans would be the right choice.


But the single correlation parameter is not sufficient
to let the optimizer known this.

No matter how large a sample size you choose, ANALYZE
will correctly see that Zip codes and State FIPS codes
are non-correlated, and the optimizer will overestimate
the # of pages an index scan will need.
  Ron

PS: I pointed out workarounds in my earlier posting
in this thread.  Yes, I'm using the same TIGER data
you are.




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Martin Pitt
Дата:
Сообщение: Re: libpq API incompatibility between 7.4 and 8.0
Следующее
От: Tom Lane
Дата:
Сообщение: Re: "external indices" ...