Re: Query optimizer 8.0.1 (and 8.0)

Поиск
Список
Период
Сортировка
От pgsql@mohawksoft.com
Тема Re: Query optimizer 8.0.1 (and 8.0)
Дата
Msg-id 16718.24.91.171.78.1108390270.squirrel@mail.mohawksoft.com
обсуждение исходный текст
Ответ на Re: Query optimizer 8.0.1 (and 8.0)  (Oleg Bartunov <oleg@sai.msu.su>)
Ответы Re: Query optimizer 8.0.1 (and 8.0)  (Ron Mayer <rm_pg@cheapcomplexdevices.com>)
Список pgsql-hackers
> Probably off-topic, but I think it's worth to see what astronomers are
> doing with their very big spatial databases. For example, we are working
> with more than 500,000,000 rows catalog and we use some special
> transformation
> of coordinates to integer numbers with preserving objects closeness.
> I hope we could show postgresql is good enough to be used in astronomy
> for very big catalogs. Currently, MS SQL is in use.
> See http://www.sdss.jhu.edu/htm/ for details. We use another technique.

You know, I don't think a lot of people "get" the issues I was describing,
or maybe they don't believe it, I don't know, but, I think that it would
be a useful contrib project to create an 'analyze_special('table',
'column', 'method')' function that does a better job at calculating the
stats for table that contain multiple trend waveforms. A separate function
will probably work well as the trends within the data probably only apply
to specific rows.

It's interesting, because I don't think it needs to calculate a perfect
representation of the data so much as better clue to its nature for the
optimizer.

When I get the time (or can get someone to pay me to do it) I'm going to
try it.

>
>
>      Oleg
> On Wed, 9 Feb 2005 pgsql@mohawksoft.com wrote:
>
>> I wrote a message caled "One Big trend vs multiple smaller trends in
>> table
>> statistics" that, I think, explains what we've been seeing.
>>
>>
>>> pgsql@mohawksoft.com wrote:
>>>>
>>>> In this case, the behavior observed could be changed by altering the
>>>> sample size for a table. I submit that an arbitrary fixed sample size
>>>> is
>>>> not a good base for the analyzer, but that the sample size should be
>>>> based
>>>> on the size of the table or some calculation of its deviation.
>>>>
>>>
>>>    Mark,
>>>
>>> Do you have any evidence that the Sample Size had anything to do
>>> with the performance problem you're seeing?
>>
>> Sample size is only a bandaid for the issue, however, more samples
>> always
>> provide more information.
>>
>>
>>>
>>> I also do a lot with the complete Census/TIGER database.
>>>
>>> Every problem I have with the optimizer comes down to the
>>> fact that the data is loaded (and ordered on disk) by
>>> State/County FIPS codes, and then queried by zip-code
>>> or by city name.  Like this:
>>>
>>>      Alabama    36101 [hundreds of pages with zip's in 36***]
>>>      Alaska     99686 [hundreds of pages with zip's in 9****]
>>>      Arizona    85701 [hundreds of pages with zip's in 855**]
>>>
>>> Note that the zip codes are *NOT* sequential.
>>
>> Again, read "One Big Trend..." and let me know what you think. I think
>> it
>> describes exactly the problem that we see.
>>
>> For now, the solution that works for me is to seriously up the value of
>> "targrows" in analyze.c. It makes it take longer, and while the stats
>> are
>> not "correct" because they are not designed to detect these sorts of
>> patterns, a larger sample allows them to be "less wrong" enough to give
>> a
>> better hint to the planner.
>>
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 7: don't forget to increase your free space map settings
>>
>
>      Regards,
>          Oleg
> _____________________________________________________________
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Sibtay Abbas
Дата:
Сообщение: Re: getting oid of function
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Schema name of function