Re: slower merge join on sorted data chosen over

From	Jim C. Nasby
Subject	Re: slower merge join on sorted data chosen over
Date	26 October 2005, 18:06:39
Msg-id	20051026210635.GG16682@pervasive.com
In reply to	Re: slower merge join on sorted data chosen over (Simon Riggs <simon@2ndquadrant.com>)
List	pgsql-hackers

On Mon, Oct 17, 2005 at 09:30:24PM +0100, Simon Riggs wrote:
> On Mon, 2005-10-17 at 14:55 -0500, Jim C. Nasby wrote:
> > On Tue, Oct 11, 2005 at 10:58:58AM +0100, Simon Riggs wrote:
> > > On Mon, 2005-10-10 at 15:14 -0500, Kevin Grittner wrote:
> > > > We are looking at doing much more with PostgreSQL over the
> > > > next two years, and it seems likely that this issue will come up
> > > > again where it is more of a problem.  It sounded like there was
> > > > some agreement on HOW this was to be fixed, yet I don't see
> > > > any mention of doing it in the TODO list.  
> > > 
> > > > Is there any sort of
> > > > estimate for how much programming work would be involved?
> > > 
> > > The main work here is actually performance testing, not programming. The
> > > cost model is built around an understanding of the timings and costs
> > > involved in the execution.
> > > 
> > > Once we have timings to cover a sufficiently large range of cases, we
> > > can derive the cost model. Once derived, we can program it. Discussing
> > > improvements to the cost model without test results is never likely to
> > > convince people. Everybody knows the cost models can be improved, the
> > > only question is in what cases? and in what ways?
> > > 
> > > So deriving the cost model needs lots of trustworthy test results that
> > > can be assessed and discussed, so we know how to improve things. [...and
> > > I don't mean 5 minutes with pg_bench...]
> 
> ...
> 
> > DBT seems to be a reasonable test database 
> 
> I was discussing finding the cost equations to use within the optimizer
> based upon a series of exploratory tests using varying data. That is
> different to using the same database with varying parameters. Both sound
> interesting, but it is the former that, IMHO, would be the more
> important.

True, although that doesn't necessarily mean you can't use the same data
generation. For the testing I was doing before, I was just varying
correlation using CLUSTER (or selecting from different fields with
different correlations).
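
As a rough illustration of that idea (not the actual test setup, which isn't
shown in this thread), here is a minimal Python sketch that builds columns
whose physical row order matches their sort order to differing degrees, then
measures it. The names make_column and physical_correlation are hypothetical;
the measurement is meant to resemble what PostgreSQL reports in
pg_stats.correlation, under the assumption that values are distinct.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def make_column(n, shuffle_fraction, seed=0):
    """Hypothetical helper: return the values 0..n-1 in 'physical' order.

    shuffle_fraction = 0.0 leaves the column fully sorted (like a freshly
    clustered table); 1.0 permutes every row at random.
    """
    rng = random.Random(seed)
    values = list(range(n))
    k = int(n * shuffle_fraction)
    positions = rng.sample(range(n), k)   # rows to disturb
    picked = [values[p] for p in positions]
    rng.shuffle(picked)                   # permute only those rows
    for p, v in zip(positions, picked):
        values[p] = v
    return values


def physical_correlation(values):
    """Correlation between physical position and value.

    Values are distinct integers 0..n-1, so each value is its own rank.
    """
    return pearson(list(range(len(values))), values)


if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        col = make_column(10_000, fraction)
        print(f"shuffled {fraction:4.2f} -> correlation {physical_correlation(col):+.3f}")
```

Loading several such columns into one table would give "different fields
with different correlations" from a single data generation run.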
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
