Re: Progress on fast path sorting, btree index creation time

From Bruce Momjian
Subject Re: Progress on fast path sorting, btree index creation time
Msg-id 20120206211907.GG19450@momjian.us
In reply to Re: Progress on fast path sorting, btree index creation time  (Robert Haas <robertmhaas@gmail.com>)
Replies Re: Progress on fast path sorting, btree index creation time  (Bruce Momjian <bruce@momjian.us>)
Re: Progress on fast path sorting, btree index creation time  (Peter Geoghegan <peter@2ndquadrant.com>)
Re: Progress on fast path sorting, btree index creation time  ("Jim \"Decibel!\" Nasby" <decibel@decibel.org>)
List pgsql-hackers
On Fri, Jan 27, 2012 at 09:37:37AM -0500, Robert Haas wrote:
> On Fri, Jan 27, 2012 at 9:27 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> > Well, I don't think it's all that subjective - it's more the case that
> > it is just difficult, or it gets that way as you consider more
> > specialisations.
> 
> Sure it's subjective.  Two well-meaning people could have different
> opinions without either of them being "wrong".  If you do a lot of
> small, in-memory sorts, more of this stuff is going to seem worthwhile
> than if you don't.
> 
> > As for what types/specialisations may not make the cut, I'm
> > increasingly convinced that floats (in the following order: float4,
> > float8) should be the first to go. Aside from the fact that we cannot
> > use their specialisations for anything like dates and timestamps,
> > floats are just way less useful than integers in the context of
> > database applications, or at least those that I've been involved with.
> > As important as floats are in the broad context of computing, it's
> > usually only acceptable to store data in a database as floats within
> > scientific applications, and only then when their limitations are
> > well-understood and acceptable. I think we've all heard anecdotes at
> > one time or another, involving their limitations not being well
> > understood.
> 
> While we're waiting for anyone else to weigh in with an opinion on the
> right place to draw the line here, do you want to post an updated
> patch with the changes previously discussed?

Well, I think we have to ask not only how many people are using
float4/8, but how many people are sorting or creating indexes on them.
I think it would be few, so perhaps the float specialisations should be
eliminated.

Peter Geoghegan obviously has done some serious work in improving
sorting, and worked well with the community process.  He has done enough
analysis that I am hard-pressed to see how we would get similar
improvement using a different method, so I think it comes down to
whether we want the 28% speedup by adding 55k (1%) to the binary.

I think Peter has shown us how to get that, and what it will cost --- we
just need to decide now whether it is worth it.  What I am saying is
there probably isn't a cheaper way to get that speedup, either now or in
the next few years.  (COPY might need similar help for speedups.)
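
To make the trade-off concrete, here is a minimal sketch of the general idea of a type-specialised sort. It is not Peter's patch and not PostgreSQL code: the function names are hypothetical, and insertion sort stands in for the real sort routine purely to keep the example short. The generic version pays for an indirect call on every comparison; the specialised version compares inline, at the cost of one extra copy of the sort loop per type in the binary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic path: every comparison is an indirect call through cmp. */
typedef int (*cmp_fn) (const void *a, const void *b);

/* Comparator for int32 values, used only by the generic path. */
int
cmp_int32(const void *a, const void *b)
{
    int32_t x = *(const int32_t *) a;
    int32_t y = *(const int32_t *) b;

    return (x > y) - (x < y);
}

/* Hypothetical generic sort; assumes size <= 64 for this sketch. */
void
generic_insertion_sort(void *base, size_t n, size_t size, cmp_fn cmp)
{
    char   *arr = base;
    char    tmp[64];

    for (size_t i = 1; i < n; i++)
    {
        size_t  j = i;

        memcpy(tmp, arr + i * size, size);
        while (j > 0 && cmp(arr + (j - 1) * size, tmp) > 0)
        {
            memcpy(arr + j * size, arr + (j - 1) * size, size);
            j--;
        }
        memcpy(arr + j * size, tmp, size);
    }
}

/*
 * Hypothetical specialised sort: same algorithm, but the element type is
 * fixed, so the comparison and the element moves compile to inline code.
 */
void
int32_insertion_sort(int32_t *arr, size_t n)
{
    for (size_t i = 1; i < n; i++)
    {
        int32_t key = arr[i];
        size_t  j = i;

        while (j > 0 && arr[j - 1] > key)
        {
            arr[j] = arr[j - 1];
            j--;
        }
        arr[j] = key;
    }
}
```

Both functions produce the same ordering; each additional specialised type adds another copy of the loop, which is where the extra binary size comes from.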

I believe this is a big win and well worth the increased binary size
because the speedup is significant, and because it is of general
usefulness for a wide range of queries.  Either of these would be enough
to justify the additional 1% size, but both make it an easy decision for
me.

FYI, I believe COPY needs similar optimizations; we have gotten repeated
complaints about its performance and this method of optimization might
also be our only option.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

