Re: [HACKERS] A Better External Sort?

Поиск
Список
Период
Сортировка
От Dann Corbit
Тема Re: [HACKERS] A Better External Sort?
Дата
Msg-id D425483C2C5C9F49B5B7A41F8944154757D148@postal.corporate.connx.com
обсуждение исходный текст
Ответ на [HACKERS] A Better External Sort?  (Ron Peacetree <rjpeace@earthlink.net>)
Список pgsql-performance
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Friday, September 30, 2005 11:02 PM
> To: Jeffrey W. Baker
> Cc: Luke Lonergan; Josh Berkus; Ron Peacetree; pgsql-
> hackers@postgresql.org; pgsql-performance@postgresql.org
> Subject: Re: [HACKERS] [PERFORM] A Better External Sort?
>
> "Jeffrey W. Baker" <jwbaker@acm.org> writes:
> > I think the largest speedup will be to dump the multiphase merge and
> > merge all tapes in one pass, no matter how large M.  Currently M is
> > capped at 6, so a sort of 60GB with 1GB sort memory needs 13 passes
over
> > the tape.  It could be done in a single pass heap merge with
N*log(M)
> > comparisons, and, more importantly, far less input and output.
>
> I had more or less despaired of this thread yielding any usable ideas
> :-( but I think you have one here.

I believe I made the exact same suggestion several days ago.

>The reason the current code uses a
> six-way merge is that Knuth's figure 70 (p. 273 of volume 3 first
> edition) shows that there's not much incremental gain from using more
> tapes ... if you are in the regime where number of runs is much
greater
> than number of tape drives.  But if you can stay in the regime where
> only one merge pass is needed, that is obviously a win.
>
> I don't believe we can simply legislate that there be only one merge
> pass.  That would mean that, if we end up with N runs after the
initial
> run-forming phase, we need to fit N tuples in memory --- no matter how
> large N is, or how small work_mem is.  But it seems like a good idea
to
> try to use an N-way merge where N is as large as work_mem will allow.
> We'd not have to decide on the value of N until after we've completed
> the run-forming phase, at which time we've already seen every tuple
> once, and so we can compute a safe value for N as work_mem divided by
> largest_tuple_size.  (Tape I/O buffers would have to be counted too
> of course.)

You only need to hold the sort column(s) in memory, except for the queue
you are exhausting at the time.  [And of those columns, only the values
for the smallest one in a sub-list.]  Of course, the more data from each
list that you can hold at once, the fewer the disk reads and seeks.

Another idea (not sure if it is pertinent):
Instead of having a fixed size for the sort buffers, size it to the
query.  Given a total pool of size M, give a percentage according to the
difficulty of the work to perform.  So a query with 3 small columns and
a cardinality of 1000 gets a small percentage and a query with 10 GB of
data gets a big percentage of available sort mem.

> It's been a good while since I looked at the sort code, and so I don't
> recall if there are any fundamental reasons for having a compile-time-
> constant value of the merge order rather than choosing it at runtime.
> My guess is that any inefficiencies added by making it variable would
> be well repaid by the potential savings in I/O.
>
>             regards, tom lane
>
> ---------------------------(end of
broadcast)---------------------------
> TIP 6: explain analyze is your friend

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Kevin Grittner"
Дата:
Сообщение: Re: Is There Any Way ....
Следующее
От: "Dann Corbit"
Дата:
Сообщение: Re: [HACKERS] A Better External Sort?