Re: [HACKERS] sorting big tables :(

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: [HACKERS] sorting big tables :(
Дата
Msg-id 199805200150.VAA02065@candle.pha.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] sorting big tables :(  (Michael Richards <miker@scifair.acadiau.ca>)
Ответы Re: [HACKERS] sorting big tables :(  (The Hermit Hacker <scrappy@hub.org>)
Список pgsql-hackers
>
> On Sun, 17 May 1998, Bruce Momjian wrote:
>
> > > > > I have a big table. 40M rows.
> > > > > On the disk, it's size is:
> > > > >  2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
> > > > > How should one decide based on table size how much room is needed?
> >
> > Tape sort is a standard Knuth sorting.  It basically sorts in pieces,
> > and merges.  If you don't do this, the accessing around gets very poor
> > as you page fault all over the file, and the cache becomes useless.
> Right. I wasn't reading the right chapter. Internal sorting is much
> different than external sorts. Internal suggests the use of a Quicksort
> algorithim.
> Marc and I discussed over lunch. If I did a select * into, would it not
> make more sense to sort the results into the resulting table rather than
> into pieces and then copy into a table? From my limited knowlege, I think
> this should save 8/7 N the space.
> In this issue, I think there must be a lot more overhead than necessary.

Not sure if the internal tape is the same structure as a real table, but
I doubt it.  I seem to remember there is less overhead.

> The table consists of only
> int4, int4, int2
> I read 10 bytes / row of actual data here.
> Instead, 40M/2gigs is about
> 50 bytes / record
> What is there other than oid (4? bytes)

Internal stuff so it looks like a real table, even though it is a
result, I think.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Thomas G. Lockhart"
Дата:
Сообщение: Re: [DOCS] Re: FE/BE protocol revision patch
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: [DOCS] Re: FE/BE protocol revision patch