Re: Merge algorithms for large numbers of "tapes"
From | Dann Corbit |
---|---|
Subject | Re: Merge algorithms for large numbers of "tapes" |
Date | |
Msg-id | D425483C2C5C9F49B5B7A41F8944154757D5E8@postal.corporate.connx.com |
List | pgsql-hackers |
In reply to | Merge algorithms for large numbers of "tapes" (Tom Lane <tgl@sss.pgh.pa.us>) |
There are some articles here that are worth reading if you want to sort
fast: http://research.microsoft.com/barc/SortBenchmark/

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Dann Corbit
> Sent: Wednesday, March 08, 2006 1:59 PM
> To: Luke Lonergan; Tom Lane; Jim C. Nasby
> Cc: Simon Riggs; pgsql-hackers@postgreSQL.org
> Subject: Re: [HACKERS] Merge algorithms for large numbers of "tapes"
>
> > -----Original Message-----
> > From: Luke Lonergan [mailto:llonergan@greenplum.com]
> > Sent: Wednesday, March 08, 2006 1:52 PM
> > To: Dann Corbit; Tom Lane; Jim C. Nasby
> > Cc: Simon Riggs; pgsql-hackers@postgreSQL.org
> > Subject: Re: [HACKERS] Merge algorithms for large numbers of "tapes"
> >
> > Dann,
> >
> > On 3/8/06 12:39 PM, "Dann Corbit" <DCorbit@connx.com> wrote:
> >
> > > Here are some suggestions of things that I know work really,
> > > really well:
> >
> > Can you point to an example?  That might help move the discussion
> > along.
>
> I wrote all of the sorting and merging stuff for CONNX Solutions
> http://www.connx.com
>
> I have carefully benchmarked all of this stuff and (at least for our
> system) the ideas I propose work well.  Of course, every system is
> different, and the only way to know whether it is an improvement is to
> try it in place.
>
> > The reason to interject about the tape goo in this discussion is
> > that we seem to be spending a lot of time optimizing around the tape
> > goo without tackling the overall structure of the external sort.  I
> > think we'll just end up having to replace all of this goo when we
> > really get around to fixing the problem.
>
> I suggest trying several alternatives and benchmarking them with
> real-world queries, and especially with the open database benchmark
> suite.

> > Add to this that other commercial databases external sort in 1/4 the
> > time or better on the same hardware with the same CPU/memory
> > resources using a 2-pass external sort.
>
> Our sort merge is so fast that I can join two tables on a column with
> no index faster than on a database that has a unique clustered index
> on the column.  Benchmarked against Oracle, SQL*Server, and several
> others.
>
> If you check our ORDER BY on a large table with no index, you will see
> that it is competitive with the best commercial systems.
>
> If you are interested, you could get an eval of CONNX and try it
> yourself (the eval is free for some number of days, I don't remember
> how many).
>
> > > #1.  Two pass merge (none of that silly poly-tape merge goo)
> >
> > Voice of reason here.  It's what the other database systems do.
> >
> > > #2.  Load ONLY the keys that are to be sorted into memory.  Use a
> > > pointer exchange sort, and do not move the physical rows of data
> > > at all.
> >
> > Sounds right.  Example of this in practice?
>
> It is what we use here.  It is the only way to fly.  This is well
> known, and if you read a few articles from the ACM, you will see that
> it has been known for decades.
>
> > > I am pretty sure from this thread that PostgreSQL is not doing #1,
> > > and I have no idea if it is doing #2.
> >
> > Yep.  Even Knuth says that the tape goo is only interesting from a
> > historical perspective and may not be relevant in an era of disk
> > drives.
> >
> > - Luke

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings