Re: polyphase merge?

Поиск
Список
Период
Сортировка
От Don Marvick
Тема Re: polyphase merge?
Дата
Msg-id d18e24870902032242v4d868296o39774f928ad76ff0@mail.gmail.com
обсуждение исходный текст
Ответ на polyphase merge?  (Don Marvick <donmarvick@gmail.com>)
Ответы Re: polyphase merge?
Re: polyphase merge?
Список pgsql-hackers
Dear All,<br /><br />Since nobody replied, I would give it a try. I am going to implement the merge pattern described
inKnuth Page 365 (5.4.9), essentially it is as follow:<br />- create initial runs using replacement selection
(basicallythis is as in the current implementation)<br /> - add enough dummy runs of size 0 so that the number of
sortedrun minus one can be divided by k-1 (k is merge fan in)<br />- repetitively merge k smallest runs until only 1
runleft<br /><br />I am new to postgresql, hence any advice would be appreciated.<br /><br />Can anybody give me some
adviceon how it can done? <br />1. how a run should be represented? should I reuse the tape mechanism? e.g. 1 tape 1
run<br/>- or should I use a temp buffer?<br /><br />2. How memory should be allocated? I assume I divide the memory
equallyto k runs, hence each run will get M/k memory. Each read of a run would be of size M/k bytes. <br /><br />3.
Prefetching.Then, we can precompute the read sequence of blocks of run during the entire merge process. Based on this,
weknow the blocks of run to be prefetched at a point of time.<br /><br />3. Is it possible to perform read/write I/O
whilethe merge is being performed? Hence we overlap I/O and computation.<br /><br />4. any other issue needs
consideration?<br/><br />Best regards,<br /><br />Don<br /><br /><div class="gmail_quote">On Thu, Jan 29, 2009 at 4:11
PM,Don Marvick <span dir="ltr"><<a href="mailto:donmarvick@gmail.com">donmarvick@gmail.com</a>></span> wrote:<br
/><blockquoteclass="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex;
padding-left:1ex;">Dear All,<br /><br />I apologize if this has been discussed before. I have tried to search to the
mailinglist and could not find an exact answer.<br /><br />Currently, Postgres uses Knuth's Algorithm 5.4.2D to perform
externalmerge sort. IMHO the algorithm is designed for tapes. <br /><br />Why don't the simple text book merge pattern
describedin Knuth Page 365 (5.4.9) is used for disk based system? The same merge pattern is also described in
Ramakrishnantext book and it is optimal if seek time is not counted, which of course not the real-world case.<br /><br
/>Bestregards,<br /><font color="#888888"><br />Don<br /><br /></font></blockquote></div><br /> 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: Hot standby, recovery infra
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: Synch Replication