Re: Seq scans roadmap
| From | CK Tan |
|---|---|
| Subject | Re: Seq scans roadmap |
| Date | |
| Msg-id | 833BC7B3-048A-4CFC-89C5-119725FA4773@greenplum.com |
| In reply to | Re: Seq scans roadmap ("Zeugswetter Andreas ADI SD" <ZeugswetterA@spardat.at>) |
| Responses | Re: Seq scans roadmap (Zeugswetter Andreas ADI SD <ZeugswetterA@spardat.at>) |
| List | pgsql-hackers |
Sorry, a 16 x 8K page ring is indeed too small. We selected 16 because Greenplum DB runs on a 32K page size, so we are in fact reading 128K at a time. The number of pages in the ring should be made relative to the page size, so that you achieve 128K per read. I also agree that KillAndReadBuffer could be split into KillPinDontRead() and ReadThesePinnedPages() functions. However, we are thinking about AIO and would rather see a ReadNPagesAsync() function.

-cktan
Greenplum, Inc.

On May 10, 2007, at 3:14 AM, Zeugswetter Andreas ADI SD wrote:

>> In reference to the seq scans roadmap, I have just submitted
>> a patch that addresses some of the concerns.
>>
>> The patch does this:
>>
>> 1. for small relations (smaller than 60% of the buffer pool), use
>>    the current logic
>> 2. for big relations:
>>    - use a ring buffer in the heap scan
>>    - pin the first 12 pages when the scan starts
>>    - on consumption of every 4 pages, read and pin the next 4 pages
>>    - invalidate used pages in the scan so they do not
>>      force out other useful pages
>
> A few comments regarding the effects:
>
> I do not see how this speedup could be caused by readahead, so what
> are the effects?
> (It should make no difference to do the CPU work for count(*)
> in between reading each block when the pages are not dirtied.)
> Is the improvement solely reduced CPU because no search for a free
> buffer is needed, and/or L2 cache locality?
>
> What effect does the advance pinning have? Avoiding vacuum?
>
> A 16 x 8K page ring is too small to allow the needed I/O blocksize
> of 256K. The readahead is done 4 x one page at a time (= 32K).
> What is the reasoning behind using 1/4 of the ring for readahead
> (why not 1/2)? Is 3/4 the trail for followers and the bgwriter?
>
> I think, in anticipation of doing a single I/O call for more than
> one page, the KillAndReadBuffer function should be split into two
> parts: one that does the killing for n pages, and one that does
> the reading for n pages.
> Killing n before reading n would also have the positive effect of
> grouping any needed writes (not interleaving them with the reads).
>
> I think the 60% of NBuffers is a very good starting point. I would
> only introduce a GUC when we see evidence that it is needed (I agree
> with Simon's partitioning comments, but I'd still wait and see).
>
> Andreas