Re: Seq scans roadmap
From | CK Tan |
---|---|
Subject | Re: Seq scans roadmap |
Date | |
Msg-id | 30E8D12C-C5C1-48DA-BF06-08353C398C35@greenplum.com |
In reply to | Re: Seq scans roadmap (Heikki Linnakangas <heikki@enterprisedb.com>) |
Responses | Re: Seq scans roadmap ("Zeugswetter Andreas ADI SD" <ZeugswetterA@spardat.at>) |
List | pgsql-hackers |
Hi,

In reference to the seq scans roadmap, I have just submitted a patch that addresses some of the concerns.

The patch does this:

1. for a small relation (smaller than 60% of the buffer pool), use the current logic
2. for a big relation:
   - use a ring buffer in the heap scan
   - pin the first 12 pages when the scan starts
   - after every 4 pages consumed, read and pin the next 4 pages
   - invalidate pages already used by the scan so they do not force out other useful pages

4 files changed: bufmgr.c, bufmgr.h, heapam.c, relscan.h

If there is interest, I can submit another scan patch that returns N tuples at a time, instead of the current one-at-a-time interface. This improves code locality and further improves performance by another 10-20%.

For TPCH 1G tables, we are seeing more than 20% improvement in scans on the same hardware.

-------------------------------------------------------------------------
----- PATCHED VERSION
-------------------------------------------------------------------------
gptest=# select count(*) from lineitem;
  count
---------
 6001215
(1 row)

Time: 2117.025 ms

-------------------------------------------------------------------------
----- ORIGINAL CVS HEAD VERSION
-------------------------------------------------------------------------
gptest=# select count(*) from lineitem;
  count
---------
 6001215
(1 row)

Time: 2722.441 ms

Suggestions for improvement are welcome.

Regards,
-cktan
Greenplum, Inc.

On May 8, 2007, at 5:57 AM, Heikki Linnakangas wrote:

> Luke Lonergan wrote:
>>> What do you mean with using readahead inside the heapscan?
>>> Starting an async read request?
>> Nope - just reading N buffers ahead for seqscans. Subsequent calls use
>> previously read pages. The objective is to issue contiguous reads to
>> the OS in sizes greater than the PG page size (which is much smaller
>> than what is needed for fast sequential I/O).
>
> Are you filling multiple buffers in the buffer cache with a single
> read-call? The OS should be doing readahead for us anyway, so I
> don't see how just issuing multiple ReadBuffers one after each
> other helps.
>
>> Yes, I think the ring buffer strategy should be used when the table size
>> is > 1 x bufcache and the ring buffer should be of a fixed size smaller
>> than L2 cache (32KB - 128KB seems to work well).
>
> I think we want to let the ring grow larger than that for updating
> transactions and vacuums, though, to avoid the WAL flush problem.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
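
The read-ahead schedule in the patch summary (pin 12 pages up front, then read 4 more for every 4 consumed) can be illustrated with a small simulation. This is only a sketch in Python, not the patch's C code. The names `uses_ring`, `scan_schedule` and `max_outstanding` are hypothetical, and the sketch assumes pages are consumed strictly in order and that a consumed page is released immediately.

```python
from typing import List, Tuple

INITIAL_PINS = 12  # pages pinned when the scan starts (figure from the message)
BATCH = 4          # pages consumed between refills, and pages read per refill

Event = Tuple[str, int]  # ("read", page_no) or ("consume", page_no)


def uses_ring(rel_pages: int, pool_pages: int) -> bool:
    """Apply the 60% rule: relations smaller than 60% of the pool keep the old logic."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return rel_pages * 100 >= pool_pages * 60


def scan_schedule(rel_pages: int) -> List[Event]:
    """Simulate a big-relation scan and return the ordered read/consume events."""
    events: List[Event] = []
    next_to_read = 0

    def read_ahead(count: int) -> None:
        nonlocal next_to_read
        for _ in range(count):
            if next_to_read >= rel_pages:
                return  # nothing left to prefetch
            events.append(("read", next_to_read))
            next_to_read += 1

    read_ahead(INITIAL_PINS)
    for page in range(rel_pages):
        # Assumption: consuming a page also releases it for reuse by the ring.
        events.append(("consume", page))
        if (page + 1) % BATCH == 0:
            read_ahead(BATCH)
    return events


def max_outstanding(events: List[Event]) -> int:
    """Peak number of pages that have been read but not yet consumed."""
    outstanding = peak = 0
    for kind, _ in events:
        outstanding += 1 if kind == "read" else -1
        peak = max(peak, outstanding)
    return peak
```

Under these assumptions the scan never holds more than 12 unread pages at a time, which is what keeps a large sequential scan from pushing other useful pages out of the buffer pool.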