Re: Hash index todo list item

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: Hash index todo list item
Дата
Msg-id 200709281848.l8SIm8O22044@momjian.us
обсуждение исходный текст
Ответ на Re: Hash index todo list item  (Tom Raney <twraney@comcast.net>)
Список pgsql-hackers
This has been saved for the 8.4 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Tom Raney wrote:
> We are pleased to announce an upcoming patch to the hash index code
> which improves build time and index size, based on this item in the
> TODO list:
> During index creation, pre-sort the tuples to improve build speed
> http://archives.postgresql.org/pgsql-hackers/2007-03/msg01199.php
> 
> Using our implementation, build times and index sizes are
> comparable with btree index build times and index sizes.
> For example, for a particular 12 million row relation, the
> 8.2.4 release requires over 2.8 hours to build the hash index. 
> With our patch, the index is built in 80 seconds.
> Here is the link for a graph, showing a comparative analysis of
> btree and hash index builds and describing the benchmark data.
> http://web.cecs.pdx.edu/~raneyt/pg/
> 
> We are currently cleaning up the patch and will submit it asap.
> 
> Regards,
> Shreya Bhargava <shreya_bhargav@yahoo.com>
> Tom Raney <twraney@comcast.net>
> 
> 
> Kenneth Marshall wrote:
> > Dear PostgreSQL Hackers:
> >
> > After following the hackers mailing list for quite a while,
> > I am going to start investigating what will need to be done
> > to improve hash index performance. Below are the pieces of
> > this project that I am currently considering:
> >
> > 1. Characterize the current hash index implementation against
> >    the BTree index, with a focus on space utilization and
> >    lookup performance against a collection of test data. This
> >    will give a baseline performance test to evaluate the impact
> >    of changes. I initially do not plan to bench the hash creation
> >    process since my initial focus will be on lookup performance.
> >
> > 2. Evaluate the performance of different hash index implementations
> >    and/or changes to the current implementation. My current plan is
> >    to keep the implementation as simple as possible and still provide
> >    the desired performance. Several hash index suggestions deal with
> >    changing the layout of the keys on a page to improve lookup
> >    performance, including reducing the bucket size to a fraction of
> >    a page or only storing the hash value on the page, instead of
> >    the index value itself. My goal in this phase is to produce one
> >    or more versions with better performance than the current BTree.
> >    
> > 3. Look at build time and concurrency issues with the addition of
> >    some additional tests to the test bed. (1)
> >
> > 4. Repeat as needed.
> >
> > This is the rough plan. Does anyone see anything critical that
> > is missing at this point? Please send me any suggestions for test
> > data and various performance test ideas, since I will be working
> > on that first.
> >
> > Regards,
> > Ken Marshall 
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: don't forget to increase your free space map settings
> >
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Kevin Grittner"
Дата:
Сообщение: Re: [FEATURE REQUEST] Streaming Onlinebackup (Maybe OFFTOPIC)
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Enforcing database encoding and locale match