Обсуждение: Grouped Index Tuples
I've brought the GIT patch up-to-date with CVS head. The latest version can be found at http://community.enterprisedb.com/git/ I also reran the CPU bound test cases with the latest patch. I want this in 8.3 in some form, and I have the time to do any required changes. If someone wants to see more tests, I can arrange that as well. The patch is pretty big at the moment. I think the best way to proceed with this is to extract some smaller, incremental patches from it that just refactor the current b-tree code. After that, the final patch that implements GIT should be much smaller and more readable. And there's still a bunch of todo items there as well... But before I start doing that, I need some review and general agreement on the design. What I don't want to happen is that three days after the feature freeze, someone finally looks at it and finds a major issue or just thinks it's an unreadable mess, and we no longer have the time to fix it. One question that I'm sure someone will ask is do we need this if we have bitmap indexes? Both aim at having a smaller index, after all. The use cases are quite different; GIT is effective whenever you have a table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's effectiveness doesn't depend on the number of distinct values, in particular it works well with unique indexes. GIT is comparable to clustered indexes in other DBMSs (in fact we might want to call GIT that in the end). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Thu, 2007-02-22 at 12:47 +0000, Heikki Linnakangas wrote: > One question that I'm sure someone will ask is do we need this if we > have bitmap indexes? Both aim at having a smaller index, after all. > The use cases are quite different; GIT is effective whenever you have > a table that's reasonably well-clustered. Unlike the bitmap indexam, > GIT's effectiveness doesn't depend on the number of distinct values, > in particular it works well with unique indexes. GIT is comparable to > clustered indexes in other DBMSs (in fact we might want to call GIT > that in the end). A few thoughts: Whether its proof, or merely strong interest, we need to see where the cross-over point is between bitmap indexes (BMI) and GIT. I'm assuming there is one, but others may need convincing that there is a clear win in common use cases. Graphs, numbers etc.. On your TODO list, it would be good to differentiate between ideas and must-complete items. My suggested list would be: Must complete - cost model for optimizer - user documentation - performance tests Good to complete - duplicate key flag (should increase range of overlap with BMI) - range re-check optimization (based upon your results) - better index AM - HOT, so that GIT can provide longer term benefits Others ... I seem to recall you mentioning there was a way to record that the heap tuples are stored in monotonic order and so the sort can be avoided when you return the tuples. Is that still true? Also "the patch roughly doubles the amount of WAL generated by WAL" not sure what that means -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Heikki, are you going to revise this for 8.4? --------------------------------------------------------------------------- Heikki Linnakangas wrote: > I've brought the GIT patch up-to-date with CVS head. The latest version > can be found at http://community.enterprisedb.com/git/ > > I also reran the CPU bound test cases with the latest patch. > > I want this in 8.3 in some form, and I have the time to do any required > changes. If someone wants to see more tests, I can arrange that as well. > > The patch is pretty big at the moment. I think the best way to proceed > with this is to extract some smaller, incremental patches from it that > just refactor the current b-tree code. After that, the final patch that > implements GIT should be much smaller and more readable. And there's > still a bunch of todo items there as well... > > But before I start doing that, I need some review and general agreement > on the design. What I don't want to happen is that three days after the > feature freeze, someone finally looks at it and finds a major issue or > just thinks it's an unreadable mess, and we no longer have the time to > fix it. > > One question that I'm sure someone will ask is do we need this if we > have bitmap indexes? Both aim at having a smaller index, after all. The > use cases are quite different; GIT is effective whenever you have a > table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's > effectiveness doesn't depend on the number of distinct values, in > particular it works well with unique indexes. GIT is comparable to > clustered indexes in other DBMSs (in fact we might want to call GIT that > in the end). > > -- > Heikki Linnakangas > EnterpriseDB http://www.enterprisedb.com > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Heikki, are you going to revise this for 8.4? Probably not. I have other features I want to work on at the moment. > --------------------------------------------------------------------------- > > Heikki Linnakangas wrote: >> I've brought the GIT patch up-to-date with CVS head. The latest version >> can be found at http://community.enterprisedb.com/git/ >> >> I also reran the CPU bound test cases with the latest patch. >> >> I want this in 8.3 in some form, and I have the time to do any required >> changes. If someone wants to see more tests, I can arrange that as well. >> >> The patch is pretty big at the moment. I think the best way to proceed >> with this is to extract some smaller, incremental patches from it that >> just refactor the current b-tree code. After that, the final patch that >> implements GIT should be much smaller and more readable. And there's >> still a bunch of todo items there as well... >> >> But before I start doing that, I need some review and general agreement >> on the design. What I don't want to happen is that three days after the >> feature freeze, someone finally looks at it and finds a major issue or >> just thinks it's an unreadable mess, and we no longer have the time to >> fix it. >> >> One question that I'm sure someone will ask is do we need this if we >> have bitmap indexes? Both aim at having a smaller index, after all. The >> use cases are quite different; GIT is effective whenever you have a >> table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's >> effectiveness doesn't depend on the number of distinct values, in >> particular it works well with unique indexes. GIT is comparable to >> clustered indexes in other DBMSs (in fact we might want to call GIT that >> in the end). >> >> -- >> Heikki Linnakangas >> EnterpriseDB http://www.enterprisedb.com >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 7: You can help support the PostgreSQL project by donating at >> >> http://www.postgresql.org/about/donate > -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com