Обсуждение: [HACKERS] Poor cost estimate with interaction between table correlation andpartial indexes
[HACKERS] Poor cost estimate with interaction between table correlation andpartial indexes
От
Michael Malis
Дата:
Hi. I'm looking to get started contributing code to Postgres. A small issue I'm aware of that I think would make a good first contribution is a poor cost estimate made due to an interaction between table correlation and partial indexes. Currently the planner assumes that when an index is perfectly correlated with a table and a range scan is performed on the index, all of the table page reads performed by the index scan except for the first one will be sequential reads. While this assumption is correct for regular indexes, it is not true for partial indexes. The assumption holds for regular indexes because the rows corresponding to two entries in a regular index that is perfectly correlated with the table are guaranteed to be next to each other in the table. On the other hand with a partial index perfectly correlated with a table, there may be rows in the table in between the two rows corresponding to two adjacent entries in the index that are not included in the index because they do not satisfy the partial index predicate. To make the cost calculation for this case more accurate, I want to apply the same estimate as the one currently used to estimate the cost of a bitmap heap scan. The bitmap heap scan cost estimate applies in this case because both cases involve reading pages from disk ordered by the location in the table, but where the pages may not be consecutive. For the relevant functions, see cost_index and cost_index_heap_scan in costsize.c. Thanks, Michael
Do you think this is a reasonable approach? Should I start working on a patch based on the solution I described or is there some other approach I should look into?
On Sat, Aug 26, 2017 at 05:50:26PM -0700, Michael Malis wrote: > Do you think this is a reasonable approach? Should I start working > on a patch based on the solution I described or is there some other > approach I should look into? You'll get more traction with a proof-of-concept patch accompanying the plan than without. Don't bother with any level of care past proof-of-concept until you get positive feedback. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
(Sorry David. I initially replied only to you) Ok. I've attached a patch of a proof-of-concept. I have a few questions about tests. What is typical workflow to add tests for changes to the planner? Also I ran make check and it appears one of the existing tests is failing. What is a typical way for going about discovering why the query plan for a specific query changed? Also, how should I go about changing the old test? Should I replace the old test output with the new test output or modify the old test slightly to get it to produce the same case as before? Thanks, Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
Hmm... It seems the command I used for obtaining a patch I got from here https://wiki.postgresql.org/wiki/Working_with_Git truncated part of the patch. I've attached the file generated from git diff --patience master improve-partial-index-correlation-calculation --no-color > improve-correlated-partial-index-cost-v2.patch to this email. What is the correct command for generating a context diff? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
Michael Malis wrote: > Hmm... It seems the command I used for obtaining a patch I got from > here https://wiki.postgresql.org/wiki/Working_with_Git truncated part > of the patch. I've attached the file generated from git diff > --patience master improve-partial-index-correlation-calculation > --no-color > improve-correlated-partial-index-cost-v2.patch to this > email. What is the correct command for generating a context diff? Yeah, I've had patches truncated by that too and I've never cared enough to see about getting it fixed. I think it's a bug in the filterdiff utility, but I got a stupid answer from the Debian maintainer when I reported it and didn't care to follow up any further. Eventually I gave up on using context diffs because of this problem. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Aug 27, 2017 at 8:31 PM, Michael Malis <michaelmalis2@gmail.com> wrote: > (Sorry David. I initially replied only to you) > > Ok. I've attached a patch of a proof-of-concept. I have a few > questions about tests. > > What is typical workflow to add tests for changes to the planner? Add submitted patches at commitfest.postgresql.org > Also > I ran make check and it appears one of the existing tests is failing. > What is a typical way for going about discovering why the query plan > for a specific query changed? I don't have any magic answer on this point. > Also, how should I go about changing the > old test? Should I replace the old test output with the new test > output or modify the old test slightly to get it to produce the same > case as before? That's a judgement call, based on what you think the point of the test was. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company