Обсуждение: Planner question - "bit" data types
Does the planner know how to use indices to optimize these queries? For reference, I was having SEVERE performance problems with the following comparison in an SQL statement where "mask" was an integer: "select ... from .... where ...... and (permission & mask = permission)" This resulted in the planner deciding to run a nested loop and extraordinarily poor performance. I can probably recode the application to use a field of type "bit(32)" and either cast to an integer or have the code do the conversion internally (its just a shift eh?) The question is whether the above statement will be reasonably planned if "mask" is a bit type. -- Karl Denninger
Вложения
Karl, > For reference, I was having SEVERE performance problems with the > following comparison in an SQL statement where "mask" was an integer: > > "select ... from .... where ...... and (permission & mask = permission)" AFAIK, the only way to use an index on these queries is through expression indexes. That's why a lot of folks use INTARRAY instead; it comes with a GIN index type. It would probably be possible to create a new index type using GiST or GIN which indexed bitstrings automatically, but I don't know that anyone has done it yet. Changing your integer to a bitstring will not, to my knowledge, improve this. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
On Fri, Sep 4, 2009 at 6:29 PM, Josh Berkus<josh@agliodbs.com> wrote: > Karl, > >> For reference, I was having SEVERE performance problems with the >> following comparison in an SQL statement where "mask" was an integer: >> >> "select ... from .... where ...... and (permission & mask = permission)" > > AFAIK, the only way to use an index on these queries is through > expression indexes. That's why a lot of folks use INTARRAY instead; it > comes with a GIN index type. > > It would probably be possible to create a new index type using GiST or > GIN which indexed bitstrings automatically, but I don't know that anyone > has done it yet. > > Changing your integer to a bitstring will not, to my knowledge, improve > this. agreed. also, gist/gin is no free lunch, maintaining these type of indexes is fairly expensive. If you are only interested in one or a very small number of cases of 'permission', you can use an expression index to target constant values: "select ... from .... where ...... and (permission & mask = permission)" create index foo_permission_xyz_idx on foo((64 & mask = 64)); select * from foo where 64 & mask = 64; --indexed! this optimizes a _particular_ case of permission into a boolean based index. this can be a big win if the # of matching cases is very small or you want to use this in a multi-column index. merlin
Merlin Moncure <mmoncure@gmail.com> writes: > If you are only interested in one or a very small number of cases of > 'permission', you can use an expression index to target constant > values: > "select ... from .... where ...... and (permission & mask = permission)" > create index foo_permission_xyz_idx on foo((64 & mask = 64)); > select * from foo where 64 & mask = 64; --indexed! A possibly more useful variant is to treat the permission condition as a partial index's WHERE condition. The advantage of that is that the index's actual content can be some other column, so that you can combine the permission check with a second indexable test. The index is still available for queries that don't use the other column, but it's more useful for those that do. regards, tom lane
Tom Lane wrote:
What I am trying to avoid is creating a boolean column for EACH potential bit (and an index on each), as that makes the schema non-portable for others and quite messy as well - while there are a handful of "known masks" the system also has a number of "user defined" bit positions that vary from installation to installation.
-- Karl
That doesn't help in this case as the returned set will typically be quite large, with the condition typically being valid on anywhere from 10-80% of the returned tuples.Merlin Moncure <mmoncure@gmail.com> writes:If you are only interested in one or a very small number of cases of 'permission', you can use an expression index to target constant values:"select ... from .... where ...... and (permission & mask = permission)"create index foo_permission_xyz_idx on foo((64 & mask = 64)); select * from foo where 64 & mask = 64; --indexed!A possibly more useful variant is to treat the permission condition as a partial index's WHERE condition. The advantage of that is that the index's actual content can be some other column, so that you can combine the permission check with a second indexable test. The index is still available for queries that don't use the other column, but it's more useful for those that do. regards, tom lane
What I am trying to avoid is creating a boolean column for EACH potential bit (and an index on each), as that makes the schema non-portable for others and quite messy as well - while there are a handful of "known masks" the system also has a number of "user defined" bit positions that vary from installation to installation.
-- Karl
Вложения
Karl Denninger <karl@denninger.net> writes: > That doesn't help in this case as the returned set will typically be > quite large, with the condition typically being valid on anywhere from > 10-80% of the returned tuples. In that case you'd be wasting your time to get it to use an index for the condition anyway. Maybe you need to take a step back and look at the query as a whole rather than focus on this particular condition. regards, tom lane
Tom Lane wrote:
It is only attempting to filter the returned tuples on the permission bit(s) involved that cause trouble.
-- Karl
The query, sans this condition, is extremely fast and contains a LOT of other conditions (none of which cause trouble.)Karl Denninger <karl@denninger.net> writes:That doesn't help in this case as the returned set will typically be quite large, with the condition typically being valid on anywhere from 10-80% of the returned tuples.In that case you'd be wasting your time to get it to use an index for the condition anyway. Maybe you need to take a step back and look at the query as a whole rather than focus on this particular condition. regards, tom lane
It is only attempting to filter the returned tuples on the permission bit(s) involved that cause trouble.
-- Karl
Вложения
Karl Denninger <karl@denninger.net> writes: > Tom Lane wrote: >> In that case you'd be wasting your time to get it to use an index >> for the condition anyway. Maybe you need to take a step back and >> look at the query as a whole rather than focus on this particular >> condition. > The query, sans this condition, is extremely fast and contains a LOT of > other conditions (none of which cause trouble.) > It is only attempting to filter the returned tuples on the permission > bit(s) involved that cause trouble. My comment stands: asking about how to use an index for this is the wrong question. You never showed us any EXPLAIN results, but I suspect what is happening is that the planner thinks the "permission & mask = permission" condition is fairly selective (offhand I think it'd default to DEFAULT_EQ_SEL or 0.005) whereas the true selectivity per your prior comment is only 0.1 to 0.8. This is causing it to change to a plan that would be good for a small number of rows, when it should stick to a plan that is good for a large number of rows. So the right question is "how do I fix the bad selectivity estimate?". Unfortunately there's no non-kluge answer. What I think I'd try is wrapping the condition into a function, say create function permission_match(perms int, mask int) returns bool as $$begin return perms & mask = mask; end$$ language plpgsql strict immutable; The planner won't know what to make of "where permission_match(perms, 64)" either, but the default selectivity estimate for a boolean function is 0.333, much closer to what you need. Or plan B, which I'd recommend, is to forget the mask business and go over to a boolean column per permission flag. Then the planner would actually have decent statistics about the flag selectivities, and the queries would be a lot more readable too. Your objection that you'd need an index per flag column is misguided --- at these selectivities an index is really pointless. And I entirely fail to understand the complaint about it being unportable; you think "&" is more portable than boolean? Only one of those things is in the SQL standard. regards, tom lane
Tom Lane wrote:
In any event it looks like that's the only reasonable way to do this, so thanks (I think)
-- Karl
Yes I did. Go back and look at the archives. I provided full EXPLAIN and EXPLAIN ANALYZE results for the original query. Sheesh.Karl Denninger <karl@denninger.net> writes:Tom Lane wrote:In that case you'd be wasting your time to get it to use an index for the condition anyway. Maybe you need to take a step back and look at the query as a whole rather than focus on this particular condition.The query, sans this condition, is extremely fast and contains a LOT of other conditions (none of which cause trouble.) It is only attempting to filter the returned tuples on the permission bit(s) involved that cause trouble.My comment stands: asking about how to use an index for this is the wrong question. You never showed us any EXPLAIN results,
The point isn't portability to other SQL engines - it is to other people's installations. The bitmask is (since it requires only changing the mask constants in the container file that makes the SQL calls by reference) where explicit columns is not by a long shot.Or plan B, which I'd recommend, is to forget the mask business and go over to a boolean column per permission flag. Then the planner would actually have decent statistics about the flag selectivities, and the queries would be a lot more readable too. Your objection that you'd need an index per flag column is misguided --- at these selectivities an index is really pointless. And I entirely fail to understand the complaint about it being unportable; you think "&" is more portable than boolean? Only one of those things is in the SQL standard. regards, tom lane
In any event it looks like that's the only reasonable way to do this, so thanks (I think)
-- Karl
Вложения
Karl Denninger escribió: > Tom Lane wrote: > > You never showed us any EXPLAIN results, > Yes I did. Go back and look at the archives. I provided full EXPLAIN > and EXPLAIN ANALYZE results for the original query. Sheesh. You did? Where? This is your first message in this thread: http://archives.postgresql.org/pgsql-performance/2009-09/msg00059.php No EXPLAINs anywhere to be seen. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
There was a previous thread and I referenced it. I don't have the other one in my email system any more to follow up to it.
I give up; the attack-dog crowd has successfully driven me off. Ciao.
Alvaro Herrera wrote:
I give up; the attack-dog crowd has successfully driven me off. Ciao.
Alvaro Herrera wrote:
Karl Denninger escribió:Tom Lane wrote:You never showed us any EXPLAIN results,Yes I did. Go back and look at the archives. I provided full EXPLAIN and EXPLAIN ANALYZE results for the original query. Sheesh.You did? Where? This is your first message in this thread: http://archives.postgresql.org/pgsql-performance/2009-09/msg00059.php No EXPLAINs anywhere to be seen.
Вложения
On Sat, Sep 5, 2009 at 8:19 PM, Karl Denninger<karl@denninger.net> wrote: > There was a previous thread and I referenced it. I don't have the other one > in my email system any more to follow up to it. > > I give up; the attack-dog crowd has successfully driven me off. Ciao. Another more standard sql approach is to push the flags out to a subordinate table. This is less efficient of course but now you get to use standard join tactics to match conditions... merlin
> -----Mensaje original----- > De: Karl Denninger > Enviado el: Sábado, 05 de Septiembre de 2009 21:19 > Para: Alvaro Herrera > CC: Tom Lane; Merlin Moncure; Josh Berkus; > pgsql-performance@postgresql.org > Asunto: Re: [PERFORM] Planner question - "bit" data types > > There was a previous thread and I referenced it. I don't have > the other one in my email system any more to follow up to it. > > I give up; the attack-dog crowd has successfully driven me off. Ciao. > > Alvaro Herrera wrote: > > Karl Denninger escribió: > > > Tom Lane wrote: > > > > > > You never showed us any EXPLAIN results, > > > Yes I did. Go back and look at the archives. > I provided full EXPLAIN > and EXPLAIN ANALYZE results for the original > query. Sheesh. > > > > You did? Where? This is your first message in this thread: > > http://archives.postgresql.org/pgsql-performance/2009-09/msg00059.php > No EXPLAINs anywhere to be seen. > I guess this is the post Karl refers to: http://archives.postgresql.org/pgsql-sql/2009-08/msg00088.php Still you can't hope that others will recall a post 2 weeks ago, with an other subject and in an other list!
On Sat, Sep 5, 2009 at 8:19 PM, Karl Denninger<karl@denninger.net> wrote: > There was a previous thread and I referenced it. I don't have the other one > in my email system any more to follow up to it. > > I give up; the attack-dog crowd has successfully driven me off. Ciao. Perhaps I'm biased by knowing some of the people involved, but I don't think anyone on this thread has been anything but polite. It would certainly be great if PostgreSQL could properly estimate the selectivity of expressions like this without resorting to nasty hacks, but it can't, and unfortunately, there's really no possibility of that changing any time soon. Even if someone implements a fix today, the soonest it will appear in a production release is June 2010. So, any suggestion for improvement is going to be in the form of suggesting that you modify the schema in some way. I know that's not really what you're looking for, but unfortunately it's the best we can do. As far as I can tell, it is not correct to say that you referenced the previous thread. I do not see any such reference. ...Robert
Robert Haas wrote:
The current schema is an integer being used as a bitmask. If the planner knows how to handle a type of "bit(X)" (and will at least FILTER rather than NESTED LOOP it on a select, as happens for an Integer used in this fashion), that change is easier than splitting it into individual boolean fields.
-- Karl
I was asking about modifying the schema.On Sat, Sep 5, 2009 at 8:19 PM, Karl Denninger<karl@denninger.net> wrote:There was a previous thread and I referenced it. I don't have the other one in my email system any more to follow up to it. I give up; the attack-dog crowd has successfully driven me off. Ciao.Perhaps I'm biased by knowing some of the people involved, but I don't think anyone on this thread has been anything but polite. It would certainly be great if PostgreSQL could properly estimate the selectivity of expressions like this without resorting to nasty hacks, but it can't, and unfortunately, there's really no possibility of that changing any time soon. Even if someone implements a fix today, the soonest it will appear in a production release is June 2010. So, any suggestion for improvement is going to be in the form of suggesting that you modify the schema in some way. I know that's not really what you're looking for, but unfortunately it's the best we can do. As far as I can tell, it is not correct to say that you referenced the previous thread. I do not see any such reference. ...Robert
The current schema is an integer being used as a bitmask. If the planner knows how to handle a type of "bit(X)" (and will at least FILTER rather than NESTED LOOP it on a select, as happens for an Integer used in this fashion), that change is easier than splitting it into individual boolean fields.
-- Karl
Вложения
On Mon, Sep 7, 2009 at 8:51 PM, Karl Denninger<karl@denninger.net> wrote: > Robert Haas wrote: > > On Sat, Sep 5, 2009 at 8:19 PM, Karl Denninger<karl@denninger.net> wrote: > > > There was a previous thread and I referenced it. I don't have the other one > in my email system any more to follow up to it. > > I give up; the attack-dog crowd has successfully driven me off. Ciao. > > > Perhaps I'm biased by knowing some of the people involved, but I don't > think anyone on this thread has been anything but polite. It would > certainly be great if PostgreSQL could properly estimate the > selectivity of expressions like this without resorting to nasty hacks, > but it can't, and unfortunately, there's really no possibility of that > changing any time soon. Even if someone implements a fix today, the > soonest it will appear in a production release is June 2010. So, any > suggestion for improvement is going to be in the form of suggesting > that you modify the schema in some way. I know that's not really what > you're looking for, but unfortunately it's the best we can do. > > As far as I can tell, it is not correct to say that you referenced the > previous thread. I do not see any such reference. > > ...Robert > > > > I was asking about modifying the schema. > > The current schema is an integer being used as a bitmask. If the planner > knows how to handle a type of "bit(X)" (and will at least FILTER rather than > NESTED LOOP it on a select, as happens for an Integer used in this fashion), > that change is easier than splitting it into individual boolean fields. Well, the first several replies seem to address that question - I think we all agree that won't help. I'm not sure what you mean by "at least FILTER rather than NESTED LOOP it on a select". However, typically, the time when you get a nested loop is when the planner believes that the loop will be executed very few times (in other words, the outer side will return very few rows). It probably isn't the case that the planner COULDN'T choose to execute the query in some other way; rather, the planner believes that the nested loop is faster because of a (mistaken) belief about how many rows the bitmap-criterion will actually match. All the suggestions you've gotten upthread are tricks to enable the planner to make a better estimate, which will hopefully cause it to choose a better plan. As a general statement, selectivity estimation problems are very painful to work around and often involve substantial application redesign. In all honesty, I think you've run across one of the easier variants. As painful as it is to hear the word easy applied to a problem that's killing you, there actually IS a good solution to this problem: use individual boolean fields. I know that's not what you want to do, but it's better than "sorry, you're hosed, no matter how you do this it ain't gonna work". And I do think there are a few in the archives that fall into that category. Good luck, and sorry for the bad news. ...Robert
"pgsql-performance-owner@postgresql.org" wrote: > On Sat, Sep 5, 2009 at 8:19 PM, Karl Denninger<karl@denninger.net> wrote: > > There was a previous thread and I referenced it. I don't have the > other one > > in my email system any more to follow up to it. > > > > I give up; the attack-dog crowd has successfully driven me off. Ciao. > > Perhaps I'm biased by knowing some of the people involved, but I don't > think anyone on this thread has been anything but polite. I use several online forums and this -- hands down -- is the best: not only for politeness even when the information I provided was misleading or the question I asked was, in retrospect, Duh? but also for 1) speed of response, 2) breadth of ideas and 3) accuracy of information -- often on complex issues with no simple solution from folk who probably have more to do than sit around waiting for the next post. My thanks to the knowledgeable people on this forum. Brian
Robert Haas wrote:
The problem with re-coding for them is extensibility (by those who install and administer the package); a mask leaves open lots of extra bits for "site-specific" use, where hard-coding booleans does not, and since the executable is a binary it instantly becomes a huge problem for everyone but me.
It does appear, however, that a bitfield doesn't evaluate any differently than does an integer used with a mask, so there you have it..... it is what it is, and if I want this sort of selectivity in the search I have no choice.
-- Karl
The individual boolean fields don't kill me and in terms of some of the application issues they're actually rather easy to code for.On Mon, Sep 7, 2009 at 8:51 PM, Karl Denninger<karl@denninger.net> wrote:Robert Haas wrote: On Sat, Sep 5, 2009 at 8:19 PM, Karl Denninger<karl@denninger.net> wrote: There was a previous thread and I referenced it. I don't have the other one in my email system any more to follow up to it. I give up; the attack-dog crowd has successfully driven me off. Ciao. Perhaps I'm biased by knowing some of the people involved, but I don't think anyone on this thread has been anything but polite. It would certainly be great if PostgreSQL could properly estimate the selectivity of expressions like this without resorting to nasty hacks, but it can't, and unfortunately, there's really no possibility of that changing any time soon. Even if someone implements a fix today, the soonest it will appear in a production release is June 2010. So, any suggestion for improvement is going to be in the form of suggesting that you modify the schema in some way. I know that's not really what you're looking for, but unfortunately it's the best we can do. As far as I can tell, it is not correct to say that you referenced the previous thread. I do not see any such reference. ...Robert I was asking about modifying the schema. The current schema is an integer being used as a bitmask. If the planner knows how to handle a type of "bit(X)" (and will at least FILTER rather than NESTED LOOP it on a select, as happens for an Integer used in this fashion), that change is easier than splitting it into individual boolean fields.Well, the first several replies seem to address that question - I think we all agree that won't help. I'm not sure what you mean by "at least FILTER rather than NESTED LOOP it on a select". However, typically, the time when you get a nested loop is when the planner believes that the loop will be executed very few times (in other words, the outer side will return very few rows). It probably isn't the case that the planner COULDN'T choose to execute the query in some other way; rather, the planner believes that the nested loop is faster because of a (mistaken) belief about how many rows the bitmap-criterion will actually match. All the suggestions you've gotten upthread are tricks to enable the planner to make a better estimate, which will hopefully cause it to choose a better plan. As a general statement, selectivity estimation problems are very painful to work around and often involve substantial application redesign. In all honesty, I think you've run across one of the easier variants. As painful as it is to hear the word easy applied to a problem that's killing you, there actually IS a good solution to this problem: use individual boolean fields. I know that's not what you want to do, but it's better than "sorry, you're hosed, no matter how you do this it ain't gonna work". And I do think there are a few in the archives that fall into that category. Good luck, and sorry for the bad news. ...Robert
The problem with re-coding for them is extensibility (by those who install and administer the package); a mask leaves open lots of extra bits for "site-specific" use, where hard-coding booleans does not, and since the executable is a binary it instantly becomes a huge problem for everyone but me.
It does appear, however, that a bitfield doesn't evaluate any differently than does an integer used with a mask, so there you have it..... it is what it is, and if I want this sort of selectivity in the search I have no choice.
-- Karl
Вложения
Karl Denninger escribió: > The individual boolean fields don't kill me and in terms of some of the > application issues they're actually rather easy to code for. > > The problem with re-coding for them is extensibility (by those who > install and administer the package); a mask leaves open lots of extra > bits for "site-specific" use, where hard-coding booleans does not, and > since the executable is a binary it instantly becomes a huge problem for > everyone but me. Did you try hiding the bitmask operations inside a function as Tom suggested? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, Sep 7, 2009 at 10:05 PM, Karl Denninger<karl@denninger.net> wrote: > The individual boolean fields don't kill me and in terms of some of the > application issues they're actually rather easy to code for. > > The problem with re-coding for them is extensibility (by those who install > and administer the package); a mask leaves open lots of extra bits for > "site-specific" use, where hard-coding booleans does not, and since the > executable is a binary it instantly becomes a huge problem for everyone but > me. > > It does appear, however, that a bitfield doesn't evaluate any differently > than does an integer used with a mask, so there you have it..... it is what > it is, and if I want this sort of selectivity in the search I have no > choice. You can always create 32 boolean fields and only use some of them, leaving the others for site-specific use... ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Sep 7, 2009 at 10:05 PM, Karl Denninger<karl@denninger.net> wrote: >> The problem with re-coding for them is extensibility (by those who install >> and administer the package); a mask leaves open lots of extra bits for >> "site-specific" use, where hard-coding booleans does not, > You can always create 32 boolean fields and only use some of them, > leaving the others for site-specific use... Indeed. Why is "user_defined_flag_24" so much worse that "mask & 16777216" ? Especially when the day comes that you need to add one more system-defined flag bit? regards, tom lane
Alvaro Herrera wrote: > Karl Denninger escribi?: > > > The individual boolean fields don't kill me and in terms of some of the > > application issues they're actually rather easy to code for. > > > > The problem with re-coding for them is extensibility (by those who > > install and administer the package); a mask leaves open lots of extra > > bits for "site-specific" use, where hard-coding booleans does not, and > > since the executable is a binary it instantly becomes a huge problem for > > everyone but me. > > Did you try hiding the bitmask operations inside a function as Tom > suggested? Yes. In addition, functions that are part of expression indexes do get their own optimizer statistics, so it does allow you to get optimizer stats for your test without having to use booleans. I see this documented in the 8.0 release notes: * "ANALYZE" now collects statistics for expression indexes (Tom) Expression indexes (also called functional indexes) allow users to index not just columns but the results of expressions and function calls. With this release, the optimizer can gather and use statistics about the contents of expression indexes. This will greatly improve the quality of planning for queries in which an expression index is relevant. Is this in our main documentation somewhere? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote:
create function ispermitted(text, integer) returns boolean as $$
select permission & $2 = permission from forum where forum.name=$1;
$$ Language SQL STABLE;
then calling it with "ispermitted(post.forum, '4')" as one of the terms causes the query optimizer to treat it as a FILTER instead of a nested loop, and it works as expected.
However, I don't think I can index that - right - since there are two variables involved which are not part of the table being indexed.....
-- Karl
Interesting... declaring this:Alvaro Herrera wrote:Karl Denninger escribi?:The individual boolean fields don't kill me and in terms of some of the application issues they're actually rather easy to code for. The problem with re-coding for them is extensibility (by those who install and administer the package); a mask leaves open lots of extra bits for "site-specific" use, where hard-coding booleans does not, and since the executable is a binary it instantly becomes a huge problem for everyone but me.Did you try hiding the bitmask operations inside a function as Tom suggested?Yes. In addition, functions that are part of expression indexes do get their own optimizer statistics, so it does allow you to get optimizer stats for your test without having to use booleans. I see this documented in the 8.0 release notes: * "ANALYZE" now collects statistics for expression indexes (Tom) Expression indexes (also called functional indexes) allow users to index not just columns but the results of expressions and function calls. With this release, the optimizer can gather and use statistics about the contents of expression indexes. This will greatly improve the quality of planning for queries in which an expression index is relevant. Is this in our main documentation somewhere?
create function ispermitted(text, integer) returns boolean as $$
select permission & $2 = permission from forum where forum.name=$1;
$$ Language SQL STABLE;
then calling it with "ispermitted(post.forum, '4')" as one of the terms causes the query optimizer to treat it as a FILTER instead of a nested loop, and it works as expected.
However, I don't think I can index that - right - since there are two variables involved which are not part of the table being indexed.....
-- Karl
Вложения
Karl Denninger wrote: > > Yes. In addition, functions that are part of expression indexes do get > > their own optimizer statistics, so it does allow you to get optimizer > > stats for your test without having to use booleans. > > > > I see this documented in the 8.0 release notes: > > > > * "ANALYZE" now collects statistics for expression indexes (Tom) > > Expression indexes (also called functional indexes) allow users > > to index not just columns but the results of expressions and > > function calls. With this release, the optimizer can gather and > > use statistics about the contents of expression indexes. This will > > greatly improve the quality of planning for queries in which an > > expression index is relevant. > > > > Is this in our main documentation somewhere? > > > > > Interesting... declaring this: > > create function ispermitted(text, integer) returns boolean as $$ > select permission & $2 = permission from forum where forum.name=$1; > $$ Language SQL STABLE; > > then calling it with "ispermitted(post.forum, '4')" as one of the terms > causes the query optimizer to treat it as a FILTER instead of a nested > loop, and it works as expected. > > However, I don't think I can index that - right - since there are two > variables involved which are not part of the table being indexed..... That should index fine. It is an _expression_ index so it can be pretty complicated. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > > Interesting... declaring this: > > > > create function ispermitted(text, integer) returns boolean as $$ > > select permission & $2 = permission from forum where forum.name=$1; > > $$ Language SQL STABLE; > > > > then calling it with "ispermitted(post.forum, '4')" as one of the terms > > causes the query optimizer to treat it as a FILTER instead of a nested > > loop, and it works as expected. > > > > However, I don't think I can index that - right - since there are two > > variables involved which are not part of the table being indexed..... > > That should index fine. It is an _expression_ index so it can be pretty > complicated. Oh, you have to use the exact same syntax in there WHERE clause for the expression index to be used, then use EXPLAIN to see if the index is used. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote:
create index forum_ispermitted on forum using btree(ispermitted(name, permission));
ERROR: functions in index expression must be marked IMMUTABLE
ticker=#
The function is of course of class STATIC.
-- Karl
It does not appear I can create an index on that (not that it appears to be necessary for decent performance)Karl Denninger wrote:Yes. In addition, functions that are part of expression indexes do get their own optimizer statistics, so it does allow you to get optimizer stats for your test without having to use booleans. I see this documented in the 8.0 release notes: * "ANALYZE" now collects statistics for expression indexes (Tom) Expression indexes (also called functional indexes) allow users to index not just columns but the results of expressions and function calls. With this release, the optimizer can gather and use statistics about the contents of expression indexes. This will greatly improve the quality of planning for queries in which an expression index is relevant. Is this in our main documentation somewhere?Interesting... declaring this: create function ispermitted(text, integer) returns boolean as $$ select permission & $2 = permission from forum where forum.name=$1; $$ Language SQL STABLE; then calling it with "ispermitted(post.forum, '4')" as one of the terms causes the query optimizer to treat it as a FILTER instead of a nested loop, and it works as expected. However, I don't think I can index that - right - since there are two variables involved which are not part of the table being indexed.....That should index fine. It is an _expression_ index so it can be pretty complicated
create index forum_ispermitted on forum using btree(ispermitted(name, permission));
ERROR: functions in index expression must be marked IMMUTABLE
ticker=#
The function is of course of class STATIC.
-- Karl
Вложения
Bruce Momjian wrote: > Alvaro Herrera wrote: > > Karl Denninger escribi?: > > > > > The individual boolean fields don't kill me and in terms of some of the > > > application issues they're actually rather easy to code for. > > > > > > The problem with re-coding for them is extensibility (by those who > > > install and administer the package); a mask leaves open lots of extra > > > bits for "site-specific" use, where hard-coding booleans does not, and > > > since the executable is a binary it instantly becomes a huge problem for > > > everyone but me. > > > > Did you try hiding the bitmask operations inside a function as Tom > > suggested? > > Yes. In addition, functions that are part of expression indexes do get > their own optimizer statistics, so it does allow you to get optimizer > stats for your test without having to use booleans. > > I see this documented in the 8.0 release notes: > > * "ANALYZE" now collects statistics for expression indexes (Tom) > Expression indexes (also called functional indexes) allow users > to index not just columns but the results of expressions and > function calls. With this release, the optimizer can gather and > use statistics about the contents of expression indexes. This will > greatly improve the quality of planning for queries in which an > expression index is relevant. > > Is this in our main documentation somewhere? Added with attached, applied patch. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard drive, Christ can be your backup. + Index: doc/src/sgml/maintenance.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v retrieving revision 1.99 diff -c -c -r1.99 maintenance.sgml *** doc/src/sgml/maintenance.sgml 8 Feb 2010 04:33:51 -0000 1.99 --- doc/src/sgml/maintenance.sgml 23 Feb 2010 02:46:21 -0000 *************** *** 318,323 **** --- 318,331 ---- SET STATISTICS</>, or change the database-wide default using the <xref linkend="guc-default-statistics-target"> configuration parameter. </para> + + <para> + Also, by default there is limited information available about + the selectivity of functions. However, if you create an expression + index that uses a function call, useful statistics will be + gathered about the function, which can greatly improve query + plans that use the expression index. + </para> </tip> </sect2>
On Sep 7, 2009, at 7:05 PM, Karl Denninger wrote:
The individual boolean fields don't kill me and in terms of some of the application issues they're actually rather easy to code for.
The problem with re-coding for them is extensibility (by those who install and administer the package); a mask leaves open lots of extra bits for "site-specific" use, where hard-coding booleans does not, and since the executable is a binary it instantly becomes a huge problem for everyone but me.
It does appear, however, that a bitfield doesn't evaluate any differently than does an integer used with a mask, so there you have it..... it is what it is, and if I want this sort of selectivity in the search I have no choice.
Perhaps, use a view to encapsulate the extensible bit fields? Then custom installations just modify the view? I haven't thought through that too far, but it might work.
<karl.vcf>
-- Karl
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance