Обсуждение: SQL Query Performance - what gives?

Поиск
Список
Период
Сортировка

SQL Query Performance - what gives?

От
Karl Denninger
Дата:
Let's take the following EXPLAIN results:

ticker=# explain select * from post, forum where forum.name = post.forum
and invisible <> 1 and to_tsvector('english', message) @@
to_tsquery('violence') order by modified desc limit
100;

QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Limit  (cost=5951.85..5952.10 rows=100 width=706)
   ->  Sort  (cost=5951.85..5955.37 rows=1408 width=706)
         Sort Key: post.modified
         ->  Hash Join  (cost=613.80..5898.04 rows=1408 width=706)
               Hash Cond: (post.forum = forum.name)
               ->  Bitmap Heap Scan on post  (cost=370.93..5635.71
rows=1435 width=435)
                     Recheck Cond: (to_tsvector('english'::text,
message) @@ to_tsquery('violence'::text))
                     Filter: (invisible <> 1)
                     ->  Bitmap Index Scan on idx_message
(cost=0.00..370.57 rows=1435 width=0)
                           Index Cond: (to_tsvector('english'::text,
message) @@ to_tsquery('violence'::text))
               ->  Hash  (cost=242.07..242.07 rows=64 width=271)
                     ->  Index Scan using forum_name on forum
(cost=0.00..242.07 rows=64 width=271)
(12 rows)

ticker=#

And


ticker=# explain select * from post, forum where forum.name = post.forum
and invisible <> 1 and ((permission & '127') = permission) and (contrib
is null or contrib = ' ' or contrib like '%b%') and
to_tsvector('english', message) @@ to_tsquery('violence') order by
modified desc limit 100;

QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1329.81..1329.87 rows=22 width=706)
   ->  Sort  (cost=1329.81..1329.87 rows=22 width=706)
         Sort Key: post.modified
         ->  Nested Loop  (cost=978.96..1329.32 rows=22 width=706)
               ->  Index Scan using forum_name on forum
(cost=0.00..242.71 rows=1 width=271)
                     Filter: (((contrib IS NULL) OR (contrib = '
'::text) OR (contrib ~~ '%b%'::text)) AND ((permission & 127) = permission))
               ->  Bitmap Heap Scan on post  (cost=978.96..1086.28
rows=27 width=435)
                     Recheck Cond: ((to_tsvector('english'::text,
post.message) @@ to_tsquery('violence'::text)) AND (post.forum =
forum.name))
                     Filter: (post.invisible <> 1)
                     ->  BitmapAnd  (cost=978.96..978.96 rows=27 width=0)
                           ->  Bitmap Index Scan on idx_message
(cost=0.00..370.57 rows=1435 width=0)
                                 Index Cond:
(to_tsvector('english'::text, post.message) @@ to_tsquery('violence'::text))
                           ->  Bitmap Index Scan on post_forum
(cost=0.00..607.78 rows=26575 width=0)
                                 Index Cond: (post.forum = forum.name)
(14 rows)

ticker=#


The difference in these two queries is that the second qualifies the
returned search to check two permission blocks - one related to the
user's permission bit mask, and the second a mask of single-character
"flags" (the user's classification must be in the list of permitted
classifications)

Ok.  Notice that the top-line cost of the first query is HIGHER.

The first query runs almost instantly - average execution latency is
frequently in the few-hundred millisecond range.

The second query can take upward of 30 seconds (!) to run.

Neither hits the disk, the machine in question has scads of free RAM
available, and while busy is not particularly constrained.  Other
simultaneous users on the database are getting queries back immediately
(no unreasonable delays).

If I remove parts of the permission tests it does not matter.  If ANY of
those tests qualifies the returned values the performance goes in the
toilet.  If I re-order when the permission tests appear (e.g. at the end
of the search command) it makes no difference in the response time
either (it does, however, change the EXPLAIN output somewhat, and
thereby appears to change the query plan.

What's going on here?  I can usually figure out what's causing bad
performance and fix it with the judicious addition of an index or other
similar thing - this one has me completely mystified.

-- Karl

Вложения

Re: SQL Query Performance - what gives?

От
"Kevin Grittner"
Дата:
Karl Denninger <karl@denninger.net> wrote:

> Let's take the following EXPLAIN results:

We could tell a lot more from EXPLAIN ANALYZE results.

The table definitions (with index information) would help, too.

-Kevin

Re: SQL Query Performance - what gives?

От
Karl Denninger
Дата:
First query:


ticker=# explain analyze select * from post, forum where forum.name = post.forum and invisible <> 1 and to_tsvector('english', message) @@ to_tsquery('violence') order by modified desc limit 100;
                                                                   QUERY PLAN                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=5959.78..5960.03 rows=100 width=706) (actual time=49.847..50.264 rows=100 loops=1)
   ->  Sort  (cost=5959.78..5963.30 rows=1408 width=706) (actual time=49.843..49.982 rows=100 loops=1)
         Sort Key: post.modified
         Sort Method:  top-N heapsort  Memory: 168kB
         ->  Hash Join  (cost=621.72..5905.96 rows=1408 width=706) (actual time=4.050..41.238 rows=2055 loops=1)
               Hash Cond: (post.forum = forum.name)
               ->  Bitmap Heap Scan on post  (cost=370.93..5635.71 rows=1435 width=435) (actual time=3.409..32.648 rows=2055 loops=1)
                     Recheck Cond: (to_tsvector('english'::text, message) @@ to_tsquery('violence'::text))
                     Filter: (invisible <> 1)
                     ->  Bitmap Index Scan on idx_message  (cost=0.00..370.57 rows=1435 width=0) (actual time=2.984..2.984 rows=2085 loops=1)
                           Index Cond: (to_tsvector('english'::text, message) @@ to_tsquery('violence'::text))
               ->  Hash  (cost=249.97..249.97 rows=66 width=271) (actual time=0.596..0.596 rows=64 loops=1)
                     ->  Index Scan using forum_name on forum  (cost=0.00..249.97 rows=66 width=271) (actual time=0.093..0.441 rows=64 loops=1)
 Total runtime: 50.625 ms
(14 rows)

ticker=#

Second query:



ticker=# explain analyze select * from post, forum where forum.name = post.forum and invisible <> 1 and ((permission & '127') = permission) and (contrib is null or contrib = ' ' or contrib like '%b%') and  to_tsvector('english', message) @@ to_tsquery('violence') order by modified desc limit 100;
                                                                       QUERY PLAN                                                                      
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1337.71..1337.76 rows=21 width=706) (actual time=31121.317..31121.736 rows=100 loops=1)
   ->  Sort  (cost=1337.71..1337.76 rows=21 width=706) (actual time=31121.313..31121.452 rows=100 loops=1)
         Sort Key: post.modified
         Sort Method:  top-N heapsort  Memory: 168kB
         ->  Nested Loop  (cost=978.97..1337.25 rows=21 width=706) (actual time=2.841..31108.926 rows=2055 loops=1)
               ->  Index Scan using forum_name on forum  (cost=0.00..250.63 rows=1 width=271) (actual time=0.013..0.408 rows=63 loops=1)
                     Filter: (((contrib IS NULL) OR (contrib = ' '::text) OR (contrib ~~ '%b%'::text)) AND ((permission & 127) = permission))
               ->  Bitmap Heap Scan on post  (cost=978.97..1086.28 rows=27 width=435) (actual time=109.832..493.648 rows=33 loops=63)
                     Recheck Cond: ((to_tsvector('english'::text, post.message) @@ to_tsquery('violence'::text)) AND (post.forum = forum.name))
                     Filter: (post.invisible <> 1)
                     ->  BitmapAnd  (cost=978.97..978.97 rows=27 width=0) (actual time=98.832..98.832 rows=0 loops=63)
                           ->  Bitmap Index Scan on idx_message  (cost=0.00..370.57 rows=1435 width=0) (actual time=0.682..0.682 rows=2085 loops=63)
                                 Index Cond: (to_tsvector('english'::text, post.message) @@ to_tsquery('violence'::text))
                           ->  Bitmap Index Scan on post_forum  (cost=0.00..607.78 rows=26575 width=0) (actual time=97.625..97.625 rows=22616 loops=63)
                                 Index Cond: (post.forum = forum.name)
 Total runtime: 31122.781 ms
(16 rows)

ticker=#
ticker=# \d post
                                      Table "public.post"
  Column   |           Type           |                       Modifiers                       
-----------+--------------------------+--------------------------------------------------------
 forum     | text                     |
 number    | integer                  |
 toppost   | integer                  |
 views     | integer                  | default 0
 login     | text                     |
 subject   | text                     |
 message   | text                     |
 inserted  | timestamp with time zone |
 modified  | timestamp with time zone |
 replied   | timestamp with time zone |
 who       | text                     |
 reason    | text                     |
 ordinal   | integer                  | not null default nextval('post_ordinal_seq'::regclass)
 replies   | integer                  | default 0
 invisible | integer                  |
 sticky    | integer                  |
 ip        | inet                     |
 lock      | integer                  | default 0
 pinned    | integer                  | default 0
 marked    | boolean                  |
Indexes:
    "post_pkey" PRIMARY KEY, btree (ordinal)
    "idx_message" gin (to_tsvector('english'::text, message))
    "idx_subject" gin (to_tsvector('english'::text, subject))
    "post_forum" btree (forum)
    "post_getlastpost" btree (forum, modified)
    "post_inserted" btree (inserted)
    "post_login" btree (login)
    "post_modified" btree (modified)
    "post_number" btree (number)
    "post_order" btree (number, inserted)
    "post_ordinal" btree (ordinal)
    "post_top" btree (toppost)
    "post_toppost" btree (forum, toppost, inserted)
Foreign-key constraints:
    "forum_fk" FOREIGN KEY (forum) REFERENCES forum(name) ON UPDATE CASCADE ON DELETE CASCADE
    "login_fk" FOREIGN KEY (login) REFERENCES usertable(login) ON UPDATE CASCADE ON DELETE CASCADE
Triggers:
    _tickerforum_logtrigger AFTER INSERT OR DELETE OR UPDATE ON post FOR EACH ROW EXECUTE PROCEDURE _tickerforum.logtrigger('_tickerforum', '20', 'vvvvvvvvvvvvk')
Disabled triggers:
    _tickerforum_denyaccess BEFORE INSERT OR DELETE OR UPDATE ON post FOR EACH ROW EXECUTE PROCEDURE _tickerforum.denyaccess('_tickerforum')

ticker=# \d forum
                Table "public.forum"
   Column    |           Type           | Modifiers
-------------+--------------------------+-----------
 name        | text                     | not null
 description | text                     |
 long_desc   | text                     |
 forum_type  | integer                  |
 forum_order | integer                  |
 lastpost    | timestamp with time zone |
 lastperson  | text                     |
 permission  | integer                  | default 0
 modtime     | integer                  |
 numposts    | integer                  | default 0
 type        | integer                  | default 0
 readonly    | integer                  | default 0
 moderated   | integer                  | default 0
 flags       | integer                  |
 rsslength   | text                     |
 contrib     | text                     |
 autolock    | text                     |
 autodest    | text                     |
 open        | text                     |
Indexes:
    "forum_pkey" PRIMARY KEY, btree (name)
    "forum_name" UNIQUE, btree (name)
    "forum_order" UNIQUE, btree (forum_order)
Triggers:
    _tickerforum_logtrigger AFTER INSERT OR DELETE OR UPDATE ON forum FOR EACH ROW EXECUTE PROCEDURE _tickerforum.logtrigger('_tickerforum', '7', 'k')
Disabled triggers:
    _tickerforum_denyaccess BEFORE INSERT OR DELETE OR UPDATE ON forum FOR EACH ROW EXECUTE PROCEDURE _tickerforum.denyaccess('_tickerforum')

(The triggers exist due to replication via Slony)


Kevin Grittner wrote:
Karl Denninger <karl@denninger.net> wrote: 
Let's take the following EXPLAIN results:   
 
We could tell a lot more from EXPLAIN ANALYZE results.
The table definitions (with index information) would help, too.
-Kevin
 
Вложения

Re: SQL Query Performance - what gives?

От
"Kevin Grittner"
Дата:
Karl Denninger <karl@denninger.net> wrote:
>                ->  Index Scan using forum_name on forum
> (cost=0.00..250.63 rows=1 width=271) (actual time=0.013..0.408
> rows=63 loops=1)
>                      Filter: (((contrib IS NULL) OR (contrib = '
> '::text) OR (contrib ~~ '%b%'::text)) AND ((permission & 127) =
> permission))

The biggest issue, as far as I can see, is that it thinks that the
selection criteria on forum will limit to one row, while it really
matches 63 rows.

You might be able to coerce it into a faster plan with something like
this (untested):

select *
  from (select * from post
          where invisible <> 1
            and to_tsvector('english', message)
             @@ to_tsquery('violence')
       ) p,
  forum
  where forum.name = p.forum
    and (permission & '127') = permission
    and (contrib is null or contrib = ' ' or contrib like '%b%')
  order by modified desc
  limit 100
;

-Kevin

Re: SQL Query Performance - what gives?

От
Karl Denninger
Дата:
Kevin Grittner wrote:
Karl Denninger <karl@denninger.net> wrote: 
               ->  Index Scan using forum_name on forum 
(cost=0.00..250.63 rows=1 width=271) (actual time=0.013..0.408
rows=63 loops=1)                    Filter: (((contrib IS NULL) OR (contrib = '
'::text) OR (contrib ~~ '%b%'::text)) AND ((permission & 127) =
permission))   
 
The biggest issue, as far as I can see, is that it thinks that the
selection criteria on forum will limit to one row, while it really
matches 63 rows.
You might be able to coerce it into a faster plan with something like
this (untested):
select * from (select * from post         where invisible <> 1           and to_tsvector('english', message)            @@ to_tsquery('violence')      ) p, forum where forum.name = p.forum   and (permission & '127') = permission   and (contrib is null or contrib = ' ' or contrib like '%b%') order by modified desc limit 100
;
-Kevin 

That didn't help.

The FTS alone returns 2,000 records on that table, and does so VERY quickly:

ticker=# explain analyze select count(ordinal) from post, forum where post.forum=forum.name and invisible <> 1
            and to_tsvector('english', message)
             @@ to_tsquery('violence');
                                                               QUERY PLAN                                                              
----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=5901.57..5901.58 rows=1 width=4) (actual time=17.492..17.494 rows=1 loops=1)
   ->  Hash Join  (cost=613.80..5898.04 rows=1408 width=4) (actual time=1.436..14.620 rows=2056 loops=1)
         Hash Cond: (post.forum = forum.name)
         ->  Bitmap Heap Scan on post  (cost=370.93..5635.71 rows=1435 width=14) (actual time=1.123..7.944 rows=2056 loops=1)
               Recheck Cond: (to_tsvector('english'::text, message) @@ to_tsquery('violence'::text))
               Filter: (invisible <> 1)
               ->  Bitmap Index Scan on idx_message  (cost=0.00..370.57 rows=1435 width=0) (actual time=0.738..0.738 rows=2099 loops=1)
                     Index Cond: (to_tsvector('english'::text, message) @@ to_tsquery('violence'::text))
         ->  Hash  (cost=242.07..242.07 rows=64 width=9) (actual time=0.300..0.300 rows=64 loops=1)
               ->  Index Scan using forum_name on forum  (cost=0.00..242.07 rows=64 width=9) (actual time=0.011..0.182 rows=64 loops=1)
 Total runtime: 17.559 ms
(11 rows)

ticker=#

Ok, but now when we check the permission mask....


ticker=# explain analyze select count(ordinal) from post, forum where post.forum=forum.name and invisible <> 1
            and to_tsvector('english', message)
             @@ to_tsquery('violence') and (permission & 4 = permission);
                                                                    QUERY PLAN                                                                   
--------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1329.07..1329.08 rows=1 width=4) (actual time=29819.293..29819.295 rows=1 loops=1)
   ->  Nested Loop  (cost=978.97..1329.01 rows=22 width=4) (actual time=2.575..29815.530 rows=2056 loops=1)
         ->  Index Scan using forum_name on forum  (cost=0.00..242.39 rows=1 width=13) (actual time=0.016..0.355 rows=62 loops=1)
               Filter: ((permission & 4) = permission)
         ->  Bitmap Heap Scan on post  (cost=978.97..1086.28 rows=27 width=14) (actual time=97.997..480.746 rows=33 loops=62)
               Recheck Cond: ((to_tsvector('english'::text, post.message) @@ to_tsquery('violence'::text)) AND (post.forum = forum.name))
               Filter: (post.invisible <> 1)
               ->  BitmapAnd  (cost=978.97..978.97 rows=27 width=0) (actual time=91.106..91.106 rows=0 loops=62)
                     ->  Bitmap Index Scan on idx_message  (cost=0.00..370.57 rows=1435 width=0) (actual time=0.680..0.680 rows=2099 loops=62)
                           Index Cond: (to_tsvector('english'::text, post.message) @@ to_tsquery('violence'::text))
                     ->  Bitmap Index Scan on post_forum  (cost=0.00..607.78 rows=26575 width=0) (actual time=89.927..89.927 rows=22980 loops=62)
                           Index Cond: (post.forum = forum.name)
 Total runtime: 29819.376 ms
(13 rows)

ticker=#

The problem appearsa to lie in the "nested loop", and I don't understand why that's happening.  Isn't a **LINEAR** check on each returned value (since we do the aggregate first?) sufficient?  Why is the query planner creating a nested loop - the aggregate contains the tested field and it is not subject to change once aggregated?!


Вложения

Re: [SQL] SQL Query Performance - what gives?

От
Tom Lane
Дата:
Karl Denninger <karl@denninger.net> writes:
> The problem appearsa to lie in the "nested loop", and I don't understand
> why that's happening.

It looks to me like there are several issues here.

One is the drastic underestimate of the number of rows satisfying the
permission condition. That leads the planner to think that a nestloop
join with the other table will be fast, which is only right if there are
just one or a few rows coming out of "forum".  With sixty-some rows you
get sixty-some repetitions of the scan of the other table, which loses.

Problem number two is the overeager use of a BitmapAnd to add on another
index that isn't really very selective.  That might be a correct
decision but it looks fishy here.  We rewrote choose_bitmap_and a couple
of times to try to fix that problem ... what PG version is this exactly?

The third thing that looks fishy is that it's using unqualified index
scans for no apparent reason.  Have you got enable_seqscan turned off,
and if so what happens when you fix that?  What other nondefault planner
settings are you using?

But anyway, the big problem seems to be poor selectivity estimates for
conditions like "(permission & 127) = permission".  I have bad news for
you: there is simply no way in the world that Postgres is not going to
suck at estimating that, because the planner has no knowledge whatsoever
of the behavior of "&".  You could consider writing and submitting a
patch that would teach it something about that, but in the near term
it would be a lot easier to reconsider your representation of
permissions.  You'd be likely to get significantly better results,
not to mention have more-readable queries, if you stored them as a group
of simple boolean columns.

            regards, tom lane

Re: [SQL] SQL Query Performance - what gives?

От
Karl Denninger
Дата:
Tom Lane wrote:
> Karl Denninger <karl@denninger.net> writes:
>
>> The problem appearsa to lie in the "nested loop", and I don't understand
>> why that's happening.
>>
> It looks to me like there are several issues here.
>
> One is the drastic underestimate of the number of rows satisfying the
> permission condition. That leads the planner to think that a nestloop
> join with the other table will be fast, which is only right if there are
> just one or a few rows coming out of "forum".  With sixty-some rows you
> get sixty-some repetitions of the scan of the other table, which loses.
>
"Loses" isn't quite the right word... :)
> Problem number two is the overeager use of a BitmapAnd to add on another
> index that isn't really very selective.  That might be a correct
> decision but it looks fishy here.  We rewrote choose_bitmap_and a couple
> of times to try to fix that problem ... what PG version is this exactly?
>
$ psql ticker
Welcome to psql 8.3.6, the PostgreSQL interactive terminal.

> The third thing that looks fishy is that it's using unqualified index
> scans for no apparent reason.  Have you got enable_seqscan turned off,
> and if so what happens when you fix that?  What other nondefault planner
> settings are you using?
>
None; here is the relevant section of the postgresql.conf file:

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0                    # measured on an arbitrary scale
#random_page_cost = 4.0                 # same scale as above
#cpu_tuple_cost = 0.01                  # same scale as above
#cpu_index_tuple_cost = 0.005           # same scale as above
#cpu_operator_cost = 0.0025             # same scale as above
#effective_cache_size = 128MB

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5                        # range 1-10
#geqo_pool_size = 0                     # selects default based on effort
#geqo_generations = 0                   # selects default based on effort
#geqo_selection_bias = 2.0              # range 1.5-2.0

# - Other Planner Options -

default_statistics_target = 100         # range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8                # 1 disables collapsing of explicit
                                        # JOIN clauses

All commented out - nothing set to non-defaults, other than the default
statistics target.
> But anyway, the big problem seems to be poor selectivity estimates for
> conditions like "(permission & 127) = permission".  I have bad news for
> you: there is simply no way in the world that Postgres is not going to
> suck at estimating that, because the planner has no knowledge whatsoever
> of the behavior of "&".  You could consider writing and submitting a
> patch that would teach it something about that, but in the near term
> it would be a lot easier to reconsider your representation of
> permissions.  You'd be likely to get significantly better results,
> not to mention have more-readable queries, if you stored them as a group
> of simple boolean columns.
>
>             regards, tom lane
>
Ugh.

The issue here is that the permission structure is quite extensible by
the users of the code; there are defined bits (Bit 4, for example, means
that the user is an "ordinary user" and has a login account) but the
upper bits are entirely administrator-defined and may vary from one
installation to another (and do)

The bitmask allows the setting of multiple permissions but the table
definition doesn't have to change (well, so long as the bits fit into a
word!)  Finally, this is a message forum - the actual code itself is
template-driven and the bitmask permission structure is ALL OVER the
templates; getting that out of there would be a really nasty rewrite,
not to mention breaking the user (non-developer, but owner)
extensibility of the current structure.

Is there a way to TELL the planner how to deal with this, even if it
makes the SQL non-portable or is a hack on the source mandatory?

For the particular instance where this came up it won't be murderous to
omit the bitmask check from the query, as there are no "owner/moderator
only" sub-forums (the one place where not checking that would bite HARD
as it would allow searches of "hidden" content by ordinary users.)
However, there are other installations where this will be a bigger deal;
I can in the immediate term put that query into the config file (instead
of hard-coding it) so for people who can't live with the performance
they can make the tradeoff decision.


-- Karl

Вложения

Re: [SQL] SQL Query Performance - what gives?

От
Pierre Frédéric Caillaud
Дата:
> The bitmask allows the setting of multiple permissions but the table
> definition doesn't have to change (well, so long as the bits fit into a
> word!)  Finally, this is a message forum - the actual code itself is
> template-driven and the bitmask permission structure is ALL OVER the
> templates; getting that out of there would be a really nasty rewrite,
> not to mention breaking the user (non-developer, but owner)
> extensibility of the current structure.
>
> Is there a way to TELL the planner how to deal with this, even if it
> makes the SQL non-portable or is a hack on the source mandatory?

    You could use an integer array instead of a bit mask, make a gist index
on it, and instead of doing "mask & xxx" do "array contains xxx", which is
indexable with gist. The idea is that it can get much better row
estimation. Instead of 1,2,3, you can use 1,2,4,8, etc if you like. you'd
probably need a function to convert a bitmask into ints and another to do
the conversion back, so the rest of your app gets the expected bitmasks.
Or add a bitmask type to postgres with ptoper statistics...

Re: SQL Query Performance - what gives?

От
Ivan Voras
Дата:
Karl Denninger wrote:

> The bitmask allows the setting of multiple permissions but the table
> definition doesn't have to change (well, so long as the bits fit into a
> word!)  Finally, this is a message forum - the actual code itself is
> template-driven and the bitmask permission structure is ALL OVER the
> templates; getting that out of there would be a really nasty rewrite,
> not to mention breaking the user (non-developer, but owner)
> extensibility of the current structure.
>
> Is there a way to TELL the planner how to deal with this, even if it
> makes the SQL non-portable or is a hack on the source mandatory?

You could maybe create function indexes for common bitmap operations;
for example if it's common to check a single bit you could create 32
indexes, on (field & 1), (field & 2), (field & 4), etc. You could also
maybe extend this so if you need to query multiple bits you decompose
them into individual single-bit queries, e.g. instead of (field & 3) you
do ((field & 1) and (field & 2)).

I suppose there will be a break-even point in complexity before which
the above approach will be very slow but after it it should scale better
then the alternative.