Discussion: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Timothy Garnett
Date:
Hi All,

We are currently using PostgreSQL 9.0.3, and we noticed a performance anomaly in a framework-generated (ActiveRecord) query against one of our tables.  The query uses an IN clause to check an indexed column for the presence of either of two values; in this particular case neither of them is present (but in other cases one or more might be).  The framework generates a LIMIT 1 query to test for existence.  This query ends up using a seq scan and is quite slow; however, rewriting it with OR = rather than IN uses the index (as does removing the limit or raising it to a large value).

The table has 36 million rows (more details are below) and is read-only in typical usage.  I was wondering whether IN vs. OR being planned so differently represents a bug, and/or whether we might have some misconfiguration somewhere that leads the query planner to pick what at best can only be a slightly faster plan than using the index, but at worst is much, much slower.  I would also think the cluster on the table argues against using a sequence scan for this kind of query: since the hts_code_id's are colocated, performance when an id is present will vary greatly depending on what order the seq scan walks the table (which we've observed), and when the id(s) are not present this plan is always terrible.  We can wrap the query in set enable_seqscan TO off if need be, but it seems like something the planner should have done better with, unless we have something weird somewhere (conf file details are below).
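For reference, the transaction-scoped form of that workaround would look like this (a sketch, using the two ids from the problematic query below; SET LOCAL reverts at commit):

begin;
set local enable_seqscan = off;  -- discourage seq scans for this transaction only
SELECT "exp_detls".id FROM "exp_detls" WHERE ("exp_detls"."hts_code_id" IN (12,654)) LIMIT 1;
commit;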

psql (9.0.3)
Type "help" for help.

-- Table info
dev=> ANALYZE exp_detls;
ANALYZE
dev=> select count(*) from exp_detls;
 36034391
dev=>explain analyze select count(*) from exp_detls;
 Aggregate  (cost=1336141.30..1336141.31 rows=1 width=0) (actual time=43067.620..43067.621 rows=1 loops=1)
   ->  Seq Scan on exp_detls  (cost=0.00..1246046.84 rows=36037784 width=0) (actual time=0.011..23703.177 rows=36034391 loops=1)
 Total runtime: 43067.675 ms
dev=>select pg_size_pretty(pg_table_size('exp_detls'));
 6919 MB


-- Problematic Query
dev=> explain analyze SELECT "exp_detls".id FROM "exp_detls" WHERE ("exp_detls"."hts_code_id" IN (12,654)) LIMIT 1;
 Limit  (cost=0.00..158.18 rows=1 width=4) (actual time=9661.363..9661.363 rows=0 loops=1)
   ->  Seq Scan on exp_detls  (cost=0.00..1336181.90 rows=8447 width=4) (actual time=9661.360..9661.360 rows=0 loops=1)
         Filter: (hts_code_id = ANY ('{12,654}'::integer[]))
 Total runtime: 9661.398 ms
(4 rows)


-- Using OR =, much faster, though a more complicated plan than the one below
dev=> explain analyze SELECT "exp_detls".id FROM "exp_detls" WHERE ("exp_detls"."hts_code_id" = 12 OR
"exp_detls"."hts_code_id" = 654) LIMIT 1;
 Limit  (cost=162.59..166.29 rows=1 width=4) (actual time=0.029..0.029 rows=0 loops=1)
   ->  Bitmap Heap Scan on exp_detls  (cost=162.59..31188.14 rows=8370 width=4) (actual time=0.028..0.028 rows=0 loops=1)
         Recheck Cond: ((hts_code_id = 12) OR (hts_code_id = 654))
         ->  BitmapOr  (cost=162.59..162.59 rows=8370 width=0) (actual time=0.027..0.027 rows=0 loops=1)
               ->  Bitmap Index Scan on index_exp_detls_on_hts_code_id_and_data_month  (cost=0.00..79.20 rows=4185 width=0) (actual time=0.017..0.017 rows=0 loops=1)
                     Index Cond: (hts_code_id = 12)
               ->  Bitmap Index Scan on index_exp_detls_on_hts_code_id_and_data_month  (cost=0.00..79.20 rows=4185 width=0) (actual time=0.007..0.007 rows=0 loops=1)
                     Index Cond: (hts_code_id = 654)
 Total runtime: 0.051 ms
(9 rows)


-- No limit, much faster, with a cleaner-looking plan (though of course problematic when there are many matching rows)
dev=>explain analyze SELECT "exp_detls".id FROM "exp_detls" WHERE ("exp_detls"."hts_code_id" IN (12,654));
 Bitmap Heap Scan on exp_detls  (cost=156.93..31161.56 rows=8370 width=4) (actual time=0.028..0.028 rows=0 loops=1)
   Recheck Cond: (hts_code_id = ANY ('{12,654}'::integer[]))
   ->  Bitmap Index Scan on index_exp_detls_on_hts_code_id_and_data_month  (cost=0.00..154.84 rows=8370 width=0) (actual time=0.026..0.026 rows=0 loops=1)
         Index Cond: (hts_code_id = ANY ('{12,654}'::integer[]))
 Total runtime: 0.045 ms
(5 rows)


-- Table Schema

                                        Table "public.exp_detls"
      Column      |            Type             |                       Modifiers                       
------------------+-----------------------------+--------------------------------------------------------
 id               | integer                     | not null default nextval('exp_detls_id_seq'::regclass)
 created_at       | timestamp without time zone | not null
 df               | integer                     |
 hts_code_id      | integer                     | not null
 uscb_country_id  | integer                     |
 country_id       | integer                     |
 uscb_district_id | integer                     |
 cards_mo         | numeric(15,0)               | not null
 qty_1_mo         | numeric(15,0)               | not null
 qty_2_mo         | numeric(15,0)               |
 all_val_mo       | numeric(15,0)               | not null
 air_val_mo       | numeric(15,0)               | not null
 air_wgt_mo       | numeric(15,0)               | not null
 ves_val_mo       | numeric(15,0)               | not null
 ves_wgt_mo       | numeric(15,0)               | not null
 cnt_val_mo       | numeric(15,0)               | not null
 cnt_wgt_mo       | numeric(15,0)               | not null
 cards_yr         | numeric(15,0)               | not null
 qty_1_yr         | numeric(15,0)               | not null
 qty_2_yr         | numeric(15,0)               |
 all_val_yr       | numeric(15,0)               | not null
 air_val_yr       | numeric(15,0)               | not null
 air_wgt_yr       | numeric(15,0)               | not null
 ves_val_yr       | numeric(15,0)               | not null
 ves_wgt_yr       | numeric(15,0)               | not null
 cnt_val_yr       | numeric(15,0)               | not null
 cnt_wgt_yr       | numeric(15,0)               | not null
 data_month       | date                        | not null
 parent_id        | integer                     |
Indexes:
    "exp_detls_pkey" PRIMARY KEY, btree (id)
    "index_exp_detls_on_data_month" btree (data_month) WITH (fillfactor=100)
    "index_exp_detls_on_hts_code_id_and_data_month" btree (hts_code_id, data_month) WITH (fillfactor=100) CLUSTER
    "index_exp_detls_on_parent_id" btree (parent_id) WITH (fillfactor=100) WHERE parent_id IS NOT NULL
<Several FK's>



postgresql.conf non-default settings
listen_addresses = '*'    # what IP address(es) to listen on;
port = 5432               # (change requires restart)
max_connections = 230     # (change requires restart)
tcp_keepalives_idle = 180   # TCP_KEEPIDLE, in seconds;
shared_buffers = 4GB      # min 128kB, DEFAULT 32MB
work_mem = 512MB      # min 64kB, DEFAULT 1MB
maintenance_work_mem = 256MB    # min 1MB, DEFAULT 16MB
effective_io_concurrency = 2    # 1-1000. 0 disables prefetching
synchronous_commit = off    # immediate fsync at commit, DEFAULT on
wal_buffers = 16MB      # min 32kB, DEFAULT 64kB
wal_writer_delay = 330ms    # 1-10000 milliseconds, DEFAULT 200ms
checkpoint_segments = 24    # in logfile segments, min 1, 16MB each
checkpoint_completion_target = 0.9  # checkpoint target duration, 0.0 - 1.0
effective_cache_size = 24GB      # DEFAULT 128MB
logging_collector = on      # Enable capturing of stderr and csvlog
log_checkpoints = on # DEFAULT off
log_connections = on # DEFAULT off
log_disconnections = on # DEFAULT off
log_hostname = on # DEFAULT off
log_line_prefix = '%t'     # special values:
track_activity_query_size = 8192   # (change requires restart)
bytea_output = 'escape'      # hex, escape, Default hex
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'      # locale for system error message strings
lc_monetary = 'en_US.UTF-8'      # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'      # locale for number formatting
lc_time = 'en_US.UTF-8'        # locale for time formatting
default_text_search_config = 'pg_catalog.english'

Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Tom Lane
Date:
Timothy Garnett <tgarnett@panjiva.com> writes:
> -- Problematic Query
> dev=> explain analyze SELECT "exp_detls".id FROM "exp_detls" WHERE
> ("exp_detls"."hts_code_id" IN (12,654)) LIMIT 1;
>  Limit  (cost=0.00..158.18 rows=1 width=4) (actual time=9661.363..9661.363
> rows=0 loops=1)
>    ->  Seq Scan on exp_detls  (cost=0.00..1336181.90 rows=8447 width=4)
> (actual time=9661.360..9661.360 rows=0 loops=1)
>          Filter: (hts_code_id = ANY ('{12,654}'::integer[]))
>  Total runtime: 9661.398 ms
> (4 rows)

> -- Using OR =, much faster, though a more complicated plan than the one below
> dev=> explain analyze SELECT "exp_detls".id FROM "exp_detls" WHERE
> ("exp_detls"."hts_code_id" = 12 OR "exp_detls"."hts_code_id" = 654) LIMIT 1;
>  Limit  (cost=162.59..166.29 rows=1 width=4) (actual time=0.029..0.029
> rows=0 loops=1)
>    ->  Bitmap Heap Scan on exp_detls  (cost=162.59..31188.14 rows=8370
> width=4) (actual time=0.028..0.028 rows=0 loops=1)
>          Recheck Cond: ((hts_code_id = 12) OR (hts_code_id = 654))
>          ->  BitmapOr  (cost=162.59..162.59 rows=8370 width=0) (actual
> time=0.027..0.027 rows=0 loops=1)
>                ->  Bitmap Index Scan on
> index_exp_detls_on_hts_code_id_and_data_month  (cost=0.00..79.20 rows=4185
> width=0) (actual time=0.017..0.017 rows=0 loops=1)
>                      Index Cond: (hts_code_id = 12)
>                ->  Bitmap Index Scan on
> index_exp_detls_on_hts_code_id_and_data_month  (cost=0.00..79.20 rows=4185
> width=0) (actual time=0.007..0.007 rows=0 loops=1)
>                      Index Cond: (hts_code_id = 654)
>  Total runtime: 0.051 ms
> (9 rows)

Well, the reason it likes the first plan is that that has a smaller
estimated cost ;-).  Basically this is a startup-time-versus-total-time
issue: the cost of the seqscan+limit is estimated to be about 1/8447'th
of the time to read the whole table, since it's estimating 8447
candidate matches and assuming that those are uniformly distributed in
the table.  Meanwhile, the bitmap scan has a significant startup cost
because the entire indexscan is completed before we start to do any
fetching from the heap.  The overestimate of the number of matching
rows contributes directly to overestimating the cost of the indexscan,
too.  It ends up being a near thing --- 158 vs 166 cost units --- but
on the basis of these estimates the planner did the right thing.

So, what you need to do to make this better is to get it to have a
better idea of how many rows match the query condition; the overestimate
is both making the expensive plan look cheap, and making the cheaper
plan look expensive.  Cranking up the statistics target for the
hts_code_id column (and re-ANALYZEing) ought to fix it.  If all your
tables are this large you might want to just increase
default_statistics_target across the board.

            regards, tom lane
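Concretely, the per-column version of that advice is something along these lines (a sketch; 1000 is an arbitrary example target, the 9.0 default being 100 and the maximum 10000):

ALTER TABLE exp_detls ALTER COLUMN hts_code_id SET STATISTICS 1000;
ANALYZE exp_detls;

The across-the-board version raises default_statistics_target (in postgresql.conf, or per-session before an ANALYZE) instead.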

Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Craig James
Date:
On 9/26/11 10:07 AM, Tom Lane wrote:
> Cranking up the statistics target for the hts_code_id column (and re-ANALYZEing) ought to fix it. If all your tables are this large you might want to just increase default_statistics_target across the board. regards, tom lane

This is common advice in this forum, but what's the down side to increasing statistics?  With so many questions coming to this forum that are due to insufficient statistics, why not just increase the default_statistics_target?  I assume there is a down side, but I've never seen it discussed.  Does it increase planning time?  Analyze time?  Take lots of space?

Thanks,
Craig

Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Tom Lane
Date:
Craig James <craig_james@emolecules.com> writes:
> On 9/26/11 10:07 AM, Tom Lane wrote:
>> Cranking up the statistics target for the hts_code_id column (and re-ANALYZEing) ought to fix it. If all your tables are this large you might want to just increase default_statistics_target across the board.

> This is common advice in this forum, but what's the down side to increasing statistics?  With so many questions coming to this forum that are due to insufficient statistics, why not just increase the default_statistics_target?  I assume there is a down side, but I've never seen it discussed.  Does it increase planning time?  Analyze time?  Take lots of space?

Yes, yes, and yes.  We already did crank up the default
default_statistics_target once (in 8.4), so I'm hesitant to do it again.

            regards, tom lane
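One rough way to see the space cost directly is to compare the size of pg_statistic, the catalog that stores the gathered stats, before and after re-analyzing with a larger target (a sketch; the session-level SET affects only columns without a per-column override):

SELECT pg_size_pretty(pg_total_relation_size('pg_statistic'));
SET default_statistics_target = 1000;
ANALYZE exp_detls;
SELECT pg_size_pretty(pg_total_relation_size('pg_statistic'));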

Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Craig Ringer
Date:
On 27/09/2011 1:35 AM, Tom Lane wrote:
> Craig James <craig_james@emolecules.com> writes:
>> On 9/26/11 10:07 AM, Tom Lane wrote:
>>> Cranking up the statistics target for the hts_code_id column (and re-ANALYZEing) ought to fix it. If all your tables are this large you might want to just increase default_statistics_target across the board.
>> This is common advice in this forum, but what's the down side to increasing statistics?  With so many questions coming to this forum that are due to insufficient statistics, why not just increase the default_statistics_target?  I assume there is a down side, but I've never seen it discussed.  Does it increase planning time?  Analyze time?  Take lots of space?
> Yes, yes, and yes.  We already did crank up the default
> default_statistics_target once (in 8.4), so I'm hesitant to do it again.

This has me wondering about putting together a maintenance/analysis tool
that generates and captures stats from several ANALYZE runs and compares
them to see if they're reasonably consistent. It then re-runs with
higher targets as a one-off, again to see if the stats agree, before
restoring the targets to defaults. The tool could crunch comparisons of
the resulting stats and warn about tables or columns where the default
stats targets aren't sufficient.

In the long run this might even be something it'd be good to have Pg do
automatically behind the scenes (like autovacuum) - auto-raise stats
targets where repeat samplings are inconsistent.

Thoughts? Is this reasonable to explore, or a totally bogus idea? I'll
see if I can have a play if there's any point to trying it out.

--
Craig Ringer
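A first cut at the comparison step could even be plain SQL; a rough sketch, comparing only n_distinct across two ANALYZE runs with an arbitrary 20% disagreement threshold (n_distinct is negative when stored as a ratio of the row count, hence the abs()):

CREATE TEMP TABLE stats_run1 AS
  SELECT tablename, attname, n_distinct
  FROM pg_stats WHERE schemaname = 'public';
ANALYZE;  -- second sampling run
SELECT s.tablename, s.attname, r.n_distinct AS run1, s.n_distinct AS run2
FROM pg_stats s
JOIN stats_run1 r USING (tablename, attname)
WHERE s.schemaname = 'public'
  AND abs(s.n_distinct - r.n_distinct) >
      0.2 * greatest(abs(s.n_distinct), abs(r.n_distinct));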

Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Tom Lane
Date:
Craig Ringer <ringerc@ringerc.id.au> writes:
> This has me wondering about putting together a maintenance/analysis tool
> that generates and captures stats from several ANALYZE runs and compares
> them to see if they're reasonably consistent. It then re-runs with
> higher targets as a one-off, again to see if the stats agree, before
> restoring the targets to defaults. The tool could crunch comparisons of
> the resulting stats and warn about tables or columns where the default
> stats targets aren't sufficient.

It would certainly be useful to have such a tool, but I suspect it's
easier said than done.  The problem is to know whether the queries on
that table are particularly sensitive to having better stats.  I think
we've largely solved issues having to do with the quality of the
histogram (eg, what fraction of the table has values falling into some
range), and the remaining trouble spots have to do with predicting the
frequency of specific values that are too infrequent to have made it
into the most-common-values list.  Enlarging the MCV list helps not so
much by getting these long-tail values into the MCV list --- they
probably still aren't there --- as by allowing us to tighten the upper
bound on what the frequency of an unrepresented value must be.  So what
you have to ask is how many queries care about that.  Timothy's
example query in this thread is sort of a worst case, because the values
it's searching for are in fact not in the table at all.

            regards, tom lane
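The MCV list and its frequencies are visible directly in pg_stats, so it is at least easy to check whether a troublesome value made the list; a quick sketch against the table upthread:

SELECT most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'exp_detls' AND attname = 'hts_code_id';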

Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Timothy Garnett
Date:

> Well, the reason it likes the first plan is that that has a smaller
> estimated cost ;-).  Basically this is a startup-time-versus-total-time
> issue: the cost of the seqscan+limit is estimated to be about 1/8447'th
> of the time to read the whole table, since it's estimating 8447
> candidate matches and assuming that those are uniformly distributed in
> the table.  Meanwhile, the bitmap scan has a significant startup cost
> because the entire indexscan is completed before we start to do any
> fetching from the heap.  The overestimate of the number of matching
> rows contributes directly to overestimating the cost of the indexscan,
> too.  It ends up being a near thing --- 158 vs 166 cost units --- but
> on the basis of these estimates the planner did the right thing.
>
> So, what you need to do to make this better is to get it to have a
> better idea of how many rows match the query condition; the overestimate
> is both making the expensive plan look cheap, and making the cheaper
> plan look expensive.  Cranking up the statistics target for the
> hts_code_id column (and re-ANALYZEing) ought to fix it.  If all your
> tables are this large you might want to just increase
> default_statistics_target across the board.
>
>             regards, tom lane


Thanks for the great description of what's happening, very informative.  Upping the stats target to the maximum, 10000 (from the default 100), makes my example query use a faster plan, but not other problematic queries we have in the same vein (just add a few more values to the IN clause).  For example (this is with the stats target set to 10000 and the table re-analyzed):

=>explain analyze SELECT "exp_detls".id FROM "exp_detls" WHERE ("exp_detls"."hts_code_id" IN (8127,8377,8374,10744,11125,8375,8344,9127,9345)) LIMIT 1;
                                                        QUERY PLAN                                                        
---------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..152.94 rows=1 width=4) (actual time=12057.637..12057.637 rows=0 loops=1)
   ->  Seq Scan on exp_detls  (cost=0.00..1651399.83 rows=10798 width=4) (actual time=12057.635..12057.635 rows=0 loops=1)
         Filter: (hts_code_id = ANY ('{8127,8377,8374,10744,11125,8375,8344,9127,9345}'::integer[]))
 Total runtime: 12057.678 ms


From your description I think the planner is making two problematic assumptions that are leading to our issue:

The first is that the potential matches are uniformly distributed across the physical table.  While there are a number of reasons this may not be the case (correlated insertion order, update patterns, etc.), in this case there's a very clear reason: the table is clustered on an index that leads with the column we're querying against ('hts_code_id'), and nothing has been inserted or updated in the table since the last time we ran CLUSTER on it (see the schema in the original e-mail).

The second is that it's assuming the IN-clause values are actually present in the table (and at some reasonably large frequency); in the worst cases (like the above) they aren't present at all, forcing a full table scan.  I don't know how the expected frequency of values absent from the most-common-values list is estimated, but I'm guessing it's something based on the residual probability in the stats table and the estimated number of distinct values.  If so, that assumes the supplied values actually are in the set of distinct values, which seems unjustified.
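Both of the inputs in question are visible in pg_stats; a quick sketch to inspect the physical-order statistic behind the first point (correlation near 1.0 means heap order closely follows the column, as expected right after CLUSTER) and the distinct-values estimate behind the second:

SELECT correlation, n_distinct
FROM pg_stats
WHERE tablename = 'exp_detls' AND attname = 'hts_code_id';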

Perhaps there should be some estimation of whether a supplied value is one of the distinct values or not (something fairly solid statistically could probably be done with a not-too-large bloom filter), or some scaling by the width of the histogram bucket the value falls in (i.e. assuming an even distribution across the bucket).  Then again, in many common situations people may only supply values that are known to be present, so this could make things worse more often than better (maybe LIMIT 1, or better EXISTS, would be a hint that the value is not known to be present).  Experimenting a bit, it doesn't seem to matter which values are selected, so the planner isn't taking any kind of distribution over the histogram boundaries into account.  For example, this query:

explain analyze SELECT "exp_detls".id FROM "exp_detls" WHERE ("exp_detls"."hts_code_id" IN (20000,20001,200002,20003,20004,20005,20006,20007,20008)) LIMIT 1;
                    QUERY PLAN                                                        
---------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..152.94 rows=1 width=4) (actual time=12925.646..12925.646 rows=0 loops=1)
   ->  Seq Scan on exp_detls  (cost=0.00..1651399.83 rows=10798 width=4) (actual time=12925.644..12925.644 rows=0 loops=1)
         Filter: (hts_code_id = ANY ('{20000,20001,200002,20003,20004,20005,20006,20007,20008}'::integer[]))
 Total runtime: 12925.677 ms


This has the same expected row count as the earlier query with the same number of not-present IN values, but looking at the histogram boundaries these values are much, much less likely to find rows (assuming an even distribution across each bucket) than the values in the earlier query (the histogram boundaries are something like 1,5,10,15,...11995,12000,80000,80005...).  I can send the actual stats if interested (it's large though).  It feels like this should factor into the estimates somehow.

Also, we occasionally run into surprises like this, especially as tables grow (fast query plans hit some planning tipping point and turn into very slow plans, or some new combination of values is requested).  It would be nice if there were a straightforward way to tweak the planner towards less risky plans: not necessarily worst-case planning, but perhaps ranking plans by expected cost + n% * worst-case cost for some reasonable estimate of the worst case (or using some probability distribution around the expected cost and ranking by the nth-percentile time).  Just thinking out loud.

Btw, thanks for all your work on what is a really great and useful tool!

Tim 

Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Samuel Gendler
Date:


On Mon, Sep 26, 2011 at 2:11 PM, Timothy Garnett <tgarnett@panjiva.com> wrote:

Though maybe in a lot of common use situations people only supply values that are known present so maybe this would make things worse more often then better (maybe limit 1 or better EXISTS would be a hint the value is not known present). Experimenting a bit it doesn't seem to matter which values are selected so it's not taking into account any kind of distribution over the histogram boundaries. 

If I'm not mistaken, the problem here is actually the LIMIT 1, yes?  The planner is opting for the sequential scan because it assumes it will interrupt the scan relatively quickly when a row is matched?  So in the case where you are really looking for existence, perhaps the better solution is to select a count of the number of matching rows (perhaps grouped by id so you know which ones match)? That would emulate the behaviour of select without a limit, which would be more likely to use the index. It all depends on just what you are actually doing with the row you are returning, of course, and if there is some other way to get it once you know of its existence.

SELECT count(1), exp_detls.id FROM exp_detls WHERE (exp_detls.hts_code_id IN (12,654)) GROUP BY exp_detls.id

might work, depending upon how many different values of exp_detls.id you are likely to see for any given set of hts_code_ids.  Actually, I know little about the query planner, but it seems to me that the aggregation and grouping might be sufficient to force it away from preferring the sequential scan, even if you leave a 'limit 1' on the query, since it will have to find more than 1 row in order to return a single row, since that single row contains an aggregate.  So if your concern is about the potential of transferring millions of rows across the network, I think that might fix it, though it is definitely a kludge.  Of course, the downside is that the index won't be as fast as a sequential scan in the cases where the scan does get interrupted quickly, but you've clearly already considered that for your use patterns.

Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Timothy Garnett
Date:
Actually, thinking about this a little more, what we really want the planner to do is consider the codes one at a time until it finds one that exists.  If we write that out explicitly, we get a good plan whether the ids select many rows or none.

=> explain analyze
select 1 from (
  select * from (select 1 from exp_detls where hts_code_id in (469169) limit 1) a
 union all
  select * from (select 1 from exp_detls where hts_code_id in (15289) limit 1) b
 union all
  select * from (select 1 from exp_detls where hts_code_id in (468137) limit 1) c
 union all
  select * from (select 1 from exp_detls where hts_code_id in (14655) limit 1) d
 union all
  select * from (select 1 from exp_detls where hts_code_id in (14670) limit 1) e
 union all
  select * from (select 1 from exp_detls where hts_code_id in (15291) limit 1) f
) dummy limit 1;
                                                                                           QUERY PLAN                                                                                          
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1.75 rows=1 width=0) (actual time=0.031..0.032 rows=1 loops=1)
   ->  Result  (cost=0.00..10.52 rows=6 width=0) (actual time=0.029..0.029 rows=1 loops=1)
         ->  Append  (cost=0.00..10.52 rows=6 width=0) (actual time=0.029..0.029 rows=1 loops=1)
               ->  Subquery Scan on a  (cost=0.00..1.72 rows=1 width=0) (actual time=0.027..0.027 rows=1 loops=1)
                     ->  Limit  (cost=0.00..1.71 rows=1 width=0) (actual time=0.025..0.025 rows=1 loops=1)
                           ->  Index Scan using index_exp_detls_on_hts_code_id_and_data_month on exp_detls  (cost=0.00..144890.56 rows=84657 width=0) (actual time=0.025..0.025 rows=1 loops=1)
                                 Index Cond: (hts_code_id = 469169)
               ->  Subquery Scan on b  (cost=0.00..1.74 rows=1 width=0) (never executed)
                     ->  Limit  (cost=0.00..1.73 rows=1 width=0) (never executed)
                           ->  Index Scan using index_exp_detls_on_hts_code_id_and_data_month on exp_detls  (cost=0.00..118206.85 rows=68477 width=0) (never executed)
                                 Index Cond: (hts_code_id = 15289)
               ->  Subquery Scan on c  (cost=0.00..1.74 rows=1 width=0) (never executed)
                     ->  Limit  (cost=0.00..1.73 rows=1 width=0) (never executed)
                           ->  Index Scan using index_exp_detls_on_hts_code_id_and_data_month on exp_detls  (cost=0.00..102645.38 rows=59168 width=0) (never executed)
                                 Index Cond: (hts_code_id = 468137)
               ->  Subquery Scan on d  (cost=0.00..1.81 rows=1 width=0) (never executed)
                     ->  Limit  (cost=0.00..1.80 rows=1 width=0) (never executed)
                           ->  Index Scan using index_exp_detls_on_hts_code_id_and_data_month on exp_detls  (cost=0.00..2155.38 rows=1200 width=0) (never executed)
                                 Index Cond: (hts_code_id = 14655)
               ->  Subquery Scan on e  (cost=0.00..1.75 rows=1 width=0) (never executed)
                     ->  Limit  (cost=0.00..1.74 rows=1 width=0) (never executed)
                           ->  Index Scan using index_exp_detls_on_hts_code_id_and_data_month on exp_detls  (cost=0.00..90309.37 rows=51853 width=0) (never executed)
                                 Index Cond: (hts_code_id = 14670)
               ->  Subquery Scan on f  (cost=0.00..1.75 rows=1 width=0) (never executed)
                     ->  Limit  (cost=0.00..1.74 rows=1 width=0) (never executed)
                           ->  Index Scan using index_exp_detls_on_hts_code_id_and_data_month on exp_detls  (cost=0.00..84767.69 rows=48586 width=0) (never executed)
                                 Index Cond: (hts_code_id = 15291)
 Total runtime: 0.089 ms
(28 rows)


This is sub-millisecond for every combination of present and absent ids we've tried, so we'll definitely go with this.  Thanks for the help and explanations!
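The same one-at-a-time probing could also be packaged server-side; a minimal PL/pgSQL sketch (the function name is ours; each probe should use the index the same way the UNION ALL branches do):

CREATE OR REPLACE FUNCTION any_hts_code_present(codes integer[])
RETURNS boolean AS $$
DECLARE
  i integer;
BEGIN
  -- probe the codes one at a time, stopping at the first hit
  FOR i IN 1 .. coalesce(array_length(codes, 1), 0) LOOP
    PERFORM 1 FROM exp_detls WHERE hts_code_id = codes[i] LIMIT 1;
    IF FOUND THEN
      RETURN true;
    END IF;
  END LOOP;
  RETURN false;
END;
$$ LANGUAGE plpgsql;

-- usage:
SELECT any_hts_code_present(ARRAY[469169, 15289, 468137, 14655, 14670, 15291]);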

Tim

Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Timothy Garnett
Date:
Hi Sam,

The purpose of this (framework-generated) code is to find out whether there is at least one row that has one of the selected hts_code_ids.  We don't care about anything that's returned other than whether at least one row exists or not (rewriting the query with EXISTS generates the same plan).  The actual selectivity can vary greatly, anywhere from 0 to > 500k rows, depending on the codes chosen.  On the higher end of the range a select count(*) takes ~14 times longer than a LIMIT 1 query that does an index scan (~475 ms vs. 36 ms).

What we're doing for the moment is putting a straitjacket on the planner:

begin;
set local enable_seqscan = off;
SELECT "exp_detls".id FROM "exp_detls" WHERE ("exp_detls"."hts_code_id" IN (...)) LIMIT 1;
commit;

Even when the WHERE clause selects a good fraction of the table, a seq scan has highly variable performance because of the clustered index (it depends on what was last selected out of the table), so we pretty much never want to use one (maybe the planner should be taking the cluster into consideration?).  What we'll probably move to is maintaining a table of the ids present in that column and running the query against that, since the above, while acceptable, can still be on the slow side when the WHERE clause is not very selective and many rows are scanned in the index before the limit can be applied.  It would be nice if the bitmap index scan could be done piecemeal, alternating with the heap scan, when a tight limit is present so that less work has to be done, but I could see that being really problematic to implement and use.
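A minimal sketch of that side-table idea (the table and index names are ours; the table would need rebuilding whenever exp_detls is reloaded and re-clustered):

CREATE TABLE present_hts_codes AS
  SELECT DISTINCT hts_code_id FROM exp_detls;
CREATE UNIQUE INDEX present_hts_codes_idx ON present_hts_codes (hts_code_id);

-- the existence test then runs against the small table instead of the 36M-row one:
SELECT 1 FROM present_hts_codes WHERE hts_code_id IN (12, 654) LIMIT 1;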

Tim
