Обсуждение: Why does the query planner use two full indexes, when a dedicated partial index exists?

Поиск
Список
Период
Сортировка

Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:
Dear All,

I've just joined this list, and I'd like to request some advice.

I have a table (1 GB in size) with 24 columns, and 5.6 million rows. Of
these, we're interested in two columns, parcel_id_code, and exit_state.

    parcel_id_code has a fairly uniform distribution of integers
    from 1-99999, it's never null.

    exit_state has 3 possible values, 1,2 and null.
    Almost all the rows are 1, about 0.1% have the value 2, and
    only 153 rows are null


The query I'm trying to optimise looks like this:

    SELECT * from  tbl_tracker
    WHERE parcel_id_code='53030' AND exit_state IS NULL;

So, I have a partial index:

    "tbl_tracker_performance_1_idx" btree (parcel_id_code) WHERE
    exit_state IS NULL

which works fine if it's the only index.


BUT, for other queries (unrelated to this question), I also have to have
full indexes on these columns:

     "tbl_tracker_exit_state_idx" btree (exit_state)
     "tbl_tracker_parcel_id_code_idx" btree (parcel_id_code)


The problem is, when I now run my query, the planner ignores the
dedicated index "tbl_tracker_performance_1_idx", and instead uses both
of the full indexes... resulting in a much much slower query (9ms vs
0.08ms).

A psql session is below.  This shows that, if I force the planner to use
the partial index, by dropping the others, then it's fast. But as soon
as I put the full indexes back (which I need for other queries), the
query planner chooses them instead, and is slow.


Thanks very much for your help,

Richard










fsc_log => \d tbl_tracker

        Column        |           Type           |   Modifiers
---------------------+--------------------------+------------------
  id                  | bigint                   | not null default
nextval('master_id_seq'::regclass)
  dreq_timestamp_1    | timestamp with time zone |
  barcode_1           | character varying(13)    |
  barcode_2           | character varying(13)    |
  barcode_best        | character varying(13)    |
  entrance_point      | character varying(13)    |
  induct              | character varying(5)     |
  entrance_state_x    | integer                  |
  dreq_count          | integer                  |
  parcel_id_code      | integer                  |
  host_id_code        | bigint                   |
  original_dest       | integer                  |
  drep_timestamp_n    | timestamp with time zone |
  actual_dest         | integer                  |
  exit_state          | integer                  |
  chute               | integer                  |
  original_dest_state | integer                  |
  srep_timestamp      | timestamp with time zone |
  asn                 | character varying(9)     |
  is_asn_token        | boolean                  |
  track_state         | integer                  |
  warning             | boolean                  |
Indexes:
     "tbl_tracker_pkey" PRIMARY KEY, btree (id) CLUSTER
     "tbl_tracker_barcode_best_idx" btree (barcode_best)
     "tbl_tracker_chute_idx" btree (chute)
     "tbl_tracker_drep_timestamp_n_idx" btree (drep_timestamp_n) WHERE
drep_timestamp_n IS NOT NULL
     "tbl_tracker_dreq_timestamp_1_idx" btree (dreq_timestamp_1) WHERE
dreq_timestamp_1 IS NOT NULL
     "tbl_tracker_exit_state_idx" btree (exit_state)
     "tbl_tracker_parcel_id_code_idx" btree (parcel_id_code)
     "tbl_tracker_performance_1_idx" btree (parcel_id_code) WHERE
exit_state IS NULL
     "tbl_tracker_performance_2_idx" btree (host_id_code, id)
     "tbl_tracker_performance_3_idx" btree (srep_timestamp) WHERE
exit_state = 1 AND srep_timestamp IS NOT NULL
     "tbl_tracker_srep_timestamp_idx" btree (srep_timestamp) WHERE
srep_timestamp IS NOT NULL




fsc_log=> explain analyse select * from  tbl_tracker where
parcel_id_code='53030' AND exit_state IS NULL;

QUERY  PLAN
-----------------------------------------------------------------------
  Bitmap Heap Scan on tbl_tracker  (cost=8.32..10.84 rows=1 width=174)
(actual time=9.334..9.334 rows=0 loops=1)
    Recheck Cond: ((parcel_id_code = 53030) AND (exit_state IS NULL))
    ->  BitmapAnd  (cost=8.32..8.32 rows=1 width=0) (actual
time=9.329..9.329 rows=0 loops=1)
          ->  Bitmap Index Scan on tbl_tracker_parcel_id_code_idx
(cost=0.00..3.67 rows=57 width=0) (actual time=0.026..0.026 rows=65 loops=1)
                Index Cond: (parcel_id_code = 53030)
          ->  Bitmap Index Scan on tbl_tracker_exit_state_idx
(cost=0.00..4.40 rows=150 width=0) (actual time=9.289..9.289 rows=93744
loops=1)
                Index Cond: (exit_state IS NULL)
  Total runtime: 9.366 ms
(8 rows)



fsc_log=> drop index tbl_tracker_exit_state_idx;
DROP INDEX

fsc_log=> explain analyse select * from  tbl_tracker where
parcel_id_code='53030' AND exit_state IS NULL;

QUERY  PLAN
----------------------------------------------------------------------------------------
  Bitmap Heap Scan on tbl_tracker  (cost=3.67..145.16 rows=1 width=174)
(actual time=0.646..0.646 rows=0 loops=1)
    Recheck Cond: (parcel_id_code = 53030)
    Filter: (exit_state IS NULL)
    ->  Bitmap Index Scan on tbl_tracker_parcel_id_code_idx
(cost=0.00..3.67 rows=57 width=0) (actual time=0.024..0.024 rows=65 loops=1)
          Index Cond: (parcel_id_code = 53030)
  Total runtime: 0.677 ms
(6 rows)




fsc_log=> drop index tbl_tracker_parcel_id_code_idx;
DROP INDEX

fsc_log=> explain analyse select * from  tbl_tracker where
parcel_id_code='53030' AND exit_state IS NULL;

QUERY PLAN
--------------------------------------------------------------------------
  Index Scan using tbl_tracker_performance_1_idx on tbl_tracker
(cost=0.00..5440.83 rows=1 width=174) (actual time=0.052..0.052 rows=0
loops=1)
    Index Cond: (parcel_id_code = 53030)
  Total runtime: 0.080 ms
(3 rows)



Server hardware: 8 core, 2.5 GHz, 24 GB, SSD in RAID-1.

Postgresql config (non-default):

  version                     | PostgreSQL 9.1.6 on x86_64
  checkpoint_segments         | 128
  client_encoding             | UTF8
  commit_delay                | 50000
  commit_siblings             | 5
  default_statistics_target   | 5000
  effective_cache_size        | 12000MB
  lc_collate                  | en_GB.UTF-8
  lc_ctype                    | en_GB.UTF-8
  log_line_prefix             | %t
  log_min_duration_statement  | 50
  maintenance_work_mem        | 2GB
  max_connections             | 100
  max_stack_depth             | 4MB
  port                        | 5432
  random_page_cost            | 2.5
  server_encoding             | UTF8
  shared_buffers              | 6000MB
  ssl                         | on
  standard_conforming_strings | off
  temp_buffers                | 128MB
  TimeZone                    | GB
  wal_buffers                 | 16MB
  work_mem                    | 256MB





Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Sergey Konoplev
Дата:
Hi,

On Wed, Dec 19, 2012 at 1:13 PM, Richard Neill <rn214@richardneill.org> wrote:
>  Index Scan using tbl_tracker_performance_1_idx on tbl_tracker
> (cost=0.00..5440.83 rows=1 width=174) (actual time=0.052..0.052 rows=0
> loops=1)
>    Index Cond: (parcel_id_code = 53030)

It looks like your index is bloated. Have you had a lot of
updates/deletes on rows with exit_state is null?

Try to reindex tbl_tracker_performance_1_idx.

To reindex it without locks create a new index with temporary name
concurrently, delete the old one and rename the new one using the old
name.

--
Sergey Konoplev
Database and Software Architect
http://www.linkedin.com/in/grayhemp

Phones:
USA +1 415 867 9984
Russia, Moscow +7 901 903 0499
Russia, Krasnodar +7 988 888 1979

Skype: gray-hemp
Jabber: gray.ru@gmail.com


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:

On 19/12/12 22:59, Sergey Konoplev wrote:

> On Wed, Dec 19, 2012 at 1:13 PM, Richard Neill <rn214@richardneill.org> wrote:
>>   Index Scan using tbl_tracker_performance_1_idx on tbl_tracker
>> (cost=0.00..5440.83 rows=1 width=174) (actual time=0.052..0.052 rows=0
>> loops=1)
>>     Index Cond: (parcel_id_code = 53030)
>
> It looks like your index is bloated. Have you had a lot of
> updates/deletes on rows with exit_state is null?
>
> Try to reindex tbl_tracker_performance_1_idx.
>
> To reindex it without locks create a new index with temporary name
> concurrently, delete the old one and rename the new one using the old
> name.
>

Hi Sergey,

Thanks for your suggestion. Yes, I can see what you mean: over the 3
weeks during which we deployed the system, every single row has at one
point had the exit_state as null, before being updated.

Essentially, as time moves on, new rows are added, initially with
exit_state null, then a few minutes later we update them to exit_state
1, then a few weeks later, they are removed.

[Explanation: the system tracks books around a physical sortation
machine; the sorter uses a "parcel_id_code" which (for some really daft
reason suffers wraparound at 99999, i.e. about every 3 hours), books
whose exit_state is null are those which are still on the sortation
machine; once they exit, the state is either 1 (successful delivery) or
2 (collision, and down the dump chute).]

BUT....

* The reindex solution doesn't work. I just tried it, and the query
planner is still using the wrong indexes.

* If the tbl_tracker_performance_1_idx had indeed become bloated,
wouldn't that have meant that when the query planner was forced to use
it (by deleting the alternative indexes), it would have been slow?

Also, I thought that reindex wasn't supposed to be needed in normal
operation.

Best wishes,

Richard




Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Sergey Konoplev
Дата:
On Wed, Dec 19, 2012 at 3:49 PM, Richard Neill <rn214@richardneill.org> wrote:
> * The reindex solution doesn't work. I just tried it, and the query planner
> is still using the wrong indexes.

Can you show the explain analyze with tbl_tracker_performance_1_idx
straight after reindex (eg. before it has been bloated again)?

> * If the tbl_tracker_performance_1_idx had indeed become bloated, wouldn't
> that have meant that when the query planner was forced to use it (by
> deleting the alternative indexes), it would have been slow?

It is hard to say. There might be a bloating threshold after with it
will be slow. Also it depends on the index column values.

--
Sergey Konoplev
Database and Software Architect
http://www.linkedin.com/in/grayhemp

Phones:
USA +1 415 867 9984
Russia, Moscow +7 901 903 0499
Russia, Krasnodar +7 988 888 1979

Skype: gray-hemp
Jabber: gray.ru@gmail.com


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:
Thanks for your help,

On 20/12/12 00:08, Sergey Konoplev wrote:
> On Wed, Dec 19, 2012 at 3:49 PM, Richard Neill <rn214@richardneill.org> wrote:
>> * The reindex solution doesn't work. I just tried it, and the query planner
>> is still using the wrong indexes.
>
> Can you show the explain analyze with tbl_tracker_performance_1_idx
> straight after reindex (eg. before it has been bloated again)?

Sure. Just done it now... the system has been fairly lightly loaded for
the last few hours - though I did have to change the specific number of
the parcel_id_code in the query.



fsc_log=> explain analyse select * from tbl_tracker where
parcel_id_code=92223 and exit_state is null;

QUERY PLAN
-----------------------------------------------------------
  Index Scan using tbl_tracker_exit_state_idx on tbl_tracker
(cost=0.00..6.34 rows=1 width=174) (actual time=0.321..1.871 rows=1 loops=1)
    Index Cond: (exit_state IS NULL)
    Filter: (parcel_id_code = 92223)
  Total runtime: 1.905 ms
(4 rows)



And now, force it, by dropping the other index (temporarily):

fsc_log=> drop index tbl_tracker_exit_state_idx;
DROP INDEX


fsc_log=> explain analyse select * from tbl_tracker where
parcel_id_code=92223 and exit_state is null;

QUERY PLAN
---------------------------------------------------------------------
  Index Scan using tbl_tracker_performance_1_idx on tbl_tracker
(cost=0.00..7.78 rows=1 width=174) (actual time=0.040..0.041 rows=1 loops=1)
    Index Cond: (parcel_id_code = 92223)
  Total runtime: 0.077 ms
(3 rows)



As far as I can tell, the query planner really is just getting it wrong.

BTW, there is a significant effect on speed caused by running the same
query twice (it pulls stuff from disk into the OS disk-cache), but I've
already accounted for this.


Richard



Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Tom Lane
Дата:
Richard Neill <rn214@richardneill.org> writes:
> The problem is, when I now run my query, the planner ignores the
> dedicated index "tbl_tracker_performance_1_idx", and instead uses both
> of the full indexes... resulting in a much much slower query (9ms vs
> 0.08ms).

> A psql session is below.  This shows that, if I force the planner to use
> the partial index, by dropping the others, then it's fast. But as soon
> as I put the full indexes back (which I need for other queries), the
> query planner chooses them instead, and is slow.

[ experiments with a similar test case ... ]  I think the reason why the
planner is overestimating the cost of using the partial index is that
9.1 and earlier fail to account for the partial-index predicate when
estimating the number of index rows that will be visited.  Because the
partial-index predicate is so highly selective in this case, that
results in a significant overestimate of how much of the index will be
traversed.

We fixed this for 9.2 in
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=21a39de5809cd3050a37d2554323cc1d0cbeed9d
but did not want to risk back-patching such a behavioral change.  If
you're stuck on 9.1 you might want to think about applying that as a
local patch though.

(BTW, the "fudge factor" change in that patch has been criticized
recently; we've changed it again already for 9.3 and might choose to
back-patch that into 9.2.3.  But it's the rest of it that you care about
anyway.)

            regards, tom lane


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Jeff Janes
Дата:
On Wednesday, December 19, 2012, Richard Neill wrote:
Thanks for your help,

On 20/12/12 00:08, Sergey Konoplev wrote:
On Wed, Dec 19, 2012 at 3:49 PM, Richard Neill <rn214@richardneill.org> wrote:
* The reindex solution doesn't work. I just tried it, and the query planner
is still using the wrong indexes.


It switched to a better one of the wrong indices, though, and got several times faster.

How did it get so bloated in the first place?  Is the table being updated so rapidly that the statistics might be wrong even immediately after analyze finishes?

In any case, I can't get it to prefer the full index in 9.1.6 at all.  The partial index wins hands down unless the table is physically clustered by the parcel_id_code column.  In which that case, the partial index wins by only a little bit.

This is what I did for the table:

create table tbl_tracker as select case when random()<0.001 then 2 else case when random()< 0.00003 then NULL else 1 end end as exit_state, (random()*99999)::int as parcel_id_code from generate_series(1,5000000) ;

Cheers,

Jeff


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:
Dear Tom,

Thanks very much for your advice.

>> A psql session is below.  This shows that, if I force the planner to use
>> the partial index, by dropping the others, then it's fast. But as soon
>> as I put the full indexes back (which I need for other queries), the
>> query planner chooses them instead, and is slow.
>
> [ experiments with a similar test case ... ]  I think the reason why the
> planner is overestimating the cost of using the partial index is that
> 9.1 and earlier fail to account for the partial-index predicate when
> estimating the number of index rows that will be visited.  Because the
> partial-index predicate is so highly selective in this case, that
> results in a significant overestimate of how much of the index will be
> traversed.

I think that seems likely to me.

I'll try out 9.2 and see if it helps. As it's a production server, I
have to wait for some downtime, probably Friday night before I can find
out - will report back.

Best wishes,

Richard


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:
Dear Jeff,

Thanks for your help,

>             * The reindex solution doesn't work. I just tried it, and
>             the query planner
>             is still using the wrong indexes.
>
>
> It switched to a better one of the wrong indices, though, and got
> several times faster.
>

I think that this is a red herring. The switching between the two
"wrong" indices seems to be caused by non-uniformity in the
parcel_id_code: although it's distributed fairly well across 1-99999,
it's not perfect.

As for the speed-up, I think that's mostly caused by the fact that
running "Analyse" is pulling the entire table (and the relevant index)
into RAM and flushing other things out of that cache.

> How did it get so bloated in the first place?  Is the table being
> updated so rapidly that the statistics might be wrong even immediately
> after analyze finishes?

I don't think it is. We're doing about 10 inserts and 20 updates per
second on that table. But when I tested it, production had stopped for
the night - so the system was quiescent between the analyse and the select.

> In any case, I can't get it to prefer the full index in 9.1.6 at all.
>   The partial index wins hands down unless the table is physically
> clustered by the parcel_id_code column.  In which that case, the partial
> index wins by only a little bit.

Interesting that you should say that... the original setup script did
choose to cluster the table on that column.

Also, I wonder whether it matters which order the indexes are created in?


Best wishes,

Richard


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Tom Lane
Дата:
Richard Neill <rn214@richardneill.org> writes:
> Also, I wonder whether it matters which order the indexes are created in?

IIRC, if the estimated costs of using two different indexes come out the
same (to within 1% or so), then the planner keeps the first-generated
path, which will result in preferring the index with smaller OID.  This
effect doesn't apply to your problem query though, since we can see from
the drop-experiments that the estimated costs are quite a bit different.

A more likely explanation if you see some effect that looks like order
dependency is that the more recently created index has accumulated less
bloat, and thus has a perfectly justifiable cost advantage.

            regards, tom lane


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Tom Lane
Дата:
Jeff Janes <jeff.janes@gmail.com> writes:
> In any case, I can't get it to prefer the full index in 9.1.6 at all.  The
> partial index wins hands down unless the table is physically clustered by
> the parcel_id_code column.  In which that case, the partial index wins by
> only a little bit.

> This is what I did for the table:

> create table tbl_tracker as select case when random()<0.001 then 2 else
> case when random()< 0.00003 then NULL else 1 end end as exit_state,
> (random()*99999)::int as parcel_id_code from generate_series(1,5000000) ;

What I did to try to duplicate Richard's situation was to create a test
table in which all the exit_state values were NULL, then build the
index, then UPDATE all but a small random fraction of the rows to 1,
then vacuum.  This results in a rather bloated partial index, but I
think that's probably what he's got given that every record initially
enters the table with NULL exit_state.  It would take extremely frequent
vacuuming to keep the partial index from accumulating a lot of dead
entries.

In this scenario, with 9.1, I got overly large estimates for the cost of
using the partial index; which matches up with his reports.

            regards, tom lane


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:
Dear Tom,

Thanks againg for your help on this.

On 20/12/12 03:06, Tom Lane wrote:
> Richard Neill <rn214@richardneill.org> writes:
>> The problem is, when I now run my query, the planner ignores the
>> dedicated index "tbl_tracker_performance_1_idx", and instead uses both
>> of the full indexes... resulting in a much much slower query (9ms vs
>> 0.08ms).
>

I've now installed 9.2. As you said, thanks to the change in 9.2 it
initially prefers the partial index.

BUT, after 1 cycle of inserting 500k rows, then deleting them all, then
starting to insert again, I find that the planner has reverted to the
former bad behaviour.

Reindexing only takes a couple of seconds, and restores correctness.

What's going on? Do I need to run reindex in a cron-job? I thought that
reindex wasn't "normally" needed, and that index bloat happened only
after every row had changed value hundreds of times.

Thanks,

Richard


---------------------
Here's the same session again.

[Please ignore the dreq_1_timestamp check - I mistakenly failed to
simplify it out of the query, and now that I reindexed, I can't redo the
experiment. I don't think it makes any difference.]


fsc_log=> explain analyse select * from tbl_tracker WHERE
parcel_id_code='90820' AND exit_state IS NULL AND (dreq_timestamp_1 >
timestamp '2012-12-20 13:02:36.652' - INTERVAL '36 hours');

QUERY PLAN
---------------------------------------------------------------
  Bitmap Heap Scan on tbl_tracker  (cost=17.35..19.86 rows=1 width=174)
(actual time=8.056..8.056 rows=0 loops=1)
    Recheck Cond: ((exit_state IS NULL) AND (parcel_id_code = 90820))
    Filter: (dreq_timestamp_1 > '2012-12-19 01:02:36.652'::timestamp
without time zone)
    ->  BitmapAnd  (cost=17.35..17.35 rows=1 width=0) (actual
time=8.053..8.053 rows=0 loops=1)
          ->  Bitmap Index Scan on tbl_tracker_exit_state_idx
(cost=0.00..8.36 rows=151 width=0) (actual time=7.946..7.946 rows=20277
loops=1)
                Index Cond: (exit_state IS NULL)
          ->  Bitmap Index Scan on tbl_tracker_parcel_id_code_idx
(cost=0.00..8.73 rows=58 width=0) (actual time=0.025..0.025 rows=72 loops=1)
                Index Cond: (parcel_id_code = 90820)
  Total runtime: 8.090 ms
(9 rows)


fsc_log=> REINDEX index tbl_tracker_performance_1_idx;
#This only took a couple of seconds to do.

fsc_log=> explain analyse select * from tbl_tracker WHERE
parcel_id_code='90820' AND exit_state IS NULL AND (dreq_timestamp_1 >
timestamp '2012-12-20 13:02:36.652' - INTERVAL '36 hours');

QUERY PLAN
---------------------------------------------------------------


  Index Scan using tbl_tracker_performance_1_idx on tbl_tracker
(cost=0.00..5.27 rows=1 width=174) (actual time=0.019..0.019 rows=0 loops=1)
    Index Cond: (parcel_id_code = 90820)
    Filter: (dreq_timestamp_1 > '2012-12-19 01:02:36.652'::timestamp
without time zone)
  Total runtime: 0.047 ms
(4 rows)





Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:

On 21/12/12 02:34, Richard Neill wrote:
>
> Reindexing only takes a couple of seconds, and restores correctness.
>

Interestingly, the partial index (after reindexing) is only 16kB in
size; whereas the table is 1.1 GB, and the normal single-column indexes
are about 250MB in size.

In terms of what's physically happening in reality,

- tbl_tracker keeps a record of all books that move through the system
   over a period of one month (at a rate of about 20/second, or 1
   million/day), after which they are deleted.

- the partial index, tbl_tracker_performance_1_idx tracks only those
   books which are currently "in flight" - books remain in flight for
   about 200 seconds as they go round the machine.
   (While in flight, these have exit_state = NULL)

- the partial index is used to overcome a design defect(*) in the
   sorter machine, namely that it doesn't number each book uniquely,
   but wraps the parcel_id_code every few hours. Worse, some books can
   remain on the sorter for several days (if they jam), so the numbering
   isn't a clean "wraparound", but more like a fragmented (and
   occasionally lossy) filesystem.

- What I'm trying to do is trace the history of the books
   through the system and assign each one a proper unique id.
   So, if I see a book with "parcel_id_code = 37",
   is it a new book (after pid wrap), or is it the same book I saw 1
   minute ago, that hasn't exited the sorter?


So... is there some way to, for example, set a trigger that will reindex
every time the index exceeds 1000 rows?


Richard



(*)Readers of The Daily WTF might appreciate another curious anomaly:
this machine originally had an RS-232 port; it now uses ethernet, but
TxD and RxD use different TCP sockets on different network ports!


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Jeff Janes
Дата:
On Thursday, December 20, 2012, Tom Lane wrote:
Jeff Janes <jeff.janes@gmail.com> writes:
> In any case, I can't get it to prefer the full index in 9.1.6 at all.  The
> partial index wins hands down unless the table is physically clustered by
> the parcel_id_code column.  In which that case, the partial index wins by
> only a little bit.

> This is what I did for the table:

> create table tbl_tracker as select case when random()<0.001 then 2 else
> case when random()< 0.00003 then NULL else 1 end end as exit_state,
> (random()*99999)::int as parcel_id_code from generate_series(1,5000000) ;

What I did to try to duplicate Richard's situation was to create a test
table in which all the exit_state values were NULL, then build the
index, then UPDATE all but a small random fraction of the rows to 1,
then vacuum.  This results in a rather bloated partial index, but I
think that's probably what he's got given that every record initially
enters the table with NULL exit_state.  It would take extremely frequent
vacuuming to keep the partial index from accumulating a lot of dead
entries.

I played with this scenario too, but still the only way I could get it to spurn the partial index is if I rebuild the full one (to remove the bloat) but not the partial one.  But still, the cost were always in the 8 to 11 range for either index with default cost settings.  It is hard to imagine the amount of bloat needed to drive it up to 5000, like in his initial report before he rebuilt it.

Cheers,

Jeff

Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Jeff Janes
Дата:
On Thursday, December 20, 2012, Richard Neill wrote:
Dear Tom,

Thanks againg for your help on this.

On 20/12/12 03:06, Tom Lane wrote:
Richard Neill <rn214@richardneill.org> writes:
The problem is, when I now run my query, the planner ignores the
dedicated index "tbl_tracker_performance_1_idx", and instead uses both
of the full indexes... resulting in a much much slower query (9ms vs
0.08ms).


I've now installed 9.2. As you said, thanks to the change in 9.2 it initially prefers the partial index.

BUT, after 1 cycle of inserting 500k rows, then deleting them all, then starting to insert again, I find that the planner has reverted to the former bad behaviour.

Presumably the real work load has this type of turn over happen one row at a time, rather than all in one giant mass update transaction, right?  That makes a big difference in the way space is re-used. 
 

Reindexing only takes a couple of seconds, and restores correctness.

Even your slow query is pretty fast.  If you can't afford that, can you afford to take an exclusive lock for a couple of seconds every few minutes?
 

What's going on? Do I need to run reindex in a cron-job? I thought that reindex wasn't "normally" needed, and that index bloat happened only after every row had changed value hundreds of times.

The partial index is highly leveraged.  If every tuple in the table is updated once, that amounts to every tuple in the index being updated 25,000 times.

For the same reason, it is probably not getting vacuum often enough.  The default settings have the table vacuumed once 20% of its rows turns over, but that means the partial index has been turned over many many times.  You could crank down the auto-vacuum settings for that table, or run manual vacuum with a cron job.

Vacuum will not unbloat the index, but if you run it often enough it will keep the bloat from getting to bad in the first place.

But what I think I'd do is change one of your full indexes to contain the other column as well, and get rid of the partial index.  It might not be quite as efficient as the partial index might theoretically be, but it should be pretty good and also be less fragile.
 


         ->  Bitmap Index Scan on tbl_tracker_exit_state_idx (cost=0.00..8.36 rows=151 width=0) (actual time=7.946..7.946 rows=20277 loops=1)

This is finding 100 times more rows than it thinks it will.  If that could be fixed, surely this plan would not look as good.  But then, it would probably just switch to another plan that is not the one you want, either.


Cheers,

Jeff


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Jeff Janes
Дата:


On Thursday, December 20, 2012, Richard Neill wrote:



- What I'm trying to do is trace the history of the books
  through the system and assign each one a proper unique id.
  So, if I see a book with "parcel_id_code = 37",
  is it a new book (after pid wrap), or is it the same book I saw 1
  minute ago, that hasn't exited the sorter?

I'm not sure how you are implementing this goal, but I don't think it is best done by looping over all books (presumably from some other table?) and issuing an individual query for each one, if that is what you are doing.  Some kind of bulk join would probably be more efficient.

Cheers,

Jeff

Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:
>     I've now installed 9.2. As you said, thanks to the change in 9.2 it
>     initially prefers the partial index.
>
>     BUT, after 1 cycle of inserting 500k rows, then deleting them all,
>     then starting to insert again, I find that the planner has reverted
>     to the former bad behaviour.
>
>
> Presumably the real work load has this type of turn over happen one row
> at a time, rather than all in one giant mass update transaction, right?
>   That makes a big difference in the way space is re-used.

Sorry - I meant a "real" workload here. I replayed a whole day's worth
of real data into the DB, and that's what I meant by a cycle. Everything
was row-at-a-time.
(It currently takes about an hour to do this)

>
>     Reindexing only takes a couple of seconds, and restores correctness.
>
>
> Even your slow query is pretty fast.  If you can't afford that, can you
> afford to take an exclusive lock for a couple of seconds every few minutes?

Yes, I can. If that's the root cause, I'll do that. But it seems to me
that I've stumbled upon some rather awkward behaviour that I need to
understand fully, and if the index is bloating that badly and that
quickly, then perhaps it's a PG bug (or at least cause for a logfile
warning).

BTW, The index has gone from 16kB to 4.5MB in 6 hours of runtime today.
It still only has 252 matching rows.


>     What's going on? Do I need to run reindex in a cron-job? I thought
>     that reindex wasn't "normally" needed, and that index bloat happened
>     only after every row had changed value hundreds of times.
>
>
> The partial index is highly leveraged.  If every tuple in the table is
> updated once, that amounts to every tuple in the index being updated
> 25,000 times.

How so? That sounds like O(n_2) behaviour.


>
> For the same reason, it is probably not getting vacuum often enough.
>   The default settings have the table vacuumed once 20% of its rows
> turns over, but that means the partial index has been turned over many
> many times.  You could crank down the auto-vacuum settings for that
> table, or run manual vacuum with a cron job.
>
> Vacuum will not unbloat the index, but if you run it often enough it
> will keep the bloat from getting too bad in the first place.

Thanks. I've reduced  autovacuum_vacuum_scale_factor from 0.2 to 0.05
(and set autovacuum_analyze_scale_factor = 0.05 for good measure)

As I understand it, both of these can run in parallel, and I have 7
cores usually idle, while the other is maxed out.

> But what I think I'd do is change one of your full indexes to contain
> the other column as well, and get rid of the partial index.  It might
> not be quite as efficient as the partial index might theoretically be,
> but it should be pretty good and also be less fragile.

I'll try that.

Thanks,

Richard


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:

On 21/12/12 05:15, Jeff Janes wrote:
>
>
>     - What I'm trying to do is trace the history of the books
>        through the system and assign each one a proper unique id.
>        So, if I see a book with "parcel_id_code = 37",
>        is it a new book (after pid wrap), or is it the same book I saw 1
>        minute ago, that hasn't exited the sorter?
>
> I'm not sure how you are implementing this goal, but I don't think it is
> best done by looping over all books (presumably from some other table?)
> and issuing an individual query for each one, if that is what you are
> doing.  Some kind of bulk join would probably be more efficient.

It would be nice to do a bulk join, but it's not possible: the query is
time sensitive. Consider:

id/pkey pid      timestamp   exit_state       destination

1       77    -24 hours    1        212
2    77    -18 hours    1        213
3    77    -12 hours    1        45
4    77    -6 hours    1        443
5    77    0 hours        null

[in future...]
5    77    0 hours        1        92
6    77    4 hours        null


At time +5 minutes, I receive a report that a book with parcel_id 77 has
successfully been delivered to destination 92.  So, what I have to do is:

* First, find the id of the most recent book which had pid=77 and where
the exit state is null. (hopefully, but not always, this yields exactly
one row, which in this case is id=5)

* Then update the table to set the destination to 92, where the id=5.


It's a rather cursed query, because:
  - the sorter machine doesn't give me full info in each message, only
    deltas, and I have to reconstruct the global state.
  - pids are reused within hours, but don't increase monotonically,
    (more like drawing repeatedly from a shuffled deck, where cards
    are only returned to the deck sporadically.
  - some pids get double-reported
  - 1% of books fall off the machine, or get stuck on it.
  - occasionally, messages are lost.
  - the sorter state isn't self-consistent (it can be restarted)


The tracker table is my attempt to consistently combine all the state we
know, and merge in the deltas as we receive messages from the sorter
machine. It ends up reflecting reality about 99% of the time.


Richard






Re: Why does the query planner use two full indexes, when a dedicated partial index exists? (solved?)

От
Richard Neill
Дата:
Dear All,

I think periodic reindex may be the solution. Even after reducing the
autovacuum fraction to 0.05, the index still seems to bloat.

After another couple of days runtime, the index is using 11MB, and
I get a query that takes 2.448ms. Then I reindex (takes about 3 sec),
and the index falls to 16kB, and the query takes 0.035ms.

So... problem solved for me: I just have to reindex every few hours.
BUT, this suggests a few remaining things:


1. The documentation still suggests that reindex should not be needed in
"normal" operation...  is this true? Or are the docs wrong? Or have I
got such an edge case? Does this suggest that an auto-reindexer would be
a useful feature?


2. Is there any way to force the planner to use (or ignore) a specific
index, for testing purposes, short of actually dropping the index?
This would be very useful for debugging, especially given that query
plans can only really be fully tested on production systems, and that
dropping indexes is rather a bad thing to do when live operation is
simultaneously happening on that server!

Thanks again for your help.

Best wishes,

Richard





fsc_log=> explain analyse select * from tbl_tracker WHERE
parcel_id_code='32453' AND exit_state IS NULL;

QUERY PLAN
------------------------------------------------------
  Bitmap Heap Scan on tbl_tracker  (cost=20.81..23.32 rows=1 width=174)
(actual time=2.408..2.408 rows=1 loops=1)
    Recheck Cond: ((exit_state IS NULL) AND (parcel_id_code = 32453))
    ->  BitmapAnd  (cost=20.81..20.81 rows=1 width=0) (actual
time=2.403..2.403 rows=0 loops=1)
          ->  Bitmap Index Scan on tbl_tracker_exit_state_idx
(cost=0.00..9.25 rows=132 width=0) (actual time=2.378..2.378 rows=5 loops=1)
                Index Cond: (exit_state IS NULL)
          ->  Bitmap Index Scan on tbl_tracker_parcel_id_code_idx
(cost=0.00..11.30 rows=62 width=0) (actual time=0.022..0.022 rows=65
loops=1)
                Index Cond: (parcel_id_code = 32453)
  Total runtime: 2.448 ms


fsc_log => REINDEX;


fsc_log=> explain analyse select * from tbl_tracker WHERE
parcel_id_code='32453' AND exit_state IS NULL;

QUERY PLAN
-------------------------------------------------
Index Scan using tbl_tracker_performance_1_idx on tbl_tracker
(cost=0.00..5.27 rows=1 width=174) (actual time=0.007..0.008 rows=1 loops=1)
    Index Cond: (parcel_id_code = 32453)
  Total runtime: 0.035 ms
(3 rows)









Re: Why does the query planner use two full indexes, when a dedicated partial index exists? (solved?)

От
John Rouillard
Дата:
On Mon, Dec 24, 2012 at 06:37:11PM +0000, Richard Neill wrote:
> [...]
> So... problem solved for me: I just have to reindex every few hours.
> BUT, this suggests a few remaining things:
> [...]
> 2. Is there any way to force the planner to use (or ignore) a
> specific index, for testing purposes, short of actually dropping the
> index?
> This would be very useful for debugging, especially given that query
> plans can only really be fully tested on production systems, and
> that dropping indexes is rather a bad thing to do when live
> operation is simultaneously happening on that server!

I believe that:

  BEGIN;
  drop index ....
  explain analyze ...
  explain analyze ...
  ROLLBACK;

will do what you want. IIUC Postgres allows DDL within transactions
and thus be rolled back and the operations within the transaction
aren't visible to your other queries running outside the transaction.

  http://wiki.postgresql.org/wiki/Transactional_DDL_in_PostgreSQL:_A_Competitive_Analysis

and

  http://www.postgresql.org/docs/9.2/static/sql-dropindex.html

--
                -- rouilj

John Rouillard       System Administrator
Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Jeff Janes
Дата:
On Saturday, December 22, 2012, Richard Neill wrote:


The partial index is highly leveraged.  If every tuple in the table is
updated once, that amounts to every tuple in the index being updated
25,000 times.

How so? That sounds like O(n_2) behaviour.


If the table has 5 million rows while the index has 200 (active) rows at any given time, then to update every row in the table to null and back again would be 100% turn over of the table.  But each such change would lead to an addition and then a deletion from the index.  So 100% turnover of the table would be a 5 million / 200 = 25,000 fold turn of the index.

There is some code that allows a btree index entry to get killed (and so the slot to be reused) without any vacuum, if a scan follows that entry and finds the corresponding tuple in the table no longer visible to anyone.  I have not examined this code, and don't know whether it is doing its job but just isn't enough to prevent the bloat, or if for some reason it is not applicable to your situation.


 Cheers,

Jeff
On Monday, December 24, 2012, John Rouillard wrote:
On Mon, Dec 24, 2012 at 06:37:11PM +0000, Richard Neill wrote:
> [...]
> So... problem solved for me: I just have to reindex every few hours.
> BUT, this suggests a few remaining things:
> [...]
> 2. Is there any way to force the planner to use (or ignore) a
> specific index, for testing purposes, short of actually dropping the
> index?
> This would be very useful for debugging, especially given that query
> plans can only really be fully tested on production systems, and
> that dropping indexes is rather a bad thing to do when live
> operation is simultaneously happening on that server!

I believe that:

  BEGIN;
  drop index ....
  explain analyze ...
  explain analyze ...
  ROLLBACK;

There are two cautions here.  One is that doing the drop index takes an access exclusive lock on the table, and so brings all other connections to a screeching halt.  That is not much nicer to do on a production system than actually dropping the index, so don't dilly-dally around before doing the rollback.  rollback first, then ruminate on the results of the explain.

Also, this will forcibly cancel any autovacuums occurring on the table.  I think one of the reasons he needs to reindex so much is that he is already desperately short of vacuuming behavior.

Cheers,

Jeff

On Monday, December 24, 2012, Richard Neill wrote:
Dear All,

I think periodic reindex may be the solution. Even after reducing the autovacuum fraction to 0.05, the index still seems to bloat.


Since your index is so tiny compared to the table, I'd lower it even more. 0.0001, maybe.

However, vacuums can't overlap a table, so you can't get the table to be vacuumed more often than the time it takes to run the vacuum (which took around 4 minutes at default vacuum cost settings in a toy system) and so you may need to lower the autovacuum_vacuum_cost_delay to 0 for this table.  (I suspect it is all in cache anyway, so the default settings are too pessimistic.)  Finally, you might need to lower autovacuum_naptime, because the usually table also won't be vacuumed any more often than that.

 

1. The documentation still suggests that reindex should not be needed in "normal" operation...  is this true? Or are the docs wrong? Or have I got such an edge case?

Your case seems pretty far out there to me.  
 
Cheers,

Jeff

Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:
>         The partial index is highly leveraged.  If every tuple in the
>         table is updated once, that amounts to every tuple in the index
>         being updated 25,000 times.
>
>     How so? That sounds like O(n_2) behaviour.
>
> If the table has 5 million rows while the index has 200 (active) rows at
> any given time, then to update every row in the table to null and back
> again would be 100% turn over of the table.  But each such change would
> lead to an addition and then a deletion from the index.  So 100%
> turnover of the table would be a 5 million / 200 = 25,000 fold turn of
> the index.

Sorry, I was being dense. I misread that as:
    "every time a single tuple in the table is updated, the entire index
     (every row) is updated".
Yes, of course your explanation makes sense.

>
> There is some code that allows a btree index entry to get killed (and so
> the slot to be reused) without any vacuum, if a scan follows that entry
> and finds the corresponding tuple in the table no longer visible to
> anyone.  I have not examined this code, and don't know whether it is
> doing its job but just isn't enough to prevent the bloat, or if for some
> reason it is not applicable to your situation.
>

It looks like my solution is going to be a REINDEX invoked from cron, or
maybe just every 100k inserts.


In terms of trying to improve this behaviour for other PG users in the
future, are there any more diagnostics I can do for you? Having found a
special case, I'd like to help permanently resolve it if I can.


Thanks very much again.

Best wishes,

Richard







Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Jeff Janes
Дата:
On Thursday, December 20, 2012, Jeff Janes wrote:
On Thursday, December 20, 2012, Richard Neill wrote:


         ->  Bitmap Index Scan on tbl_tracker_exit_state_idx (cost=0.00..8.36 rows=151 width=0) (actual time=7.946..7.946 rows=20277 loops=1)

This is finding 100 times more rows than it thinks it will.  If that could be fixed, surely this plan would not look as good.  But then, it would probably just switch to another plan that is not the one you want, either.

I guess the issue here is that the histogram postgres uses to estimate the number of rows that will be found is based on visible rows, and it is correctly estimating the number of visible rows that will be found.  And that is the relevant thing to pass up to a higher join for its estimation.  But for estimating the number of blocks a given index scan will access, the right thing would be the number of tuples visited, not the number of them found to be visible.  So that is where this plan goes systematically wrong.  

I guess the correct thing would be for postgres to keep two histograms, one of all tuples and one of all visible tuples, and to produce different selectivity estimates for different purposes.  But I don't see that change getting made.  It is only meaningful in cases where there is a fundamental skew in distribution between visible tuples and invisible-but-as-yet-unvacuumed tuples.

I think that that fundamental skew is the source of both the underestimation of the bitmap scan cost, and overestimation of the partial index scan (although I can't get it to overestimate that be anywhere near the amount you were seeing).

I still think your best bet is to get rid of the partial index and trade the full one on (parcel_id_code) for one on (parcel_id_code,exit_state).  I think that will be much less fragile than reindexing in a cron job.

Cheers,

Jeff

Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:

On 27/12/12 16:17, Jeff Janes wrote:
>
> I still think your best bet is to get rid of the partial index and trade
> the full one on (parcel_id_code) for one on (parcel_id_code,exit_state).
>   I think that will be much less fragile than reindexing in a cron job.
>

So, at the moment, I have 3 indexes:
   full:     parcel_id_code
   full:     exit_state
   full:     parcel_id_code where exit state is null

Am I right that when you suggest just a single, joint index
     (parcel_id_code,exit_state)
instead of all 3 of the others,

it will allow me to optimally run all of the following? :

1.  SELECT * from tbl_tracker WHERE parcel_id_code=22345 AND exit_state
IS NULL

(this is the one we've been discussing)


2.  SELECT * from tbl_tracker where parcel_id_code=44533

3.  SELECT * from tbl_tracker where exit_code = 2

(2 and 3 are examples of queries I need to run for other purposes,
unrelated to this thread, but which use the other indexes.).


Thanks,

Richard




Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Tom Lane
Дата:
Richard Neill <rn214@richardneill.org> writes:
> So, at the moment, I have 3 indexes:
>    full:     parcel_id_code
>    full:     exit_state
>    full:     parcel_id_code where exit state is null

> Am I right that when you suggest just a single, joint index
>      (parcel_id_code,exit_state)
> instead of all 3 of the others,

I think he was just recommending replacing the first and third indexes.

> it will allow me to optimally run all of the following? :
> 1.  SELECT * from tbl_tracker WHERE parcel_id_code=22345 AND exit_state
> IS NULL
> 2.  SELECT * from tbl_tracker where parcel_id_code=44533
> 3.  SELECT * from tbl_tracker where exit_code = 2

You will need an index with exit_state as the first column to make #3
perform well --- at least, assuming that an index is going to help at
all anyway.  The rule of thumb is that if a query is going to fetch
more than a few percent of a table, an index is not useful because
it's going to be touching most table pages anyway, so a seqscan will
win.  I've forgotten now what you said the stats for exit_code values
other than null were, but it's quite possible that an index is useless
for #3.

These considerations are mostly covered in the manual:
http://www.postgresql.org/docs/9.2/static/indexes.html

            regards, tom lane


Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Jeff Janes
Дата:
On Thursday, December 27, 2012, Richard Neill wrote:


On 27/12/12 16:17, Jeff Janes wrote:

I still think your best bet is to get rid of the partial index and trade
the full one on (parcel_id_code) for one on (parcel_id_code,exit_state).
  I think that will be much less fragile than reindexing in a cron job.


So, at the moment, I have 3 indexes:
  full:     parcel_id_code
  full:     exit_state
  full:     parcel_id_code where exit state is null

Am I right that when you suggest just a single, joint index
    (parcel_id_code,exit_state)
instead of all 3 of the others,

No, just instead of 1 and 3.  You still need an index on (exit_state) in order to efficiently satisfy query 3 below.

Alternative, you could keep index 1, and replace 2 and 3 with one on (exit_state, parcel_id_code).  And in fact this might be the better way to go, because a big problem you are facing is that the (exit_state) index is looking falsely attractive, and the easiest way to overcome that is to get rid of that index and replace it with one that can do everything that it can do, but more.

Theoretically there is technique called "loose scan" or "skip scan" which could allow you to make one index, (exit_state, parcel_id_code) to replace all 3 of the above, but postgres does not yet implement that technique.  I think there is a way to achieve the same thing using recursive sql.  But I doubt it would be worth it, as too much index maintenance is not your root problem.

 
3.  SELECT * from tbl_tracker where exit_code = 2

Cheers,

Jeff 

Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Richard Neill
Дата:
>         The partial index is highly leveraged.  If every tuple in the
>         table is updated once, that amounts to every tuple in the index
 >         being updated 25,000 times.
>
>     How so? That sounds like O(n_2) behaviour.
>
> If the table has 5 million rows while the index has 200 (active) rows at
> any given time, then to update every row in the table to null and back
> again would be 100% turn over of the table.  But each such change would
> lead to an addition and then a deletion from the index.  So 100%
> turnover of the table would be a 5 million / 200 = 25,000 fold turn of
> the index.

Sorry, I was being dense. I misread that as:
    "every time a single tuple in the table is updated, the entire index
     (every row) is updated".
Yes, of course your explanation makes sense.

>
> There is some code that allows a btree index entry to get killed (and so
> the slot to be reused) without any vacuum, if a scan follows that entry
> and finds the corresponding tuple in the table no longer visible to
> anyone.  I have not examined this code, and don't know whether it is
> doing its job but just isn't enough to prevent the bloat, or if for some
> reason it is not applicable to your situation.
>

It looks like my solution is going to be a REINDEX invoked from cron, or
maybe just every 100k inserts.


In terms of trying to improve this behaviour for other PG users in the
future, are there any more diagnostics I can do for you? Having found a
special case, I'd like to help permanently resolve it if I can.


Thanks very much again.

Best wishes,

Richard







Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

От
Jeff Janes
Дата:
On Thursday, December 20, 2012, Jeff Janes wrote:
On Thursday, December 20, 2012, Tom Lane wrote:

What I did to try to duplicate Richard's situation was to create a test
table in which all the exit_state values were NULL, then build the
index, then UPDATE all but a small random fraction of the rows to 1,
then vacuum.  This results in a rather bloated partial index, but I
think that's probably what he's got given that every record initially
enters the table with NULL exit_state.  It would take extremely frequent
vacuuming to keep the partial index from accumulating a lot of dead
entries.


Once I cranked up default_statistics_target, I could start reproducing the very high estimates (5000) for the partial index in 9.1.

As you say, switching to 9.2 or above lowers it quite a bit, I still get some pretty high estimates, ~100 when 8 would be more accurate.

The problem is in genericcostestimate

    if (index->pages > 1 && index->tuples > 1)
        numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);

The index->pages should probably not include index pages which are empty.  Even with aggressive vacuuming, most of the pages in the partial index seem to be empty at any given time.

However, I don't know if that number is exposed readily.  And it seems to be updated too slowly to be useful, if pg_freespace is to be believed.

But I wonder if it couldn't be clamped to so that we there can be no more pages than there are tuples.

        numIndexPages = ceil(numIndexTuples * Min(1,index->pages / index->tuples));


 Cheers,

Jeff