Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: Why does the query planner use two full indexes, when a dedicated partial index exists?
Дата
Msg-id CAMkU=1w1yqYepDpmKo5fuL9Akkk4t_AOiwoc6u2k5ysfAEt1kQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Why does the query planner use two full indexes, when a dedicated partial index exists?  (Richard Neill <rn214@richardneill.org>)
Ответы Re: Why does the query planner use two full indexes, when a dedicated partial index exists?  (Richard Neill <rn214@richardneill.org>)
Why does the query planner use two full indexes, when a dedicated partial index exists?  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-performance
On Thursday, December 20, 2012, Richard Neill wrote:
Dear Tom,

Thanks againg for your help on this.

On 20/12/12 03:06, Tom Lane wrote:
Richard Neill <rn214@richardneill.org> writes:
The problem is, when I now run my query, the planner ignores the
dedicated index "tbl_tracker_performance_1_idx", and instead uses both
of the full indexes... resulting in a much much slower query (9ms vs
0.08ms).


I've now installed 9.2. As you said, thanks to the change in 9.2 it initially prefers the partial index.

BUT, after 1 cycle of inserting 500k rows, then deleting them all, then starting to insert again, I find that the planner has reverted to the former bad behaviour.

Presumably the real work load has this type of turn over happen one row at a time, rather than all in one giant mass update transaction, right?  That makes a big difference in the way space is re-used. 
 

Reindexing only takes a couple of seconds, and restores correctness.

Even your slow query is pretty fast.  If you can't afford that, can you afford to take an exclusive lock for a couple of seconds every few minutes?
 

What's going on? Do I need to run reindex in a cron-job? I thought that reindex wasn't "normally" needed, and that index bloat happened only after every row had changed value hundreds of times.

The partial index is highly leveraged.  If every tuple in the table is updated once, that amounts to every tuple in the index being updated 25,000 times.

For the same reason, it is probably not getting vacuum often enough.  The default settings have the table vacuumed once 20% of its rows turns over, but that means the partial index has been turned over many many times.  You could crank down the auto-vacuum settings for that table, or run manual vacuum with a cron job.

Vacuum will not unbloat the index, but if you run it often enough it will keep the bloat from getting to bad in the first place.

But what I think I'd do is change one of your full indexes to contain the other column as well, and get rid of the partial index.  It might not be quite as efficient as the partial index might theoretically be, but it should be pretty good and also be less fragile.
 


         ->  Bitmap Index Scan on tbl_tracker_exit_state_idx (cost=0.00..8.36 rows=151 width=0) (actual time=7.946..7.946 rows=20277 loops=1)

This is finding 100 times more rows than it thinks it will.  If that could be fixed, surely this plan would not look as good.  But then, it would probably just switch to another plan that is not the one you want, either.


Cheers,

Jeff


В списке pgsql-performance по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: Why does the query planner use two full indexes, when a dedicated partial index exists?
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: Why does the query planner use two full indexes, when a dedicated partial index exists?