Обсуждение: trick the query optimiser to skip some optimisations

Поиск
Список
Период
Сортировка

trick the query optimiser to skip some optimisations

От
Дмитрий Шалашов
Дата:
Hi!

I have a table called 'feed'. It's a big table accessed by many types of queries, so I have quite a lot of indices on it.

Those that are relevant looks like this:

"feed_user_id_active_id_added_idx" btree (user_id, active_id, added)
"feed_user_id_added_idx" btree (user_id, added DESC)
"feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = user_id AND type = 1

last one is very small and tailored for the specific query.
"added" field is timestamp, everything else is integers.

That specific query looks like this:

SELECT * FROM feed WHERE user_id = ? AND type = 1 AND active_id = user_id ORDER BY added DESC LIMIT 31;

But it doesn't use the last index. EXPLAIN shows this:

 Limit  (cost=0.00..463.18 rows=31 width=50)
   ->  Index Scan Backward using feed_user_id_active_id_added_idx on user_feed  (cost=0.00..851.66 rows=57 width=50)
         Index Cond: ((user_id = 7) AND (active_id = 7))
         Filter: (type = 1)

So as we can see optimiser changes "active_id = user_id" to "active_id = <whatever value user_id takes>". And it brokes my nice fast partial index :(
Can I do something here so optimiser would use the feed_user_id_added_idx2 index? It's around ten times smaller than the 'generic' feed_user_id_active_id_added_idx index.

I have PostgreSQL 9.2.6 on Debian.

Best regards,
Dmitriy Shalashov

Re: trick the query optimiser to skip some optimisations

От
Jeff Janes
Дата:
On Wed, Jan 29, 2014 at 3:38 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
 
"feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = user_id AND type = 1

 ...
 
SELECT * FROM feed WHERE user_id = ? AND type = 1 AND active_id = user_id ORDER BY added DESC LIMIT 31;

But it doesn't use the last index. EXPLAIN shows this:

 Limit  (cost=0.00..463.18 rows=31 width=50)
   ->  Index Scan Backward using feed_user_id_active_id_added_idx on user_feed  (cost=0.00..851.66 rows=57 width=50)
         Index Cond: ((user_id = 7) AND (active_id = 7))
         Filter: (type = 1)

So as we can see optimiser changes "active_id = user_id" to "active_id = <whatever value user_id takes>". And it brokes my nice fast partial index :(
Can I do something here so optimiser would use the feed_user_id_added_idx2 index? It's around ten times smaller than the 'generic' feed_user_id_active_id_added_idx index.

How about "where user_id+0=?" 

Cheers,

Jeff

Re: trick the query optimiser to skip some optimisations

От
Дмитрий Шалашов
Дата:
Thanks for the tip!

Well, index is now used but...

 Limit  (cost=264291.67..264291.75 rows=31 width=50)
   ->  Sort  (cost=264291.67..264292.80 rows=453 width=50)
         Sort Key: added
         ->  Bitmap Heap Scan on feed  (cost=1850.99..264278.18 rows=453 width=50)
               Recheck Cond: ((active_id = user_id) AND (type = 1))
               Filter: ((user_id + 0) = 7)
               ->  Bitmap Index Scan on feed_user_id_added_idx2  (cost=0.00..1850.88 rows=90631 width=0)


Best regards,
Dmitriy Shalashov


2014-01-30 Jeff Janes <jeff.janes@gmail.com>
On Wed, Jan 29, 2014 at 3:38 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
 
"feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id = user_id AND type = 1

 ...
 
SELECT * FROM feed WHERE user_id = ? AND type = 1 AND active_id = user_id ORDER BY added DESC LIMIT 31;

But it doesn't use the last index. EXPLAIN shows this:

 Limit  (cost=0.00..463.18 rows=31 width=50)
   ->  Index Scan Backward using feed_user_id_active_id_added_idx on user_feed  (cost=0.00..851.66 rows=57 width=50)
         Index Cond: ((user_id = 7) AND (active_id = 7))
         Filter: (type = 1)

So as we can see optimiser changes "active_id = user_id" to "active_id = <whatever value user_id takes>". And it brokes my nice fast partial index :(
Can I do something here so optimiser would use the feed_user_id_added_idx2 index? It's around ten times smaller than the 'generic' feed_user_id_active_id_added_idx index.

How about "where user_id+0=?" 

Cheers,

Jeff

Re: trick the query optimiser to skip some optimisations

От
Sergey Konoplev
Дата:
On Wed, Jan 29, 2014 at 3:38 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
> I have a table called 'feed'. It's a big table accessed by many types of
> queries, so I have quite a lot of indices on it.
>
> Those that are relevant looks like this:
>
> "feed_user_id_active_id_added_idx" btree (user_id, active_id, added)
> "feed_user_id_added_idx" btree (user_id, added DESC)
> "feed_user_id_added_idx2" btree (user_id, added DESC) WHERE active_id =
> user_id AND type = 1
>
> last one is very small and tailored for the specific query.
> "added" field is timestamp, everything else is integers.
[..]
>  Limit  (cost=0.00..463.18 rows=31 width=50)
>    ->  Index Scan Backward using feed_user_id_active_id_added_idx on
> user_feed  (cost=0.00..851.66 rows=57 width=50)
>          Index Cond: ((user_id = 7) AND (active_id = 7))
>          Filter: (type = 1)
[...]
> Can I do something here so optimiser would use the feed_user_id_added_idx2
> index? It's around ten times smaller than the 'generic'
> feed_user_id_active_id_added_idx index.
>
> I have PostgreSQL 9.2.6 on Debian.

Could you please show EXPLAIN ANALYZE for both cases, the current one
and with feed_user_id_active_id_added_idx dropped?

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com


Re: trick the query optimiser to skip some optimisations

От
Jeff Janes
Дата:
On Wed, Jan 29, 2014 at 4:17 PM, Дмитрий Шалашов <skaurus@gmail.com> wrote:
Thanks for the tip!

Well, index is now used but...

 Limit  (cost=264291.67..264291.75 rows=31 width=50)
   ->  Sort  (cost=264291.67..264292.80 rows=453 width=50)
         Sort Key: added
         ->  Bitmap Heap Scan on feed  (cost=1850.99..264278.18 rows=453 width=50)
               Recheck Cond: ((active_id = user_id) AND (type = 1))
               Filter: ((user_id + 0) = 7)
               ->  Bitmap Index Scan on feed_user_id_added_idx2  (cost=0.00..1850.88 rows=90631 width=0)

Ah, of course.  It prevents the optimization you want, as well as the one you don't want.

This is getting very ugly, but maybe change the index to match the degenerate query:

"feed_user_id_added_idx3" btree ((user_id + 0), added DESC) WHERE active_id = user_id AND type = 1

Long term I would probably look into refactoring the table so that "active_id = user_id" is not a magical condition, like it seems to be for you currently.  Maybe introduce a boolean column.

Cheers,

Jeff