RE: select query does not pick up the right index

Поиск
Список
Период
Сортировка
От Abadie Lana
Тема RE: select query does not pick up the right index
Дата
Msg-id 49406fb0c239480980ffc6c2a250d4e9@iter.org
обсуждение исходный текст
Ответ на Re: select query does not pick up the right index  (David Rowley <david.rowley@2ndquadrant.com>)
Ответы Re: select query does not pick up the right index  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-performance


Lana ABADIE
Database Engineer
CODAC Section

ITER Organization, Building 72/4108, SCOD, Control System Division
Route de Vinon-sur-Verdon - CS 90 046 - 13067 St Paul Lez Durance Cedex - France
Phone: +33 4 42 17 84 02
Get the latest ITER news on http://www.iter.org/whatsnew

-----Original Message-----
From: David Rowley <david.rowley@2ndquadrant.com> 
Sent: 03 January 2019 01:16
To: Abadie Lana <Lana.Abadie@iter.org>
Cc: pgsql-performance@lists.postgresql.org
Subject: Re: select query does not pick up the right index

On Thu, 3 Jan 2019 at 05:28, Abadie Lana <Lana.Abadie@iter.org> wrote:
> I would appreciate any hints as this problem looks to me rather strange…I tried to google it but in vain.
>
> select t.name, c.smpl_time,c.nanosecs,c.float_val,c.num_val,c.str_val,c.datatype,c.array_val from sample c, channel t
wheret.channel_id=c.channel_id and t.name='BUIL-B36-VA-RT-RT1:CL0001-2-ABW'   order by c.smpl_time desc limit 5;
 
>
> takes 20mn to execute because it picks up the wrong index…see explain analyse below. I would expect this query to use
the(channel_id,smpl_time) but it uses the smpl_time index.
 

[...]

> Any ideas, why the planner is not taking the right index?

The planner assumes that the required channel values are evenly distributed through the scan of the index on smpl_time.
If your required 5 rows were found quickly (i.e channels with recent sample values), then the plan would have worked
outwell.  It looks like 'BUIL-B36-VA-RT-RT1:CL0001-2-ABW' is probably a channel which has some very old sample values.
Ican see that from "Rows Removed by Join
 
Filter: 322099471",  meaning that on backwards scanning the smpl_time index, that many rows were found not to match the
channelyou requested.
 

The planner, by default only has statistics to say how common each channel is in the sample table. I think in this case
sincethe planner has no knowledge of which channel_id it will be searching for (that's only determined during
execution),then I suppose it must be using the n_distinct of the sample.channel_id table.  It would be interesting to
knowhow far off the n_distinct estimation is. You can find out with:
 

select stadistinct from pg_statistic where starelid='sample'::regclass and staattnum = 1; select count(*) from (select
distinctchannel_id from sample) s; -- this may take a while to run...
 

If the stadistinct estimate is far out from the reality, then you could consider setting this manually with:

alter table sample alter column channel_id set (n_distinct = <actual value here>);

but keep in mind, that as the table evolves, whatever you set there could become outdated.

Another method to fix you could try would be to coax the planner into doing something different would be to give it a
betterindex to work with.
 

create index on channel(name, channel_id);

You didn't show us the details from the channel table, but if there's not an index like this then this might reduce the
costof a Merge Join, but since the order rows output from that join would be in channel_id order, a Sort would be
required,which would require joining all matching rows, not just the first 5 matches. Depending on how many rows
actuallymatch will determine if that's faster or not.
 

If you don't have luck with either of the above then, one other thing you could try would be to disallow the planner
fromusing the smpl_time index by changing the order by to "ORDER BY c.smpl_time +
 
INTERVAL '0 sec';   that's a bit of a hack, but we don't have anything
we officially call "query hints" in PostgreSQL, so often we're left to solve issues like this with ugly tricks like
that.

Also, going by:

> ->  Seq Scan on channel t  (cost=0.00..915.83 rows=1 width=41) (actual 
> -> time=4.683..7.885 rows=1 loops=1)

perhaps "name" is unique on the channel table?  (I doubt there's an index/constraint to back that up, however, since
suchan index would have likely been used here instead of the Seq Scan)
 

If so, and you can add a constraint to back that up, you might  be able to reform the query to be:

select 'BUIL-B36-VA-RT-RT1:CL0001-2-ABW',
c.smpl_time,c.nanosecs,c.float_val,c.num_val,c.str_val,c.datatype,c.array_val
from sample c
WHERE c.channel_id = (SELECT channel_id FROM channel WHERE
name='BUIL-B36-VA-RT-RT1:CL0001-2-ABW')
order by c.smpl_time desc limit 5;

If you can do that then it's highly likely to be *very* fast to execute since I see there's an index on (channel_id,
smpl_time)on each of the inherited tables.
 

(If our planner was smarter then in the presence of the correct unique index, we could have rewritten the query as such
automatically....but it's not / we don't.  I believe I've mentioned about improving this somewhere in the distant past
ofthe -hackers mailing list, although I can't find it right now. I recall naming the idea "scalar value lookup joins",
butdevelopment didn't get much beyond thinking of that name)
 

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Hi David
Thanks for your tips. So answers to your questions/comments
1) About n_distinct
first query returns 2136, second query returns 33425. So it seems that there is some discrepancies...Also the sample
tableis very big...roughly 322099474 rows.
 
I did the alter statement but without success. Still long execution time with wrong index
2) index on channel(name,channel_id)
There was no indexes on channel. So I created it. Same execution time, still wrong index used regardless of the
n_distinctvalues
 
3)The "trick" (+ interval '0s') did the job. The index on channel_id, smpl_time is used. Query time can vary between a
fewms to 25 sec
 
4) name is unique, constraint and index created. Right index is picked up and query time is rather constant there
40sec.

A few comments : 
- I have disabled force_parallel_mode when running all the tests. 
- The difference between the two plans is in the case of query with the trick, the planner is using a bitmap index
scan,in the second one it uses index scan backward.
 
- when I execute the initial query, there is a big read access on disk almost 17.7 GB...whereas the total size of the
smpl_timeindex is roughly 7GB...Could it be a wrong configuration on my side?
 
During the tests, no insert/delete/or update was performed...only my select queries...
Main parameters : effective_cache_size : 4GB, shared_buffers 4GB, work_mem 4MB

Thanks a lot !







В списке pgsql-performance по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: select query does not pick up the right index
Следующее
От: Abadie Lana
Дата:
Сообщение: RE: select query does not pick up the right index