Обсуждение: Select query ignores index on large table

Поиск
Список
Период
Сортировка

Select query ignores index on large table

От
Michael Kemanetzis
Дата:
Hello, I'm experiencing a strange behavior running a simple select query on a table that has about 12 million rows. Specifically, changing the "LIMIT" value seems to change the execution plan but the result in one of the cases is unjustifiably slow, as if it ignores all indexes.

The table structure, the queries used and the results are here: http://pastebin.com/fn36BuKs

Is there anything I can do to improve the speed of this query?

Are there any configuration settings that I should change?

Thanks,
Mike

Re: Select query ignores index on large table

От
hubert depesz lubaczewski
Дата:
On Thu, Jan 27, 2011 at 04:04:02PM +0200, Michael Kemanetzis wrote:
> Hello, I'm experiencing a strange behavior running a simple select query on
> a table that has about 12 million rows. Specifically, changing the "LIMIT"
> value seems to change the execution plan but the result in one of the cases
> is unjustifiably slow, as if it ignores all indexes.
>
> The table structure, the queries used and the results are here:
> http://pastebin.com/fn36BuKs
>
> Is there anything I can do to improve the speed of this query?

create index qq on vehicleevent (veh_id, event_id)
could help.

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: Select query ignores index on large table

От
Alban Hertroys
Дата:
On 27 Jan 2011, at 15:04, Michael Kemanetzis wrote:

> Hello, I'm experiencing a strange behavior running a simple select query on a table that has about 12 million rows.
Specifically,changing the "LIMIT" value seems to change the execution plan but the result in one of the cases is
unjustifiablyslow, as if it ignores all indexes. 
>
> The table structure, the queries used and the results are here: http://pastebin.com/fn36BuKs
>
> Is there anything I can do to improve the speed of this query?

What does explain analyse say about query B?

According to the query plan there are about 30k rows with veh_id = 3. From the amount of disk I/O you describe it would
appearthat the rows corresponding to that ID are all over the place. I expect that clustering that table on the veh_id
indexwill help in that case. 

It does seem a bit strange that the planner is choosing an index scan for 30k records, I'd expect a sequential scan to
bemore efficient. That seems to be another indication that your records are very fragmented with regards to the veh_id. 

That, or you are running out of memory (the setting to look at is work_mem I think). Did you change any settings from
thedefaults? 


BTW, 12M records isn't considered a large table by many Postgres users. It's certainly not small though, I suppose
"average"fits the bill. 

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4d41b62211732046819744!



Re: Select query ignores index on large table

От
Michael Kemanetzis
Дата:
Thanks, it did help. Now queries run in zero time.
I had thought of doing that....but since the same configuration was working ok on MSSQL I thought it should also here.
Now with that index the server query times are a lot faster than MSSQL without it.

Since it is working I will leave it like that. At least till I know more on how postgre behaves (and databases in general) with indexes.

Thanks again!

On Thu, Jan 27, 2011 at 4:12 PM, hubert depesz lubaczewski <depesz@depesz.com> wrote:
On Thu, Jan 27, 2011 at 04:04:02PM +0200, Michael Kemanetzis wrote:
> Hello, I'm experiencing a strange behavior running a simple select query on
> a table that has about 12 million rows. Specifically, changing the "LIMIT"
> value seems to change the execution plan but the result in one of the cases
> is unjustifiably slow, as if it ignores all indexes.
>
> The table structure, the queries used and the results are here:
> http://pastebin.com/fn36BuKs
>
> Is there anything I can do to improve the speed of this query?

create index qq on vehicleevent (veh_id, event_id)
could help.

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk
: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: Select query ignores index on large table

От
Michael Kemanetzis
Дата:
I am answering just for the sake of answering your questions.
What hubert depesz lubaczewski suggested  had fixed the problem i had.


I have other queries that need event_id to be the clustered index
Veh_id is spread all over the table. (for veh_id 3 there are no records)
Due to the spread of records all over the table the second plan that scans the whole table to get enough records takes ages.
A planner should probably predict this (the fragmentation of veh_id) and mostly consider the second solution.
Maybe it does that..who knows...all I know is my problem is fixed with the second two field index.

Thank you for your effort to help.

On Thu, Jan 27, 2011 at 8:14 PM, Alban Hertroys <dalroi@solfertje.student.utwente.nl> wrote:
On 27 Jan 2011, at 15:04, Michael Kemanetzis wrote:

> Hello, I'm experiencing a strange behavior running a simple select query on a table that has about 12 million rows. Specifically, changing the "LIMIT" value seems to change the execution plan but the result in one of the cases is unjustifiably slow, as if it ignores all indexes.
>
> The table structure, the queries used and the results are here: http://pastebin.com/fn36BuKs
>
> Is there anything I can do to improve the speed of this query?

What does explain analyse say about query B?

According to the query plan there are about 30k rows with veh_id = 3. From the amount of disk I/O you describe it would appear that the rows corresponding to that ID are all over the place. I expect that clustering that table on the veh_id index will help in that case.

It does seem a bit strange that the planner is choosing an index scan for 30k records, I'd expect a sequential scan to be more efficient. That seems to be another indication that your records are very fragmented with regards to the veh_id.

That, or you are running out of memory (the setting to look at is work_mem I think). Did you change any settings from the defaults?


BTW, 12M records isn't considered a large table by many Postgres users. It's certainly not small though, I suppose "average" fits the bill.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:1227,4d41b61c11731997733516!