Обсуждение: query doesn't always follow 'correct' path..

Поиск
Список
Период
Сортировка

query doesn't always follow 'correct' path..

От
Bert
Дата:
Hello,

This is more of a query optimalisation question.

I have the following query:
SELECT sm.TICK_SERVER_ID
, sm.ITEMSEAT_ID AS ITEMSEAT_ID_master
, coalesce(sm.ITEMSEAT_STATUS, -999) AS ITEMSEAT_STATUS_master
, zm.ITEM_ID AS ITEM_ID_master
, coalesce(sm.ITEMSEAT_POS_X, -999) AS ITEMSEAT_POS_X
, coalesce(sm.ITEMSEAT_POS_Y, -999) AS ITEMSEAT_POS_Y
, sc.ITEMSEAT_ID AS ITEMSEAT_ID_child
, coalesce(sc.ITEMSEAT_STATUS, -999) AS ITEMSEAT_STATUS_child
, zc.ITEM_ID AS ITEM_ID_child
FROM oltp.ST_ABOITEM am
INNER JOIN oltp.ST_ITEMZONE zm ON zm.TICK_SERVER_ID = am.TICK_SERVER_ID AND zm.ITEM_ID = am.MASTER_ITEM_ID
INNER JOIN oltp.ST_ITEMZONE zc ON zc.TICK_SERVER_ID = am.TICK_SERVER_ID AND zc.ITEM_ID = am.CHILD_ITEM_ID
INNER JOIN oltp.ST_ITEMSEAT sm  ON sm.TICK_SERVER_ID = zm.TICK_SERVER_ID AND sm.ITEMZONE_ID = zm.ITEMZONE_ID
INNER JOIN oltp.ST_ITEMSEAT sc  ON sc.TICK_SERVER_ID = zc.TICK_SERVER_ID AND sc.ITEMZONE_ID = zc.ITEMZONE_ID AND sc.ITEMSEAT_POS_X = sm.ITEMSEAT_POS_X AND sc.ITEMSEAT_POS_Y = sm.ITEMSEAT_POS_Y
WHERE am.TICK_SERVER_ID = 45
AND (
sc.ETL_RUN_ID = 1591
or am.ETL_RUN_ID = 1591
or sm.ETL_RUN_ID = 1591
);


When I don't touch the indexscan setting I get the following output:
Total query runtime: 611484 ms.
20359 rows retrieved.
and the following plan: http://explain.depesz.com/s/sDy

However, when I put set enable_indexscan=off; in fron of the same query I get the following output:
Total query runtime: 16281 ms.
20599 rows retrieved.
and the followign plan: http://explain.depesz.com/s/EpP

Does anyone has an idea what triggers this bad plan, and how I can fix it?

wkr,
Bert

Re: query doesn't always follow 'correct' path..

От
Frank Lanitz
Дата:
Am 18.02.2013 10:43, schrieb Bert:
> Does anyone has an idea what triggers this bad plan, and how I can fix it?

Looks a bit like wrong statistics. Are the statistiks for your tables
correct?

Cheers,
Frank



Re: query doesn't always follow 'correct' path..

От
Bert
Дата:
Hello,

yes, the tables are vacuumed every day with the following command: vacuum analyze schema.table.
The last statistics were collected yesterday evening. I collected statistics about the statistics, and I found the following:
table_name; starttime; runtime
"st_itemseat";"2013-02-17 23:48:42";"00:01:02"
"st_itemseat_45";"2013-02-17 23:35:15";"00:00:08"
"st_itemzone";"2013-02-17 23:35:33";"00:00:01"

st_itemseat_45 is a child-partition of st_itemseat.

They seem to be pretty much up to date I guess?
I also don't get any difference in the query plans when they are run in the morning, or in the evening.

I have also run the query with set seq_scan to off, and then I get the following output:
Total query runtime: 12025 ms.
20599 rows retrieved.
and the following plan: http://explain.depesz.com/s/yaJK

These are 3 different plans. And the last one is blazingly fast. That's the one I would always want to use :-)

it's also weird that this is default plan for the biggest partition. But the smaller the partition gets, the smaller the partition gets.
So I don't think it has anything to do with the memory settings. Since it already chooses this plan for the bigger partitions...

wkr,
Bert


On Mon, Feb 18, 2013 at 11:51 AM, Frank Lanitz <frank@frank.uvena.de> wrote:
Am 18.02.2013 10:43, schrieb Bert:
> Does anyone has an idea what triggers this bad plan, and how I can fix it?

Looks a bit like wrong statistics. Are the statistiks for your tables
correct?

Cheers,
Frank


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql



--
Bert Desmet
0477/305361

Re: query doesn't always follow 'correct' path..

От
Bert
Дата:
Hello,

Thanks the nice people on irc my problem is fixed.
I changed the following settings in the postgres.conf file:
default_statistics_target = 5000 -> and I analyzed the tables after the change of course -> now I only got 2 plans anymore, in stead of 3
cpu_tuple_cost = 0.1 -> by setting this value the seq scans were stopped, and the better index_only scan / bitmap index scan were used for this query.

Thank you Robe and Mabe_ for helping me with this issue!

wkr,
Bert


On Mon, Feb 18, 2013 at 2:42 PM, Bert <biertie@gmail.com> wrote:
Hello,

yes, the tables are vacuumed every day with the following command: vacuum analyze schema.table.
The last statistics were collected yesterday evening. I collected statistics about the statistics, and I found the following:
table_name; starttime; runtime
"st_itemseat";"2013-02-17 23:48:42";"00:01:02"
"st_itemseat_45";"2013-02-17 23:35:15";"00:00:08"
"st_itemzone";"2013-02-17 23:35:33";"00:00:01"

st_itemseat_45 is a child-partition of st_itemseat.

They seem to be pretty much up to date I guess?
I also don't get any difference in the query plans when they are run in the morning, or in the evening.

I have also run the query with set seq_scan to off, and then I get the following output:
Total query runtime: 12025 ms.
20599 rows retrieved.
and the following plan: http://explain.depesz.com/s/yaJK

These are 3 different plans. And the last one is blazingly fast. That's the one I would always want to use :-)

it's also weird that this is default plan for the biggest partition. But the smaller the partition gets, the smaller the partition gets.
So I don't think it has anything to do with the memory settings. Since it already chooses this plan for the bigger partitions...

wkr,
Bert


On Mon, Feb 18, 2013 at 11:51 AM, Frank Lanitz <frank@frank.uvena.de> wrote:
Am 18.02.2013 10:43, schrieb Bert:
> Does anyone has an idea what triggers this bad plan, and how I can fix it?

Looks a bit like wrong statistics. Are the statistiks for your tables
correct?

Cheers,
Frank


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql



--
Bert Desmet
0477/305361



--
Bert Desmet
0477/305361

Re: query doesn't always follow 'correct' path..

От
Виктор Егоров
Дата:
2013/2/18 Bert <biertie@gmail.com>
When I don't touch the indexscan setting I get the following output:
Total query runtime: 611484 ms.
20359 rows retrieved.
and the following plan: http://explain.depesz.com/s/sDy

However, when I put set enable_indexscan=off; in fron of the same query I get the following output:
Total query runtime: 16281 ms.
20599 rows retrieved.
and the followign plan: http://explain.depesz.com/s/EpP

Is this a typo or do you really get different number of rows returned with and without indexscans?
Is this expected for the same query to return different sets over time?


--
Victor Y. Yegorov

Re: query doesn't always follow 'correct' path..

От
Julien Cigar
Дата:
On 02/18/2013 15:39, Bert wrote:
Hello,

Thanks the nice people on irc my problem is fixed.
I changed the following settings in the postgres.conf file:
default_statistics_target = 5000 -> and I analyzed the tables after the change of course -> now I only got 2 plans anymore, in stead of 3

default_statistics_target = 5000 as a default is *way* too high. Such high values should only be set on a per-column basis ...

cpu_tuple_cost = 0.1 -> by setting this value the seq scans were stopped, and the better index_only scan / bitmap index scan were used for this query.

Thank you Robe and Mabe_ for helping me with this issue!

s/Mabe_/Mage_ :-)


wkr,
Bert


On Mon, Feb 18, 2013 at 2:42 PM, Bert <biertie@gmail.com> wrote:
Hello,

yes, the tables are vacuumed every day with the following command: vacuum analyze schema.table.
The last statistics were collected yesterday evening. I collected statistics about the statistics, and I found the following:
table_name; starttime; runtime
"st_itemseat";"2013-02-17 23:48:42";"00:01:02"
"st_itemseat_45";"2013-02-17 23:35:15";"00:00:08"
"st_itemzone";"2013-02-17 23:35:33";"00:00:01"

st_itemseat_45 is a child-partition of st_itemseat.

They seem to be pretty much up to date I guess?
I also don't get any difference in the query plans when they are run in the morning, or in the evening.

I have also run the query with set seq_scan to off, and then I get the following output:
Total query runtime: 12025 ms.
20599 rows retrieved.
and the following plan: http://explain.depesz.com/s/yaJK

These are 3 different plans. And the last one is blazingly fast. That's the one I would always want to use :-)

it's also weird that this is default plan for the biggest partition. But the smaller the partition gets, the smaller the partition gets.
So I don't think it has anything to do with the memory settings. Since it already chooses this plan for the bigger partitions...

wkr,
Bert


On Mon, Feb 18, 2013 at 11:51 AM, Frank Lanitz <frank@frank.uvena.de> wrote:
Am 18.02.2013 10:43, schrieb Bert:
> Does anyone has an idea what triggers this bad plan, and how I can fix it?

Looks a bit like wrong statistics. Are the statistiks for your tables
correct?

Cheers,
Frank


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql



--
Bert Desmet
0477/305361



--
Bert Desmet
0477/305361


-- 
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.

Re: query doesn't always follow 'correct' path..

От
Julien Cigar
Дата:
On 02/18/2013 16:20, Julien Cigar wrote:
On 02/18/2013 15:39, Bert wrote:
Hello,

Thanks the nice people on irc my problem is fixed.
I changed the following settings in the postgres.conf file:
default_statistics_target = 5000 -> and I analyzed the tables after the change of course -> now I only got 2 plans anymore, in stead of 3

default_statistics_target = 5000 as a default is *way* too high. Such high values should only be set on a per-column basis ...

oops.. it's per-table and not per-column


cpu_tuple_cost = 0.1 -> by setting this value the seq scans were stopped, and the better index_only scan / bitmap index scan were used for this query.

Thank you Robe and Mabe_ for helping me with this issue!

s/Mabe_/Mage_ :-)


wkr,
Bert


On Mon, Feb 18, 2013 at 2:42 PM, Bert <biertie@gmail.com> wrote:
Hello,

yes, the tables are vacuumed every day with the following command: vacuum analyze schema.table.
The last statistics were collected yesterday evening. I collected statistics about the statistics, and I found the following:
table_name; starttime; runtime
"st_itemseat";"2013-02-17 23:48:42";"00:01:02"
"st_itemseat_45";"2013-02-17 23:35:15";"00:00:08"
"st_itemzone";"2013-02-17 23:35:33";"00:00:01"

st_itemseat_45 is a child-partition of st_itemseat.

They seem to be pretty much up to date I guess?
I also don't get any difference in the query plans when they are run in the morning, or in the evening.

I have also run the query with set seq_scan to off, and then I get the following output:
Total query runtime: 12025 ms.
20599 rows retrieved.
and the following plan: http://explain.depesz.com/s/yaJK

These are 3 different plans. And the last one is blazingly fast. That's the one I would always want to use :-)

it's also weird that this is default plan for the biggest partition. But the smaller the partition gets, the smaller the partition gets.
So I don't think it has anything to do with the memory settings. Since it already chooses this plan for the bigger partitions...

wkr,
Bert


On Mon, Feb 18, 2013 at 11:51 AM, Frank Lanitz <frank@frank.uvena.de> wrote:
Am 18.02.2013 10:43, schrieb Bert:
> Does anyone has an idea what triggers this bad plan, and how I can fix it?

Looks a bit like wrong statistics. Are the statistiks for your tables
correct?

Cheers,
Frank


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql



--
Bert Desmet
0477/305361



--
Bert Desmet
0477/305361


-- 
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.


-- 
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.

Re: query doesn't always follow 'correct' path..

От
Bert
Дата:
Hello,

there were 3 hours in between the 2 queries. so I guess new data was loaded already. new data is being loaded with that etl_run_id.

wkr,
Bert


On Mon, Feb 18, 2013 at 4:20 PM, Виктор Егоров <vyegorov@gmail.com> wrote:
2013/2/18 Bert <biertie@gmail.com>
When I don't touch the indexscan setting I get the following output:
Total query runtime: 611484 ms.
20359 rows retrieved.
and the following plan: http://explain.depesz.com/s/sDy

However, when I put set enable_indexscan=off; in fron of the same query I get the following output:
Total query runtime: 16281 ms.
20599 rows retrieved.
and the followign plan: http://explain.depesz.com/s/EpP

Is this a typo or do you really get different number of rows returned with and without indexscans?
Is this expected for the same query to return different sets over time?


--
Victor Y. Yegorov



--
Bert Desmet
0477/305361