Discussion: how to calibrate the cost model parameters


how to calibrate the cost model parameters

From:
jian xu
Date:

Hello,

On my system, the estimated query cost differs hugely from the real query cost. For example, for a simple SELECT query the estimated cost is 100x higher than the real cost. I think I need to calibrate the cost model parameters below (I am still using the default values):

#seq_page_cost = 1.0                  # measured on an arbitrary scale
#random_page_cost = 4.0               # same scale as above
#cpu_tuple_cost = 0.01                # same scale as above
#cpu_index_tuple_cost = 0.005         # same scale as above
#cpu_operator_cost = 0.0025           # same scale as above

 

I googled it and only found some suggestions, for example setting random_page_cost to 1 or 2 for SSDs. However, there is no documentation about how to calibrate the cost model parameters so that the estimated query cost comes close to the real query cost.
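
For illustration, this is roughly how I compare the two numbers (the table and predicate are just placeholders):

-- "cost=..." in the plan header is the planner's estimate in arbitrary units,
-- "actual time=..." is the measured time in milliseconds
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM my_table WHERE id = 42;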

 

Thanks,

 

James

Re: how to calibrate the cost model parameters

From:
Laurenz Albe
Date:
On Sun, 2022-05-29 at 15:29 +0000, jian xu wrote:
> On my system, the estimated query cost has huge difference with the real query cost.
> For example, a simple select query, the estimated query cost is 100X bigger than real cost,
> I think I need to calibrate the cost model parameters below(I still use default value)
>
> #seq_page_cost = 1.0                                 # measured on an arbitrary scale
> #random_page_cost = 4.0                         # same scale as above
> #cpu_tuple_cost = 0.01                              # same scale as above
> #cpu_index_tuple_cost = 0.005               # same scale as above
> #cpu_operator_cost = 0.0025                   # same scale as above
>  
> I googled it, and only find some suggestions, for example set random_page_cost to 1 or 2
> for SSD, however, there is no doc about how to calibrate the cost model parameters to make
> the estimated query cost close to the real query cost.

That is by design; see the "arbitrary scale" comment on "seq_page_cost".

Sure, you could adjust all cost parameters so that 1 roughly corresponds to
the number of milliseconds for the actual query, but there is little benefit
in that.
The numbers are only used to compare plans with each other.  Moreover, settings
like "enable_seqscan" and "enable_nestloop" can artificially distort the numbers.

Yours,
Laurenz Albe



RE: how to calibrate the cost model parameters

From:
jian xu
Date:

Thanks, Laurenz. It is OK that these values are on an "arbitrary scale", but we still need to calibrate them so that the scale is correct relative to each other. For example, if seq_page_cost is 1 and cpu_tuple_cost is 0.01, how do we verify that reading one page sequentially really is 100 times more expensive than processing one tuple on the CPU?
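
The only way I can think of to check that ratio is a rough micro-benchmark like this (a sketch only, assuming a large test table t, the pg_prewarm extension, and an otherwise idle system):

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SET track_io_timing = on;

-- first run, ideally with cold caches: time is dominated by sequential page reads
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM t;

-- load the table into shared_buffers, then repeat: time is now mostly per-tuple CPU work
SELECT pg_prewarm('t');
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM t;

-- comparing the two run times against the page and tuple counts gives a rough
-- seq_page_cost : cpu_tuple_cost ratio for this machine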

"enable_seqscan" and "enable_nestloop" can work in some cases, but it doesn’t work with any cases.

Does anyone have experience calibrating the cost model parameters to get correct values? Thanks.

James

 


 

Re: how to calibrate the cost model parameters

From:
Jeff Janes
Date:
On Mon, May 30, 2022 at 3:40 PM jian xu <jamesxu@outlook.com> wrote:

> Thanks Laurenz.  It is ok that these values are "arbitrary scale", but we still need to calibrate them to get the correct "arbitrary scale". For example if seq_page_cost is 1 and cpu_tuple_cost is 0.01, how to verify the io seq cost is 100 times slower than cpu access cost
>
> "enable_seqscan" and "enable_nestloop" can work in some cases, but it doesn’t work with any cases.
>
> Does anyone have experience to  calibrate the cost model parameters to get the correct value? Thanks


I put a lot of time into it, and I think you will find that there is no consistent way to do so.  For cpu_tuple_cost, for example, are the hint bits already set?  Do they need to get set?  Are the clog pages still in memory?  Is there contention on the lock used to determine if a given transaction is still running?  How many columns does the table have? How many of them need to be accessed for the current case, and how far left and right are they in the table and are they toasted?

For seq reads, how much contention is there for the buffer mapping partition lock?  What is the throughput of your IO system?  How many other seq reads will be occurring at the same time? Etc.

For any change you intend to make, do you have a good enough load generator and test system set up so you can test that it doesn't make something else worse?

Cheers,

Jeff

Re: how to calibrate the cost model parameters

From:
Laurenz Albe
Date:
On Mon, 2022-05-30 at 17:12 -0400, Jeff Janes wrote:
> On Mon, May 30, 2022 at 3:40 PM jian xu <jamesxu@outlook.com> wrote:
> > Thanks Laurenz.  It is ok that these values are "arbitrary scale", but we still need
> > to calibrate them to get the correct"arbitrary scale". For example if seq_page_cost
> > is 1 and cpu_tuple_cost is 0.01, how to verify the io seq cost is 100 times slower than cpu access cost?
> > "enable_seqscan" and "enable_nestloop" can work in some cases, but it doesn’t work with any cases.
> > Does anyone have experience to  calibrate the cost model parameters to get the correct value? Thanks
> 
> I put a lot of time into it, and I think you will find that there is no consistent way to do so.
> For cpu_tuple_cost, for example, are the hint bits already set?  Do they need to get set?
> Are the clog pages still in memory?  Is there contention on the lock used to determine if a given
> transaction is still running?  How many columns does the table have? How many of them need to be
> accessed for the current case, and how far left and right are they in the table and are they toasted?
> 
> For seq reads, how much contention is there for the buffer mapping partition lock?  What is the
> throughput of your IO system?  How many other seq reads will be occuring at the same time? Etc.
> 
> For any change you intend to make, do you have a good enough load generator and test system set
> up so you can test that it doesn't make something else worse?

+1

I think that the values of those parameters are more set by experience and tradition than
by measurable physical evidence.  In a way, it just happens to work (mostly).
For example, who says that on a spinning disk, random I/O is four times as slow as
sequential I/O?  Very likely, you have to factor in that some of the I/O requests are
satisfied from the kernel page cache.

Yours,
Laurenz Albe



Re: how to calibrate the cost model parameters

From:
Tom Lane
Date:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> I think that the values of those parameters are more set by experience and tradition than
> by measurable physical evidence.  In a way, it just happens to work (mostly).
> For example, who says that on a spinning disk, random I/O is four times as slow as
> sequential I/O?  Very likely, you have to factor in that part of the I/O requests are
> satisfied from the kernel page cache.

The 4.0 value *is* based on some real evidence --- many years ago, I put
in a lot of machine time to get an estimate of that for non-cached fetches
on the hardware I was using at the time.  If somebody did that over again
today, they'd likely get a somewhat different value, even assuming they
were still using spinning-rust storage.  But it doesn't matter too much.
A value around that is probably fine for rotating storage, while if you
are using SSD it's likely better to use something around 1.0, and beyond
that any inaccuracies in the cost constants are usually swamped by
imperfections of the cost models and other data.  For instance, it's not
that useful to sweat about what the right value is for cpu_operator_cost
when we've never spent any serious effort on assigning plausible procost
factors to different functions.  (Worse, the whole notion that a given
function has a fixed cost independent of its input data is sadly
inadequate.)
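
For what it's worth, procost can be overridden per function; an illustrative example (the function name is made up):

-- COST is expressed in multiples of cpu_operator_cost; the default is
-- 1 for C/internal functions and 100 for functions in other languages
ALTER FUNCTION my_expensive_json_filter(jsonb) COST 10000;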

            regards, tom lane



RE: how to calibrate the cost model parameters

From:
jian xu
Date:

Thanks, everyone. The reason I asked this question is that my system has a complex query which uses a CTE and then joins 7 other tables. It also has a WHERE clause that filters on JSON data. Sometimes it runs slowly (5 minutes) and sometimes it runs fast (2 seconds), with the same query text and no parameters. I know this is because the data distribution changes; the tables are updated frequently.

I compared the execution plans: the fast plan joins the other tables first and then joins the CTE, while the slow plan joins the CTE first and then the other tables. I analyzed the tables, which did not help. I increased the statistics target from 100 to 10000 for all the columns used in the WHERE clause, which did not help either. I set random_page_cost to 1; that did not help.
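
(What I ran was essentially the following; the table and column names here are placeholders:)

-- raise the per-column sampling target from the default 100, then re-sample
ALTER TABLE my_table ALTER COLUMN json_col SET STATISTICS 10000;
ANALYZE my_table;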

I am not able to modify the query. The only solutions I have found are:

  1. Use join_collapse_limit to force the join order. However, this has a limitation: I need to modify the application code to set it before submitting the query and to set it back to the original value after the query runs; if it is not changed back, other queries will be affected (see the sketch after this list).
  2. Set both seq_page_cost and random_page_cost to 0.1; with that, the planner always generates the correct plan.
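
A minimal sketch of option 1, assuming the application can wrap the query in a transaction:

BEGIN;
-- SET LOCAL lasts only until the end of the transaction, so the setting
-- reverts automatically at COMMIT or ROLLBACK
SET LOCAL join_collapse_limit = 1;   -- keep the joins in the order written
SELECT ...;                          -- the problematic query goes here
COMMIT;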

So I suspect the cost model parameters are not fully optimized for my system: the costs of the fast and slow execution plans might be very close, so a small data change causes the plan to change and a wrong plan to be generated.

Thanks

James

 


Re: how to calibrate the cost model parameters

From:
Ron
Date:
Manually (aka through a cron script) ANALYZE those seven tables on a regular basis.  During the business day, I ANALYZE some tables every two hours.
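
Something along these lines (database and table names are placeholders):

-- run every two hours from cron, for example:
--   0 */2 * * * psql -d mydb -c 'ANALYZE t1, t2, t3, t4, t5, t6, t7;'
ANALYZE t1, t2, t3, t4, t5, t6, t7;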


--
Angular momentum makes the world go 'round.

RE: how to calibrate the cost model parameters

From:
jian xu
Date:

Thanks, Ron. This is what we are doing right now; we have a job that analyzes the tables every 2 hours. However, the statistics updates cause the execution plan to change, and the query sometimes runs slowly.

I think the problem is that the costs of the "good" and "bad" execution plans are close. When the data changes, analyzing the tables updates the statistics and shifts the plan costs, and because the cost model parameters are not fully optimized, the planner sometimes picks the "bad" plan.

Thanks,

James

 
