Обсуждение: URGENT HELP about 'duration' stats

Поиск
Список
Период
Сортировка

URGENT HELP about 'duration' stats

От
Camilo Porto
Дата:
<div style="text-align: left;"><span class="q">Hi,<br /><br /></span>I have been doing my master degree research  and 
Iam using Postgresql 8.2.5. I would appreciate if anyone could help me, ASAP, concerning the following question: <br
/><br/>How can I obtain the time spent by the Postgresql  EXECUTOR  during a given time interval? <br /> <br />I am
tryingto get the above information in the following way::<br /> <br />Using the BenchmarkSQL Software, I submit random
transactions(by simulating the TPC-C benchmark) during a time interval (say 30 seconds). <br /><br />PostgreSQL then
generatesstatistics for each phase that a SQL Command is executed. In thje following, it is shown the time duration of
theEXECUTE Phase of some SQL Command:<span class="q"><br /> <br />.<br />.<br />.<br />.<br />2007-10-11 15:49: 05.221
BRT[46994] LOG:  duration: 0.012 ms  execute S_1: BEGIN<br />2007-10-11 15:49:05.430 BRT [46994] LOG:  duration: 94.073
ms execute <unnamed>: SELECT c_discount, c_last, c_credit, w_tax  FROM customer, warehouse WHERE <br /> w_id = $1
ANDw_id = c_w_id AND c_d_id = $2 AND c_id = $3 <br />2007-10-11 15:49:05.462 BRT [46994] LOG:  duration: 13.662 ms 
execute<unnamed>: SELECT d_next_o_id, d_tax FROM district WHERE d_id = $1 AND d_w_id = $2 FO <br />R UPDATE<br
/>2007-10-1115:49:05.562 BRT [46994] LOG:  duration: 98.818 ms  execute <unnamed>: INSERT INTO NEW_ORDER
(no_o_id,no_d_id, no_w_id) VALUES ( $1, $2, $3)<br />2007-10-11 15:49:05.579 BRT [46994] LOG:  duration: 15.459 ms 
execute<unnamed>: UPDATE district SET d_next_o_id = d_next_o_id + 1  WHERE d_id = $1 AND d_w <br />_id = $2<br
/>2007-10-1115:49:05.748 BRT [46994] LOG:  duration: 122.387 ms  execute <unnamed>: INSERT INTO OORDER  (o_id,
o_d_id,o_w_id, o_c_id, o_entry_d, o_ol_cnt, o <br />_all_local) VALUES ($1, $2, $3, $4, $5, $6, $7) <br />2007-10-11
15:49:05.779BRT [46994] LOG:  duration: 29.787 ms  execute <unnamed>: SELECT i_price, i_name , i_data FROM item
WHEREi_id = $1<br />2007-10-11 15:49: 05.846 BRT [46994] LOG:  duration: 64.963 ms  execute <unnamed>: SELECT
s_quantity,s_data, s_dist_01, s_dist_02, s_dist_03, s_dist_04, s_d <br />ist_05,        s_dist_06, s_dist_07,
s_dist_08,s_dist_09, s_dist_10 FROM stock WHERE s_i_id = $1 AND s_w_id = $2 FOR UPDATE <br />2007-10-11 15:49:05.882
BRT[46994] LOG:  duration: 33.865 ms  execute <unnamed>: SELECT i_price, i_name , i_data FROM item WHERE i_id =
$1<br />2007-10-11 15:49:05.931 BRT [46994] LOG:  duration: 48.075 ms  execute <unnamed>: SELECT s_quantity,
s_data,s_dist_01, s_dist_02, s_dist_03, s_dist_04, s_d <br />ist_05,        s_dist_06, s_dist_07, s_dist_08, s_dist_09,
s_dist_10FROM stock WHERE s_i_id = $1 AND s_w_id = $2 FOR UPDATE <br />2007-10-11 15:49:05.962 BRT [46994] LOG: 
duration:28.782 ms  execute <unnamed>: SELECT i_price, i_name , i_data FROM item WHERE i_id = $1 <br />2007-10-11
15:49:06.002BRT [46994] LOG:  duration: 38.026 ms  execute <unnamed>: SELECT s_quantity, s_data, s_dist_01,
s_dist_02,s_dist_03, s_dist_04, s_d <br />ist_05,        s_dist_06, s_dist_07, s_dist_08, s_dist_09, s_dist_10 FROM
stockWHERE s_i_id = $1 AND s_w_id = $2 FOR UPDATE <br />2007-10-11 15:49:06.027 BRT [46994] LOG:  duration: 23.220 ms 
execute<unnamed>: SELECT i_price, i_name , i_data FROM item WHERE i_id = $1 <br />2007-10-11 15:49:06.043 BRT
[46994]LOG:  duration: 14.308 ms  execute <unnamed>: SELECT s_quantity, s_data, s_dist_01, s_dist_02, s_dist_03,
s_dist_04,s_d <br />ist_05,        s_dist_06, s_dist_07, s_dist_08, s_dist_09, s_dist_10 FROM stock WHERE s_i_id = $1
ANDs_w_id = $2 FOR UPDATE <br />.<br />.<br />.<br />.<br /><br /></span>Trying to answer my initial question, I am
doingsomething like this: <br /><br />1. Through the statements TIMESTAMP, I calculate the total time interval in which
allthe statments had been executed: (END TIMESTAMP(2007-10-11 15:49: 06.043) - BEGIN TIMESTAMP (2007-10-11
15:49:05.221))<br/>2. Then I sum all the 'duration' fields of all EXECUTOR Sections of all statements executed. This,
inthesis, give me the amount of time that the executor had been used during the total time interval in which the
statementshad been executed) <br /><br />The problem I have encountered is that the sum of executor's duration time is,
*sometimes*,bigger than the total time interval in which the statements had been executed!! And this makes no sense!<br
/><blockquoteclass="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0px 0px 0px 0.8ex;
padding-left:1ex;">In some tests I have done, I have found these results:<span class="q"><br />- Total Time interval
whichBenchmarkSQL submitted queries = 30 seconds <br /></span>- sum of executor's duration time during this period of
time= 2 minutes and 36 seconds!! (5x longer than total time) <br /><br /><br />How could the sum of duration time of
executorbe longer than the sum of duration of the statements that use it? <br /><br />Can anyone, please, help me to
solvethis? Many thanks, Camilo</blockquote></div><br /><hr />Veja mapas e encontre as melhores rotas para fugir do
trânsitocom o Live Search Maps! <a href="http://www.livemaps.com.br/index.aspx?tr=true" target="_new">Experimente
já!</a>

Re: URGENT HELP about 'duration' stats

От
Tom Lane
Дата:
Camilo Porto <camiloporto@hotmail.com> writes:
> The problem I have encountered is that the sum of executor's
> duration time is, *sometimes*, bigger than the total time interval in
> which the statements had been executed!! And this makes no sense!

Umm ... why not?  If you have, say, two queries executing in parallel
for 1 second, they'll each report a duration: of 1 second, thus summing
to 2 seconds, but the elapsed time was only 1 second.

If you don't see that always, then your benchmark program isn't trying
very hard to run more than one query in parallel ...
        regards, tom lane


Re: URGENT HELP about 'duration' stats

От
Camilo Porto
Дата:


[Camilo Porto]



> To: camiloporto@hotmail.com
> CC: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] URGENT HELP about 'duration' stats
> Date: Fri, 26 Oct 2007 23:06:22 -0400
> From: tgl@sss.pgh.pa.us
>
> Camilo Porto <camiloporto@hotmail.com> writes:
> > The problem I have encountered is that the sum of executor's
> > duration time is, *sometimes*, bigger than the total time interval in
> > which the statements had been executed!! And this makes no sense!
>
> Umm ... why not? If you have, say, two queries executing in parallel
> for 1 second, they'll each report a duration: of 1 second, thus summing
> to 2 seconds, but the elapsed time was only 1 second.
>
> If you don't see that always, then your benchmark program isn't trying
> very hard to run more than one query in parallel ...

This really make sense, but let me add some questions:

The parallelism happens even if my PC has only one processor?
Each query is executed in a separeted Thread?
I am simulating only 1 client with the Benchmark. Can 1 Client submit parallel queries, in single-processor enviroment?

Many Thanks
>
> regards, tom lane


Veja mapas e encontre as melhores rotas para fugir do trânsito com o Live Search Maps! Experimente já!

Re: URGENT HELP about 'duration' stats

От
"Filip Rembiałkowski"
Дата:
2007/10/27, Camilo Porto <camiloporto@hotmail.com>:

> The parallelism happens even if my PC has only one processor?

PostgreSQL creates a separate process for each client connection -
whether you have one CPU or more.

> Each query is executed in a separeted Thread?
No threads, at least on Unix. Regular processes (sometimes called backends).

> I am simulating only 1 client with the Benchmark. Can 1 Client submit
> parallel queries, in single-processor enviroment?
If this client uses two connections, you can run two queries in paralell.


this chapter of documentation can be helpful:
http://www.postgresql.org/docs/8.2/static/overview.html


Re: URGENT HELP about 'duration' stats

От
Camilo Porto
Дата:


[Camilo Porto]



> Date: Sat, 27 Oct 2007 18:12:28 +0100
> From: plk.zuber@gmail.com
> To: camiloporto@hotmail.com
> Subject: Re: [HACKERS] URGENT HELP about 'duration' stats
> CC: pgsql-hackers@postgresql.org
>
> 2007/10/27, Camilo Porto <camiloporto@hotmail.com>:
>
> > The parallelism happens even if my PC has only one processor?
>
> PostgreSQL creates a separate process for each client connection -
> whether you have one CPU or more.
>
> > Each query is executed in a separeted Thread?
> No threads, at least on Unix. Regular processes (sometimes called backends).
>
> > I am simulating only 1 client with the Benchmark. Can 1 Client submit
> > parallel queries, in single-processor enviroment?
> If this client uses two connections, you can run two queries in paralell.
    The client uses only 1 connection. In this situation is possible that the EXECUTOR's duration time become greater than the time period which the Query was observed? (as stated in my first topic)?
>
>
> this chapter of documentation can be helpful:
> http://www.postgresql.org/docs/8.2/static/overview.html


Receba GRÁTIS as mensagens do Messenger no seu celular quando você estiver offline. Conheça o MSN Mobile! Cadastre-se já!

Re: URGENT HELP about 'duration' stats

От
"Filip Rembiałkowski"
Дата:
2007/10/30, Camilo Porto <camiloporto@hotmail.com>:

> > > I am simulating only 1 client with the Benchmark. Can 1 Client submit
> > > parallel queries, in single-processor enviroment?
> > If this client uses two connections, you can run two queries in paralell.
>     The client uses only 1 connection. In this situation is possible that
> the EXECUTOR's duration time become greater than the time period which the
> Query was observed? (as stated in my first topic)?

I guess it's possible under some circumstances...
2007-10-30 16:07:00 GMT [123] LOG:  duration: 99000.000 ms  select longfunc()
2007-10-30 16:07:01 GMT [123] LOG:  duration: 1000.000 ms  select shortfunc()
interval is 1 second, sum of durations 100 seconds :)


AFAIK, timestamps in the front of each line are assigned by log
writer, ie. *in the moment of writing* to the log. I'd better trust
"duration: xxx ms " messages. they are calculated in backend directly.


In this log sample you showed us, the sum of durations is circa  625
ms. and the interval between first and last log entry is circa 822 ms.
If you have a test case which shows that much difference you speak of,
could you please present it here, along with your logging settings?



--
Filip Rembiałkowski

Re: URGENT HELP about 'duration' stats

От
Alvaro Herrera
Дата:
Camilo Porto wrote:

> In some tests I have done, I have found these results:
> - Total Time interval which BenchmarkSQL submitted queries = 30 seconds
> 
> - sum of executor's duration time during this period of time = 2
> minutes and 36 seconds!! (5x longer than total time) 

Are you sure you're not confusing the units?  The log you posted shows
milliseconds ...

Another thought is that some queries may count the time more than once.
For example if you submit an UPDATE that acts on a table with foreign
keys, and then the SELECT FOR UPDATE that the FK code does internally is
also logged, I think the UPDATE would include the time consumed in the
SELECT FOR UPDATE.  So if you just sum both, the total obtained would
include the SELECT FOR UPDATE twice.  This is just a wild guess; I'm not
sure if the queries for FKs checking are logged.

-- 
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Everybody understands Mickey Mouse. Few understand Hermann Hesse.
Hardly anybody understands Einstein. And nobody understands Emperor Norton."


Re: URGENT HELP about 'duration' stats

От
Camilo Porto
Дата:


[Camilo Porto]



> Date: Tue, 30 Oct 2007 16:09:01 +0000
> From: plk.zuber@gmail.com
> To: camiloporto@hotmail.com
> Subject: Re: [HACKERS] URGENT HELP about 'duration' stats
> CC: pgsql-hackers@postgresql.org
>
> 2007/10/30, Camilo Porto <camiloporto@hotmail.com>:
>
> > > > I am simulating only 1 client with the Benchmark. Can 1 Client submit
> > > > parallel queries, in single-processor enviroment?
> > > If this client uses two connections, you can run two queries in paralell.
> > The client uses only 1 connection. In this situation is possible that
> > the EXECUTOR's duration time become greater than the time period which the
> > Query was observed? (as stated in my first topic)?
>
> I guess it's possible under some circumstances...
> 2007-10-30 16:07:00 GMT [123] LOG: duration: 99000.000 ms select longfunc()
> 2007-10-30 16:07:01 GMT [123] LOG: duration: 1000.000 ms select shortfunc()
> interval is 1 second, sum of durations 100 seconds :)
>
>
> AFAIK, timestamps in the front of each line are assigned by log
> writer, ie. *in the moment of writing* to the log. I'd better trust
> "duration: xxx ms " messages. they are calculated in backend directly.
>
>
> In this log sample you showed us, the sum of durations is circa 625
> ms. and the interval between first and last log entry is circa 822 ms.
> If you have a test case which shows that much difference you speak of,
> could you please present it here, along with your logging settings?

I tried to generate a log file that could present the above situation, but was not sucessfully! This case happens when I wa using version 7.4 os PostgreSQL. Using 8.2.5, the problem (with 1 client and 1 connection) appear not to happen anymore! I've just tried with more than one client (1 connection per client) and the problem turn to appear. But now this make sense (considering the comments submitted to this topic).

Well, I will think about how can I calculate the % of time the executor passing executing queries along a period of time.

I would be grateful if someone could help me to resolve this...

Many Thanks!!!

Camilo
>
>
>
> --
> Filip Rembiałkowski


Conheça o Windows Live Spaces, o site de relacionamentos do Messenger! Crie já o seu!