Обсуждение: Help with duration of statement: EXECUTE [PREPARE: COMMIT]

Поиск
Список
Период
Сортировка

Help with duration of statement: EXECUTE [PREPARE: COMMIT]

От
Eric Comeau
Дата:
We currently have

   log_min_duration_statement = 5000

and are seeing statements like the following logged

2010-10-16 05:55:52 EDT [6334]: [1-1] LOG:  duration: 5572.517 ms
statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
2010-10-16 06:06:24 EDT [26856]: [1-1] LOG:  duration: 5617.866 ms
statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
2010-10-16 06:06:24 EDT [20740]: [13-1] LOG:  duration: 5210.190 ms
statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
2010-10-16 08:24:06 EDT [8743]: [1-1] LOG:  duration: 6487.346 ms
statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]

Questions;

1) What do these statements mean?
2) How do I dig deeper to determine why they are taking longer than 5 secs.

Version Info -->

select version();
                                                   version

------------------------------------------------------------------------------------------------------------
  PostgreSQL 8.1.17 on x86_64-unknown-linux-gnu, compiled by GCC gcc
(GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)

Re: Help with duration of statement: EXECUTE [PREPARE: COMMIT]

От
Tom Lane
Дата:
Eric Comeau <ecomeau@signiant.com> writes:
> 2010-10-16 05:55:52 EDT [6334]: [1-1] LOG:  duration: 5572.517 ms
> statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
> 2010-10-16 06:06:24 EDT [26856]: [1-1] LOG:  duration: 5617.866 ms
> statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
> 2010-10-16 06:06:24 EDT [20740]: [13-1] LOG:  duration: 5210.190 ms
> statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
> 2010-10-16 08:24:06 EDT [8743]: [1-1] LOG:  duration: 6487.346 ms
> statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]

> Questions;

> 1) What do these statements mean?

They appear to be COMMIT commands.  (It's pretty stupid to be using the
PREPARE/EXECUTE machinery to execute a COMMIT, but that's evidently what
your client-side code is doing.)

> 2) How do I dig deeper to determine why they are taking longer than 5 secs.

My guess would be overstressed disk subsystem.  A COMMIT doesn't require
much except fsync'ing the commit WAL record down to disk ... but if the
disk is too busy to process that request quickly, you might have to
wait.  It also seems possible that the filesystem is interlocking the
fsync on the WAL file with previous writes to other files.  Anyway,
watching things with vmstat or iostat to see if there's an activity
spike when this is happening would confirm or refute that idea.

[ thinks for a bit ... ]  Actually, it's possible that the COMMIT
command is doing nontrivial work before it can really commit.  Perhaps
you have deferred foreign keys to check?

            regards, tom lane

Re: Help with duration of statement: EXECUTE [PREPARE: COMMIT]

От
Mladen Gogala
Дата:
Tom Lane wrote:
> My guess would be overstressed disk subsystem.  A COMMIT doesn't require
> much except fsync'ing the commit WAL record down to disk ...
Doesn't the "commit" statement also release all the locks held by the
transaction?

--

Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
http://www.vmsinfo.com
The Leader in Integrated Media Intelligence Solutions




Re: Help with duration of statement: EXECUTE [PREPARE: COMMIT]

От
Tom Lane
Дата:
Mladen Gogala <mladen.gogala@vmsinfo.com> writes:
> Tom Lane wrote:
>> My guess would be overstressed disk subsystem.  A COMMIT doesn't require
>> much except fsync'ing the commit WAL record down to disk ...

> Doesn't the "commit" statement also release all the locks held by the
> transaction?

Yeah, and there's a nontrivial amount of other cleanup too, but it all
amounts to just changes in in-memory data structures.  I don't see that
taking five seconds, especially not if commits of similar transactions
usually take much less than that.

            regards, tom lane

Re: Help with duration of statement: EXECUTE [PREPARE: COMMIT]

От
Eric Comeau
Дата:
On 10-10-18 11:02 AM, Tom Lane wrote:
> Mladen Gogala<mladen.gogala@vmsinfo.com>  writes:
>> Tom Lane wrote:
>>> My guess would be overstressed disk subsystem.  A COMMIT doesn't require
>>> much except fsync'ing the commit WAL record down to disk ...
>
>> Doesn't the "commit" statement also release all the locks held by the
>> transaction?
>
> Yeah, and there's a nontrivial amount of other cleanup too, but it all
> amounts to just changes in in-memory data structures.  I don't see that
> taking five seconds, especially not if commits of similar transactions
> usually take much less than that.
>
>             regards, tom lane
>

Thanks for the info. The system is a QA system under load. It is running
200 jobs per minute, so yes the disk it stressed. Our application
bundles PG into its install and installs the app and database all on the
same filesystem. The QA folks probably have lots of logging turned on as
well.

I am not sure what front-end client is doing the prepare/execute on a
commit - I found it strange, I'm glad someone else does as well.

The web app is using jboss with connection pooling, but there is a
scheduler built in C using libpq as well.

Thanks for the hint on deferred fk, I'll check, but I think if that was
the case it would be happening much more often - like maybe almost all
commits for this transaction type.

The OS is RH 5.2 64-bit, and I'm surprised they don't have iostat
installed on it by default. There is vmstat. The load avg is

06:36:49 up 28 days, 15:20,  5 users,  load average: 19.44, 22.59, 22.50

Okay - I'm starting to see other stmts other than just commits taking
longer than 5 secs sometimes as well now - stress test has been running
for 3 days now...some commits 17 and 15 secs ouch...

  2010-10-19 05:44:35 EDT [11760]: [10-1] LOG:  duration: 17137.425 ms
statement: commit
2010-10-19 05:44:36 EDT [10704]: [14-1] LOG:  duration: 14928.903 ms
statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
2010-10-19 05:44:36 EDT [12535]: [1-1] LOG:  duration: 13241.032 ms
statement: EXECUTE <unnamed>  [PREPARE:  update scheduled_job set
active_filename=$1, active_state=$2, begin_time=$3, changed_by=$4,
changed_on=$5, created_by=$6, created_on=$7, current_run=$8,
deferred_time=$9, deleted=$10, end_time=$11, expire_at=$12,
frequency_spec=$13, job_class=$14, contract_id=$15, job_name=$16,
last_active_status_msg=$17, last_exit_code=$18, package_id=$19,
perc_denominator=$20, perc_numerator=$21, retry_at=$22,
scheduled_at=$23, scheduled_state=$24, start_at=$25, states_list=$26,
timezone=$27, total_runs=$28 where id=$29]
2010-10-19 05:44:41 EDT [11760]: [11-1] LOG:  duration: 6000.118 ms
statement: commit
2010-10-19 05:44:49 EDT [10704]: [15-1] LOG:  duration: 13804.450 ms
statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
2010-10-19 05:44:49 EDT [12535]: [2-1] LOG:  duration: 13807.317 ms
statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
2010-10-19 05:45:00 EDT [11760]: [12-1] LOG:  duration: 18879.010 ms
statement: commit
2010-10-19 05:45:18 EDT [10704]: [16-1] LOG:  duration: 28177.626 ms
statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]
2010-10-19 05:45:20 EDT [11760]: [13-1] LOG:  duration: 19740.822 ms
statement: commit
2010-10-19 05:45:20 EDT [13093]: [1-1] LOG:  duration: 20828.412 ms
statement: EXECUTE <unnamed>  [PREPARE:  COMMIT]


I do not have a vmstat to look at from when the stmts above executed,
wish I did, here is vmstat 5, now but at this time everything is
executing under 5 secs... procs -----------memory---------- ---swap--
-----io---- --system-- -----cpu------
  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy
id wa st
65  0   3232 340480 166812 2212884    0    0     0  1519 1676 50841 70
30  0  0  0
57  0   3232 273332 166856 2212796    0    0     0  1157 1609 52887 69
31  0  0  0
73  0   3232 320668 166884 2212668    0    0     0   781 1458 53420 70
30  0  0  0
44  0   3232 393240 166900 2213272    0    0    43  1336 1578 53155 70
30  0  0  0
42  0   3232 349176 166928 2213244    0    0     2   656 1449 52006 70
30  0  0  0
35  0   3232 299320 166972 2213436    0    0     3  1312 1582 51126 75
25  0  0  0
68  0   3232 265868 167012 2213420    0    0     0   739 1484 51982 74
26  0  0  0
42  0   3232 234672 167048 2212440    0    0     2   772 1550 50536 74
26  0  0  0
72  0   3232 252232 167080 2213004    0    0     0  1192 1616 48063 77
23  0  0  0
56  0   3232 336852 167112 2213220    0    0     0   699 1433 50655 78
22  0  0  0
38  0   3232 302212 167148 2213380    0    0     0   786 1578 49895 76
24  0  0  0
61  0   3232 381884 167180 2213260    0    0     6   943 1525 46474 77
23  0  0  0
66  0   3232 366568 167216 2213716    0    0     0  1150 1491 39232 82
18  0  0  0
93  0   3232 343792 167232 2213680    0    0     2   946 1504 39030 82
18  0  0  0
66  0   3232 377376 167268 2213260    0    0     0   954 1427 37206 84
16  0  0  0
60  0   3232 319552 167288 2212952    0    0     0   385 1365 34413 83
17  0  0  0
53  0   3232 320024 167400 2213184    0    0     2  3119 1576 33904 81
19  0  0  0
42  0   3232 256116 167432 2213716    0    0     0  1062 1501 32128 85
15  0  0  0
11  0   3232 783712 167480 2224604    0    0     2  3219 1499 33598 79
21  0  0  0
52  0   3232 828444 167520 2224668    0    0     3  1129 1429 40321 70
30  0  0  0
29  0   3232 933804 167548 2224828    0    0     8  1197 1384 37422 71
29  0  0  0
33  0   3232 974348 167560 2224956    0    0     0  1049 1438 34956 70
31  0  0  0
31  0   3232 941496 167588 2224956    0    0     0   716 1346 34662 68
32  0  0  0
40  0   3232 846540 167616 2225032    0    0     0   758 1426 35924 70
30  0  0  0


Something I just realized is they are running LVM as well - and I'm not
very up on LVM - here is lvs output
lvs --aligned
   LV       VG         Attr   LSize   Origin Snap%  Move Log Copy%  Convert
   LogVol00 VolGroup00 -wi-ao 146.97G
   LogVol01 VolGroup00 -wi-ao   1.94G





Re: Help with duration of statement: EXECUTE [PREPARE: COMMIT]

От
Tom Lane
Дата:
Eric Comeau <ecomeau@signiant.com> writes:
> Okay - I'm starting to see other stmts other than just commits taking
> longer than 5 secs sometimes as well now - stress test has been running
> for 3 days now...some commits 17 and 15 secs ouch...

If it's not just commits then some of the stranger theories go away.
I think you're almost certainly looking at overstressed disk.

It'd be worth your while to turn on checkpoint logging and see if
the slow operations occur during checkpoints.  If so, you might be
able to ameliorate things by tweaking the checkpoint parameters
to spread out checkpoint I/O load some more.  But the bottom line
here is that you haven't got much headroom between your application's
needs and the max throughput available from your disk.  Better disk
hardware will be the only real cure.

            regards, tom lane