Discussion: 7.2.1 optimises very badly against 7.2
Samuel Liddicott http://www.ananova.com
Support Consultant
sam@ananova.com
On Fri, 28 Jun 2002, Sam Liddicott wrote:

> I have a 10GB database which serves up customised TV listings selections at
> www.ananova.com/tv (see "Personalise Listings" in the left column).
>
> We had it running very well under Postgres 7.2 on a 4x Pentium 700MHz box
> with 8GB RAM.
>
> A personalised selection from "start" to "end" of 7 channels [url1 below]
> takes a fraction of a second to do the query, and if I'm lucky enough to
> spot it on "top" it uses a low CPU percentage.
>
> Under 7.2.1 it takes 99% CPU for between 5 and 9 seconds.
> Rolling back to 7.2 is also very slow unless we vacuum analyse after rolling
> back; then it is very fast again.
> [We normally vacuum analyse every 24 hours.]
>
> We have max_connections=128, 4520 shared_buffers, 91268 sort_mem, and we
> collect row-level and block-level stats.

I'm thinking that you need to run some queries with EXPLAIN ANALYZE to see
what the planner thinks it should do and how its estimates differ from
reality under both 7.2 and 7.2.1.

> I also have this anecdotal information:
> we installed 7.2.1 on a fresh box and then restored a pg_dump of the tv
> database and found it always very slow; stopping Postgres and transferring
> the binary DB files and restarting was very fast, but it degraded slowly
> over a few days.

I know nothing of the filesystem structure, but that sounds odd, as if
something introduced in 7.2.1 is inherently slow. I assume there are also
data loads over those few days.

> So I imagine something about stats gathering and use changed with 7.2.1
> (I hear it has a new optimiser).

Well, you'll be able to see whether it's the stats that are causing this by
doing EXPLAINs and comparing pg_stats for the 7.2 and pg_restored 7.2.1
databases. You could also do the binary transfer from 7.2 to 7.2.1 again and
check the EXPLAINs and pg_stats immediately afterwards, and then again after
it has slowed down. This is probably most useful if there are _no_ data loads
in the meantime.

> The query we do is complex and really does need a good optimiser (which is
> why we don't use MySQL), as it has to join programmes against broadcasts
> (restricted by time) to channels (restricted to interested channels). It
> has to be careful not to initially start with all broadcasts of an
> interested channel, nor with all broadcasts on the interested channels.
>
> [url1, 7 channels from midnight to about 6:00am the day after]
> http://www.ananova.com/tv_listings/tv_mainlisting.html?day=day1&start=Start&end=End&tvregion=n_14&i=1&p=0&h=&c=12.25.36.53.54.86.33&S=1

Interesting that your stuff normally completes so quickly. I worked on a
large TV listings site that had complex queries and ran on Oracle; its
queries mostly took ages to run. I didn't design the system, by the way,
although I did have a hand in some lucky chap winning two or three short
breaks one week when we tweaked a slow query and it turned out to be slightly
more complex than we thought.

--
Nigel J. Andrews
Director, Logictree Systems Limited
Computer Consultants
"Sam Liddicott" <sam.liddicott@ananova.com> writes:
> Under 7.2.1 it takes 99% CPU for between 5-9 seconds.
> Rolling back to 7.2 is also very slow unless we vacuum analyse after rolling
> back, then it is very fast again.
> [We normally vacuum analyse every 24 hours]
AFAIK, the only change from 7.2 to 7.2.1 that would be likely to have
anything to do with this was a rework of the code in ANALYZE that
estimates the number of distinct values in a column. (See pghackers
archives from around 18-Feb.) Although the revised version did better
on some test cases, it sounds like it's doing worse for you.
As Nigel commented, we can't do much without seeing EXPLAIN ANALYZE
results. I'd also like to see the pg_stats entries for the table(s)
used by the query, as produced by both 7.2 ANALYZE and 7.2.1 ANALYZE.
regards, tom lane
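To make the request concrete, here is a minimal sketch of the kind of
commands being asked for. The simplified join is only an illustration (the
real query appears later in the thread); the table and column names are
taken from that query.

    -- On each server, after its own ANALYZE, time the query:
    EXPLAIN ANALYZE
    SELECT b.id, b."start", b.duration
    FROM broadcast b, channelregion c
    WHERE b.channelregionid = c.id
      AND c.id IN (2, 20, 41, 53, 54, 733, 734, 86, 33);

    -- ...and capture the statistics ANALYZE produced for the join column:
    SELECT tablename, attname, null_frac, avg_width, n_distinct,
           most_common_vals, most_common_freqs, histogram_bounds, correlation
    FROM pg_stats
    WHERE tablename = 'broadcast'
      AND attname = 'channelregionid';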
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 28 June 2002 15:13
> To: Sam Liddicott
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2
>
> As Nigel commented, we can't do much without seeing EXPLAIN ANALYZE
> results.  I'd also like to see the pg_stats entries for the table(s)
> used by the query, as produced by both 7.2 ANALYZE and 7.2.1 ANALYZE.

Thanks for your comments (and everyone else's). I'll get these stats and the
explain analyse output sent out on Monday and we can have a good look. Just
doing some swig-java work for my boss at the moment.

Sam
A bit late, but here is the explain analyse data for the problem where a
complex DB produced very quick results with low load under Postgres 7.2 but
takes many, many times longer under Postgres 7.2.1.

I notice that this part is missing from the 7.2.1 (slow) explain analyse:
"Index Scan using idx_broadcast_channelregionid on broadcast"
Here is the query:
SELECT distinct
channelregion."id",
channelregion."customtitle",
channelregion."title" as channeltitle,
channelregion."tag" as channeltag,
channelregion."distributionid",
channelregion."channelid",
distribution."description",
broadcast."id" as broadcastid,
broadcast."groupid",
broadcast."duration",
broadcast."start" as stime,
broadcast."stereo",
broadcast."subtitles" as subtitle,
broadcast."repeat",
broadcast."blackandwhite" as bw,
broadcast."premiere",
broadcast."surround",
broadcast."widescreen",
broadcast."followon",
episode."id" as episodeid,
episode."title" as title,
episode."seriestitle" as seriestitle,
episode."categories",
episode."episodename",
episode."episodereference",
episode."episodenumber",
episode."episodecount",
episode."detail0",
episode."detail1",
episode."detail2",
episode."created" as filmyear
INTO TEMPORARY TABLE selection
FROM
"channelregion" channelregion,
"broadcast" broadcast,
"distribution" distribution,
"episode" episode
WHERE
broadcast.channelregionid=channelregion.id AND
channelregion."distributionid" = distribution."id" AND
broadcast.episode=episode.id AND
(((broadcast.start+broadcast.duration)>1026120300) AND
(broadcast.sourcekey<=20020708) AND
((channelregion.id in (2,20,41,53,54,733,734,86,33))))
ORDER BY
broadcast."start" ASC;
Here is explain analyse on a postgres 7.2.1 box:
Unique (cost=99202.15..99607.55 rows=523 width=279) (actual
time=7932.43..7936.36 rows=276 loops=1)
-> Sort (cost=99202.15..99202.15 rows=5231 width=279) (actual
time=7932.41..7932.73 rows=276 loops=1)
-> Nested Loop (cost=78.02..98879.06 rows=5231 width=279) (actual
time=2779.61..7926.74 rows=276 loops=1)
-> Hash Join (cost=78.02..74013.87 rows=5231 width=119)
(actual time=2778.98..7886.85 rows=276 loops=1)
-> Seq Scan on broadcast (cost=0.00..70871.32
rows=399885 width=35) (actual time=2768.80..7851.94 rows=8019 loops=1)
-> Hash (cost=78.00..78.00 rows=9 width=84) (actual
time=9.56..9.56 rows=0 loops=1)
-> Hash Join (cost=1.09..78.00 rows=9 width=84)
(actual time=1.73..9.53 rows=9 loops=1)
-> Index Scan using channelregion_pkey,
channelregion_pkey, channelregion_pkey, channelregion_pkey,
channelregion_pkey, channelregion_pkey, channelregion_pkey,
channelregion_pkey, channelregion_pkey on channelregion (cost=0.00..76.75
rows=9 width=60) (actual time=0.48..8.00 rows=9 loops=1)
-> Hash (cost=1.07..1.07 rows=7 width=24)
(actual time=0.19..0.19 rows=0 loops=1)
-> Seq Scan on distribution
(cost=0.00..1.07 rows=7 width=24) (actual time=0.14..0.17 rows=7 loops=1)
-> Index Scan using episode_pkey on episode (cost=0.00..4.74
rows=1 width=160) (actual time=0.09..0.11 rows=1 loops=276)
Total runtime: 8009.97 msec
Here is explain analyse on a 7.2 box:
Unique (cost=13355.63..13416.75 rows=79 width=278) (actual
time=525.79..529.63 rows=276 loops=1)
-> Sort (cost=13355.63..13355.63 rows=789 width=278) (actual
time=525.78..526.07 rows=276 loops=1)
-> Nested Loop (cost=1.09..13317.68 rows=789 width=278) (actual
time=5.32..520.46 rows=276 loops=1)
-> Nested Loop (cost=1.09..9749.11 rows=789 width=119)
(actual time=5.07..481.22 rows=276 loops=1)
-> Hash Join (cost=1.09..69.44 rows=9 width=84)
(actual time=1.24..3.89 rows=9 loops=1)
-> Index Scan using channelregion_pkey,
channelregion_pkey, channelregion_pkey, channelregion_pkey,
channelregion_pkey, channelregion_pkey, channelregion_pkey,
channelregion_pkey, channelregion_pkey on channelregion (cost=0.00..68.20
rows=9 width=60) (actual time=0.08..2.17 rows=9 loops=1)
-> Hash (cost=1.07..1.07 rows=7 width=24)
(actual time=0.10..0.10 rows=0 loops=1)
-> Seq Scan on distribution
(cost=0.00..1.07 rows=7 width=24) (actual time=0.04..0.07 rows=7 loops=1)
-> Index Scan using idx_broadcast_channelregionid on
broadcast (cost=0.00..1080.72 rows=88 width=35) (actual time=3.80..52.47
rows=31 loops=9)
-> Index Scan using episode_pkey on episode (cost=0.00..4.51
rows=1 width=159) (actual time=0.09..0.11 rows=1 loops=276)
Total runtime: 598.78 msec
"Sam Liddicott" <sam.liddicott@ananova.com> writes:
> I notice that this part is missing from the 7.2.1 (slow) explain analyse:
> "Index Scan using idx_broadcast_channelregionid on broadcast"
Indeed. What do 7.2 and 7.2.1 have in the pg_stats row for
broadcast.channelregionid? What is the real distribution of that
column?
regards, tom lane
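The per-value counts posted further down were presumably produced with
something like the following; the exact query is not shown in the thread,
so this is only a guess at it.

    -- real distribution of the join column
    SELECT channelregionid, count(*)
    FROM broadcast
    GROUP BY channelregionid
    ORDER BY count(*) DESC;

    -- true number of distinct values, for comparison with n_distinct
    SELECT count(DISTINCT channelregionid) FROM broadcast;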
"Sam Liddicott" <sam.liddicott@ananova.com> writes:
>> Indeed. What do 7.2 and 7.2.1 have in the pg_stats row for
>> broadcast.channelregionid? What is the real distribution of that
>> column?
> 7.2 says:
> [snip]
Well, that's annoying: the new ANALYZE code is in fact more accurate
than the old (it's estimating 429 distinct values, vs. formerly 4532,
when the true figure is 636) but it's nonetheless making a worse final
choice of plan.
If you turn off enable_seqscan, what EXPLAIN results do you get from
7.2.1? How about if you leave enable_seqscan on, but reduce
random_page_cost from the default 4.0 to perhaps 3.0?
regards, tom lane
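Both experiments can be run per-session without touching the config file.
Roughly (a sketch; the placeholder stands for the full listings query shown
earlier):

    -- Experiment 1: forbid sequential scans for this session only
    SET enable_seqscan = off;
    EXPLAIN ANALYZE <the listings query>;
    SET enable_seqscan = on;

    -- Experiment 2: make random I/O look cheaper relative to sequential I/O
    SET random_page_cost = 3.0;
    EXPLAIN ANALYZE <the listings query>;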
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 08 July 2002 16:22
> To: Sam Liddicott
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2
>
>
> "Sam Liddicott" <sam.liddicott@ananova.com> writes:
> > I notice that this part is missing from the 7.2.1 (slow) explain analyse:
> > "Index Scan using idx_broadcast_channelregionid on broadcast"
>
> Indeed. What do 7.2 and 7.2.1 have in the pg_stats row for
> broadcast.channelregionid? What is the real distribution of that
> column?
7.2 says:
tv=# select * from pg_stats where attname='channelregionid';
tablename         | broadcast
attname           | channelregionid
null_frac         | 0
avg_width         | 4
n_distinct        | 4532
most_common_vals  | {54,81,2,22,1,4,645,76,53,23}
most_common_freqs | {0.0376667,0.0346667,0.0293333,0.029,0.0273333,0.024,0.0236667,0.019,0.0186667,0.018}
histogram_bounds  | {3,16,36,49,90,170,231,425,494,659,747}
correlation       | -0.155299
(1 row)
The real distribution of the column is:
channelregionid | count
-----------------+-------
418 | 11698
588 | 8677
417 | 8331
138 | 7435
170 | 7336
219 | 6729
701 | 6585
184 | 6584
218 | 6537
109 | 6479
195 | 6367
734 | 6341
67 | 6235
33 | 5902
615 | 5900
707 | 5899
136 | 5896
227 | 5806
49 | 5754
414 | 5714
1 | 5710
122 | 5646
413 | 5629
48 | 5603
2 | 5593
415 | 5586
3 | 5581
34 | 5579
186 | 5565
13 | 5547
15 | 5546
11 | 5545
12 | 5545
18 | 5545
713 | 5545
9 | 5544
14 | 5544
4 | 5542
10 | 5542
17 | 5541
412 | 5541
16 | 5539
92 | 5493
39 | 5489
35 | 5393
612 | 5371
99 | 5346
678 | 5333
659 | 5304
45 | 5287
46 | 5287
85 | 5286
102 | 5269
705 | 5267
215 | 5252
190 | 5249
709 | 5247
47 | 5234
44 | 5221
36 | 5216
194 | 5210
38 | 5188
698 | 5187
661 | 5136
37 | 5134
40 | 5128
41 | 5114
663 | 5081
82 | 5068
42 | 5051
19 | 5036
81 | 5022
95 | 5019
141 | 4996
54 | 4984
52 | 4980
20 | 4979
25 | 4978
27 | 4978
24 | 4977
29 | 4977
31 | 4977
724 | 4977
22 | 4976
23 | 4976
26 | 4976
660 | 4976
30 | 4975
32 | 4975
420 | 4975
185 | 4966
43 | 4958
21 | 4933
149 | 4756
53 | 4692
480 | 4663
76 | 4652
91 | 4606
134 | 4577
89 | 4536
168 | 4507
700 | 4506
487 | 4499
200 | 4381
222 | 4379
617 | 4329
71 | 4250
613 | 4199
83 | 4198
128 | 4127
130 | 4076
188 | 4070
703 | 4060
197 | 4042
169 | 4025
706 | 4018
129 | 3972
66 | 3944
112 | 3851
704 | 3849
641 | 3809
232 | 3708
622 | 3696
133 | 3695
110 | 3649
221 | 3549
645 | 3484
183 | 3441
634 | 3404
738 | 3399
682 | 3376
123 | 3352
166 | 3346
90 | 3283
64 | 3258
84 | 3248
58 | 3237
631 | 3214
636 | 3180
635 | 3179
639 | 3176
640 | 3176
177 | 3144
710 | 3142
478 | 3124
203 | 3121
172 | 3066
733 | 3061
192 | 3051
214 | 3051
633 | 2962
632 | 2923
722 | 2909
171 | 2698
702 | 2695
107 | 2685
161 | 2658
485 | 2622
696 | 2598
638 | 2568
474 | 2559
275 | 2549
274 | 2546
451 | 2489
637 | 2486
619 | 2470
155 | 2406
433 | 2373
216 | 2334
431 | 2329
231 | 2279
241 | 2261
63 | 2253
605 | 2233
150 | 2227
114 | 2091
223 | 2048
606 | 2047
139 | 2036
73 | 2031
120 | 2020
668 | 2020
96 | 1984
68 | 1977
657 | 1976
365 | 1949
608 | 1940
368 | 1900
8 | 1888
187 | 1864
86 | 1830
70 | 1817
50 | 1813
175 | 1808
124 | 1806
69 | 1802
367 | 1801
119 | 1795
144 | 1791
178 | 1774
125 | 1753
174 | 1728
143 | 1724
74 | 1680
278 | 1666
422 | 1659
379 | 1644
369 | 1595
313 | 1594
535 | 1594
261 | 1553
154 | 1552
435 | 1523
359 | 1505
308 | 1495
530 | 1494
534 | 1493
543 | 1490
542 | 1487
111 | 1481
461 | 1481
249 | 1476
620 | 1476
602 | 1475
614 | 1469
153 | 1463
228 | 1459
87 | 1457
536 | 1451
289 | 1434
601 | 1421
524 | 1418
525 | 1418
512 | 1383
513 | 1383
375 | 1373
560 | 1373
518 | 1372
245 | 1370
521 | 1369
495 | 1367
493 | 1366
494 | 1366
454 | 1364
561 | 1364
505 | 1363
56 | 1358
496 | 1358
544 | 1356
284 | 1353
77 | 1349
78 | 1348
79 | 1348
80 | 1348
545 | 1347
540 | 1342
541 | 1342
537 | 1337
358 | 1334
618 | 1310
556 | 1299
557 | 1299
349 | 1290
366 | 1282
712 | 1280
425 | 1279
528 | 1276
529 | 1276
572 | 1276
568 | 1272
508 | 1271
509 | 1271
514 | 1257
727 | 1257
515 | 1256
362 | 1255
396 | 1254
603 | 1254
342 | 1249
479 | 1248
486 | 1248
554 | 1247
564 | 1246
565 | 1246
394 | 1239
229 | 1237
582 | 1232
570 | 1230
571 | 1230
292 | 1227
321 | 1227
286 | 1222
287 | 1222
510 | 1222
511 | 1222
580 | 1220
266 | 1218
531 | 1217
716 | 1213
546 | 1211
547 | 1211
491 | 1210
492 | 1210
374 | 1203
472 | 1203
563 | 1203
462 | 1199
500 | 1194
584 | 1193
499 | 1188
562 | 1188
731 | 1185
742 | 1184
437 | 1182
555 | 1182
280 | 1172
720 | 1170
581 | 1168
717 | 1168
732 | 1162
432 | 1160
242 | 1156
548 | 1151
549 | 1151
579 | 1151
260 | 1150
567 | 1149
569 | 1149
578 | 1148
428 | 1147
532 | 1146
559 | 1145
438 | 1144
621 | 1143
371 | 1138
333 | 1137
522 | 1137
533 | 1135
523 | 1132
558 | 1132
538 | 1126
539 | 1126
489 | 1124
714 | 1121
427 | 1115
506 | 1114
735 | 1107
59 | 1104
517 | 1104
430 | 1101
255 | 1099
336 | 1087
516 | 1084
652 | 1077
383 | 1076
387 | 1072
285 | 1071
251 | 1069
699 | 1069
322 | 1067
552 | 1067
553 | 1067
309 | 1063
473 | 1061
550 | 1061
551 | 1061
497 | 1060
498 | 1060
697 | 1060
385 | 1056
470 | 1046
256 | 1044
282 | 1044
296 | 1044
299 | 1044
314 | 1044
456 | 1044
650 | 1044
381 | 1042
463 | 1040
477 | 1040
335 | 1036
402 | 1036
471 | 1034
55 | 1033
646 | 1033
360 | 1031
643 | 1031
653 | 1031
279 | 1026
320 | 1026
352 | 1023
331 | 1021
364 | 1020
356 | 1018
465 | 1016
574 | 1016
464 | 1014
673 | 1014
103 | 1013
234 | 1013
334 | 1013
380 | 1013
295 | 1008
332 | 1006
263 | 1004
482 | 1004
585 | 1004
353 | 1000
361 | 998
401 | 997
484 | 996
294 | 993
217 | 987
156 | 986
246 | 986
312 | 986
311 | 985
376 | 984
399 | 984
377 | 983
594 | 982
330 | 981
692 | 978
271 | 977
267 | 976
270 | 976
272 | 976
277 | 976
326 | 975
575 | 972
233 | 966
298 | 966
305 | 966
424 | 966
649 | 966
235 | 965
459 | 963
526 | 962
372 | 960
408 | 959
527 | 954
319 | 947
599 | 947
651 | 947
340 | 945
373 | 945
577 | 945
403 | 944
469 | 943
327 | 938
455 | 937
719 | 936
158 | 931
236 | 926
258 | 926
276 | 926
488 | 926
586 | 926
476 | 925
388 | 924
501 | 924
502 | 924
409 | 919
573 | 918
744 | 917
264 | 906
445 | 904
443 | 903
283 | 902
442 | 902
444 | 900
407 | 896
339 | 890
252 | 889
247 | 888
503 | 888
253 | 886
273 | 886
589 | 886
583 | 884
576 | 882
239 | 880
607 | 877
406 | 876
220 | 875
386 | 873
440 | 872
329 | 871
384 | 871
350 | 870
405 | 870
708 | 870
250 | 868
604 | 868
392 | 864
429 | 864
140 | 862
248 | 859
458 | 858
393 | 854
439 | 852
357 | 850
595 | 850
262 | 849
269 | 848
304 | 848
318 | 848
647 | 848
723 | 848
354 | 847
268 | 846
281 | 846
648 | 846
240 | 845
225 | 842
325 | 839
224 | 838
146 | 836
441 | 833
389 | 832
664 | 832
145 | 829
481 | 829
315 | 828
644 | 827
346 | 825
328 | 823
404 | 812
475 | 812
348 | 810
378 | 809
600 | 805
390 | 792
466 | 792
747 | 792
341 | 790
426 | 790
593 | 789
316 | 788
301 | 773
457 | 772
591 | 772
592 | 772
288 | 770
587 | 758
597 | 758
460 | 752
590 | 752
291 | 745
436 | 739
254 | 731
683 | 731
244 | 730
715 | 721
324 | 715
721 | 708
297 | 690
654 | 690
310 | 673
338 | 672
382 | 641
237 | 632
693 | 632
448 | 629
355 | 600
370 | 600
259 | 592
118 | 588
238 | 572
351 | 494
395 | 456
290 | 454
265 | 453
300 | 434
655 | 424
656 | 417
303 | 414
323 | 410
446 | 399
179 | 366
293 | 363
658 | 360
363 | 351
230 | 338
756 | 337
642 | 336
116 | 332
691 | 306
307 | 296
317 | 296
306 | 276
257 | 215
159 | 168
181 | 163
180 | 160
737 | 144
60 | 138
62 | 138
93 | 138
100 | 138
101 | 138
104 | 138
151 | 138
204 | 138
205 | 138
206 | 138
207 | 138
208 | 138
209 | 138
210 | 138
211 | 138
212 | 138
213 | 138
410 | 138
411 | 138
616 | 138
121 | 137
749 | 136
182 | 135
337 | 135
596 | 135
450 | 134
189 | 131
449 | 119
447 | 114
751 | 90
142 | 81
745 | 67
743 | 66
711 | 56
391 | 42
694 | 24
137 | 19
695 | 13
736 | 11
(636 rows)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
7.2.1 says:
tv=# select * from pg_stats where attname='channelregionid';
tablename         | broadcast
attname           | channelregionid
null_frac         | 0
avg_width         | 4
n_distinct        | 429
most_common_vals  | {2,54,76,4,81,1,37,3,22,487}
most_common_freqs | {0.0326667,0.032,0.029,0.028,0.0273333,0.0253333,0.0253333,0.0233333,0.0226667,0.021}
histogram_bounds  | {8,24,38,64,91,183,250,428,567,659,756}
correlation       | -0.19478
(1 row)
[snip: the per-channelregionid counts are identical to the 636-row listing
shown above for 7.2]
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 08 July 2002 21:44
> To: Sam Liddicott
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2
>
> If you turn off enable_seqscan, what EXPLAIN results do you get from
> 7.2.1?
Wow. Even better than 7.2, much better use of indexes.
Unique (cost=116936.99..117348.73 rows=531 width=276) (actual
time=567.37..571.80 rows=305 loops=1)
-> Sort (cost=116936.99..116936.99 rows=5313 width=276) (actual
time=567.36..567.70 rows=305 loops=1)
-> Nested Loop (cost=3.51..116608.26 rows=5313 width=276) (actual
time=6.56..561.30 rows=305 loops=1)
-> Nested Loop (cost=3.51..92080.31 rows=5313 width=119)
(actual time=6.00..519.18 rows=305 loops=1)
-> Hash Join (cost=3.51..61.71 rows=9 width=84)
(actual time=1.45..7.61 rows=9 loops=1)
-> Index Scan using channelregion_pkey,
channelregion_pkey, channelregion_pkey, channelregion_pkey,
channelregion_pkey, channelregion_pkey, channelregion_pkey,
channelregion_pkey, channelregion_pkey on channelregion (cost=0.00..58.05
rows=9 width=60) (actual time=0.17..5.78 rows=9 loops=1)
-> Hash (cost=3.49..3.49 rows=7 width=24)
(actual time=0.14..0.14 rows=0 loops=1)
-> Index Scan using distribution_pkey on
distribution (cost=0.00..3.49 rows=7 width=24) (actual time=0.06..0.11
rows=7 loops=1)
-> Index Scan using idx_broadcast_channelregionid on
broadcast (cost=0.00..10273.33 rows=872 width=35) (actual time=3.75..56.22
rows=34 loops=9)
-> Index Scan using episode_pkey on episode (cost=0.00..4.60
rows=1 width=157) (actual time=0.09..0.10 rows=1 loops=305)
Total runtime: 668.73 msec
> How about if you leave enable_seqscan on, but reduce
> random_page_cost from the default 4.0 to perhaps 3.0?
The same. Normally (under high concurrent use) I think even sequential scans
degrade to random page access because of disk head contention; but in this
case the seq_scan query was running on an otherwise idle machine, as the sole
pgsql client, so perhaps, as you hinted, the random_page_cost was too high?

But I feel where indexes are used and seq_scan *could* have been used,
seq_scan is only slightly faster where the machine is idle (and the small
delay can perhaps be afforded), but where there is disk head contention
seq_scan is deadly, thus I always prefer index scan, so I shall disable
seq_scan in the config file. Is my reasoning faulty? And is it a reasonable
solution, or can we expect that the query-plan-chooser ought always to do
better?

Thanks for your help on this, and I'm happy to keep running tests for you
until you are happy with your results as well as me with mine.
Unique (cost=91354.47..91766.20 rows=531 width=276) (actual
time=542.55..547.01 rows=305 loops=1)
-> Sort (cost=91354.47..91354.47 rows=5313 width=276) (actual
time=542.54..542.89 rows=305 loops=1)
-> Nested Loop (cost=1.09..91025.74 rows=5313 width=276) (actual
time=7.14..536.40 rows=305 loops=1)
-> Nested Loop (cost=1.09..69315.01 rows=5313 width=119)
(actual time=6.59..496.13 rows=305 loops=1)
-> Hash Join (cost=1.09..50.53 rows=9 width=84)
(actual time=1.94..9.11 rows=9 loops=1)
-> Index Scan using channelregion_pkey,
channelregion_pkey, channelregion_pkey, channelregion_pkey,
channelregion_pkey, channelregion_pkey, channelregion_pkey,
channelregion_pkey, channelregion_pkey on channelregion (cost=0.00..49.28
rows=9 width=60) (actual time=0.42..7.04 rows=9 loops=1)
-> Hash (cost=1.07..1.07 rows=7 width=24)
(actual time=0.20..0.20 rows=0 loops=1)
-> Seq Scan on distribution
(cost=0.00..1.07 rows=7 width=24) (actual time=0.13..0.16 rows=7 loops=1)
-> Index Scan using idx_broadcast_channelregionid on
broadcast (cost=0.00..7730.28 rows=872 width=35) (actual time=3.66..53.50
rows=34 loops=9)
-> Index Scan using episode_pkey on episode (cost=0.00..4.07
rows=1 width=157) (actual time=0.09..0.10 rows=1 loops=305)
Total runtime: 630.32 msec
Sam
"Sam Liddicott" <sam.liddicott@ananova.com> writes:
> But I feel where indexes are used and seq_scan *could* have been used,
> seq_scan is only slightly faster where the machine is idle (and the small
> delay can perhaps be afforded), but where there is disk head
> contention seq_scan is deadly, thus I always prefer index scan, so I shall
> disable seq_scan in the config file. Is my reasoning faulty?
Quite. If we could get by with a rule as simplistic as "never use a
seqscan if you can avoid it" then the planner could be a lot simpler.
Your real gripe is that the planner is overestimating the costs of
indexscans relative to seqscans; that would be more appropriately
addressed by lowering random_page_cost a little than by getting out
the sledgehammer.
A more practical reason not to do that is that in situations where a
seqscan is not avoidable, enable_seq_scan = OFF is likely to cause the
planner to make bad choices, since the disable cost will swamp out all
the actually-useful cost judgments.
regards, tom lane
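If lowering random_page_cost does turn out to be the right fix, it can be
made permanent in postgresql.conf instead of per-session. A sketch, with 3.0
being simply the value tried in this thread rather than a recommended
default:

    # postgresql.conf
    random_page_cost = 3.0      # default is 4.0
    #enable_seqscan = off       # deliberately left at the default; see above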
On Wed, Jul 10, 2002 at 10:00:38AM -0400, Tom Lane wrote:
> "Sam Liddicott" <sam.liddicott@ananova.com> writes:
> > But I feel where indexes are used and seq_scan *could* have been used,
> > seq_scan is only slightly faster where the machine is idle (and the small
> > delay can perhaps be afforded), but where there is disk head
> > contention seq_scan is deadly, thus I always prefer index scan, so I shall
> > disable seq_scan in the config file. Is my reasoning faulty?
>
> Quite. If we could get by with a rule as simplistic as "never use a
> seqscan if you can avoid it" then the planner could be a lot simpler.
> Your real gripe is that the planner is overestimating the costs of
> indexscans relative to seqscans; that would be more appropriately
> addressed by lowering random_page_cost a little than by getting out
> the sledgehammer.

I think there is a little problem with multiple seq scans in a single plan.
If your plan is only doing a single seq scan on a large table, then the cost
estimate is probably fine. But if the planner chooses to seq scan two large
tables in parallel, the actual disk transfers degenerate to random access.
But only if they are on the same disk.

Should postgres be worrying about this?

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 10 July 2002 15:01
> To: Sam Liddicott
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2
>
> If we could get by with a rule as simplistic as "never use a
> seqscan if you can avoid it" then the planner could be a lot simpler.
> Your real gripe is that the planner is overestimating the costs of
> indexscans relative to seqscans; that would be more appropriately
> addressed by lowering random_page_cost a little than by getting out
> the sledgehammer.
>
> A more practical reason not to do that is that in situations where a
> seqscan is not avoidable, enable_seq_scan = OFF is likely to cause the
> planner to make bad choices, since the disable cost will swamp out all
> the actually-useful cost judgments.

Right. Do you feel the random page cost of 3 is good to solve this? Is it
solely a tuning problem at my end, or do you want to do further tests to
derive a better default value?

Sam
"Sam Liddicott" <sam.liddicott@ananova.com> writes:
> Do you feel the random page cost of 3 is good to solve this?
For the moment, anyway. There have been a couple of rounds of
pgsql-hackers discussion about whether to lower the default value of
random_page_cost, but so far no one has done any experiments that
would be needed to establish a good new value. (The current default
of 4.0 is based on some old experiments I did. I'm quite willing to
accept that those experiments might have been flawed, but not willing
to replace the number without seeing better experiments...)
regards, tom lane
Just curious: are the number of records per page and the number of keys per
page taken into consideration?

Tom Lane wrote:
> "Sam Liddicott" <sam.liddicott@ananova.com> writes:
> > Do you feel the random page cost of 3 is good to solve this?
>
> For the moment, anyway.  There have been a couple of rounds of
> pgsql-hackers discussion about whether to lower the default value of
> random_page_cost, but so far no one has done any experiments that
> would be needed to establish a good new value.  (The current default
> of 4.0 is based on some old experiments I did.  I'm quite willing to
> accept that those experiments might have been flawed, but not willing
> to replace the number without seeing better experiments...)
Jean-Luc Lachance <jllachan@nsd.ca> writes:
> Are the number of records per page and the number of keys per page taken
> into consideration?
Yes, indirectly. (The cost equations actually use total tuples and
total pages in each table/index, but that's isomorphic to using average
tuple size or average number of tuples per page or however you want
to think about it ...)
regards, tom lane
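Those per-relation totals are visible directly in the system catalogs, for
example (the index name is the one appearing in the plans above):

    SELECT relname, relkind, relpages, reltuples
    FROM pg_class
    WHERE relname IN ('broadcast', 'idx_broadcast_channelregionid');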
tgl@sss.pgh.pa.us (Tom Lane) wrote in message:
> For the moment, anyway.  There have been a couple of rounds of
> pgsql-hackers discussion about whether to lower the default value of
> random_page_cost, but so far no one has done any experiments that
> would be needed to establish a good new value.  (The current default
> of 4.0 is based on some old experiments I did.  I'm quite willing to
> accept that those experiments might have been flawed, but not willing
> to replace the number without seeing better experiments...)

I wonder if a simple script could be designed to test the various parameters
specified in postgresql.conf. The user could be asked to input three or four
of the most common database queries, and the script would determine the cost,
run those queries, and report the actual completion time for each variation
of the optimizer settings (e.g. run 1, enable_seqscan false; run 2,
random_page_cost 3; etc.). This way administrators would have a way (albeit
brute force) to fine-tune the settings for their specific machine and data.
Plus, they could upload the results to the hackers list just like regression
tests. Just a thought.

-Tony
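Such a script could be little more than a psql batch that re-runs the same
statements under different settings and compares the timings. A rough sketch
of the idea (the file name and the sample query are made up for illustration):

    -- settings_sweep.sql: run with "psql -f settings_sweep.sql tv"
    SET random_page_cost = 4.0;
    EXPLAIN ANALYZE SELECT count(*) FROM broadcast WHERE channelregionid = 54;
    SET random_page_cost = 3.0;
    EXPLAIN ANALYZE SELECT count(*) FROM broadcast WHERE channelregionid = 54;
    SET enable_seqscan = off;
    EXPLAIN ANALYZE SELECT count(*) FROM broadcast WHERE channelregionid = 54;
    SET enable_seqscan = on;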
On Fri, 12 Jul 2002, Tom Lane wrote:
> "Sam Liddicott" <sam.liddicott@ananova.com> writes:
> > Do you feel the random page cost of 3 is good to solve this?
>
> For the moment, anyway.  There have been a couple of rounds of
> pgsql-hackers discussion about whether to lower the default value of
> random_page_cost, but so far no one has done any experiments that
> would be needed to establish a good new value.  (The current default
> of 4.0 is based on some old experiments I did.  I'm quite willing to
> accept that those experiments might have been flawed, but not willing
> to replace the number without seeing better experiments...)

When I first started using the 7.x series, the query planner often picked a
sequential scan that would take minutes to return when an index scan would
take seconds. A very low setting for random_page_cost would fix this (a
setting of 0.1 or something like that) but would also make the planner make
some bad choices where it should have picked a seq scan but didn't. With 7.2
I've found that a random_page_cost of around 2 to 4 seems to work very well.

So, my point (and I have one!) is that previous experience with
random_page_cost and older versions of PostgreSQL doesn't necessarily apply
to PostgreSQL 7.2.1. Also, if you're having problems with the query planner
and you're running a version of PostgreSQL from before 7.2, you should
upgrade first, then worry about what random_page_cost should be set to.
Tom Lane wrote (snipped):
> There have been a couple of rounds of
> pgsql-hackers discussion about whether to lower the default value of
> random_page_cost, but so far no one has done any experiments that
> would be needed to establish a good new value.  (The current default
> of 4.0 is based on some old experiments I did

4.0 seems like a pretty reasonable default setting. I had the opportunity to
measure an approximate random_page_cost for a number of configurations
recently whilst doing some filesystem benchmarking:

Rough Config                                  Random_page_cost
----------------------------------------------------------------
Sun E220 Solaris 8 SAN RAID 5                        6
Cyclone Linux, 3Ware 7850 4xATA-133 RAID 5          40
Cyclone Linux, 1xATA-133                            10

Given the wide variation obtainable, I can't see any reason to move away
from 4.0.

regards

Mark

P.S.: I used these to do the measurements:
http://homepages.slingshot.co.nz/~markir/tar/benchtool/benchtools-0.9.1.tar.gz
and if anyone knows how to enable largefile support on Linux in a cleaner way
than I did, feel free to let me know!
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 11 July 2002 00:37
> To: Tom Lane
> Cc: Sam Liddicott; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2
>
> I think there is a little problem with multiple seq scans in a single plan.
> If your plan is only doing a single seq scan on a large table, then the
> cost estimate is probably fine. But if the planner chooses to seq scan two
> large tables in parallel, the actual disk transfers degenerate to random
> access. But only if they are on the same disk.
>
> Should postgres be worrying about this?

I think it should. The same applies if two different queries are running
together off the same disk, which is probably any DB with max_connections > 1.

Sam
On Mon, 15 Jul 2002, Sam Liddicott wrote:
> > From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> >
> > But if the planner chooses to seq scan two large tables in parallel,
> > the actual disk transfers degenerate to random access.
> > But only if they are on the same disk.
> >
> > Should postgres be worrying about this?
>
> I think it should. The same applies if two different queries are running
> together off the same disk, which is probably any DB with
> max_connections > 1.

Well, should it then worry about read-ahead? On most OSes it doesn't actually
degenerate to 1-block random reads; it degenerates to something along the
lines of 8-block random reads.

Trying to optimize based on more than the very simplest and most common ideas
about physical layout opens up a huge can of worms when you don't actually
have any real control over, or knowledge of, that layout.

cjs
--
Curt Sampson <cjs@cynic.net>  +81 90 7737 2974  http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC
Curt Sampson <cjs@cynic.net> writes:
> Well, should it then worry about read-ahead? On most OSes, it
> doesn't actually degenerate to 1-block random reads; it degenerates
> to something along the lines of 8-block random reads.
Exactly. There is still an OS-provided benefit for sequential reading,
even if it's not as large as it might be in the absence of any other
activity. What this line of argument suggests is that random_page_cost
should be reduced in a heavily-loaded system ... but probably not to 1.0.
regards, tom lane