Обсуждение: Re: default statistics target testing (was: Simple postgresql.conf wizard)
On Thu, Nov 27, 2008 at 6:46 PM, Gregory Stark <stark@enterprisedb.com> wrote: >> ANALYZE with default_statistics_target set to 10 takes 13 s. With >> 100, 92 s. With 1000, 289 s. > > That is interesting. It would also be interesting to total up the time it > takes to run EXPLAIN (without ANALYZE) for a large number of queries. OK, I did this. I actually tried 10 .. 100 in increments of 10 and then 100 ... 1000 in increments of 50, for 7 different queries of varying complexity (but all generally similar, including all of them having LIMIT 100 as is typical for this database). I planned each query 100 times with each default_statistics_target. The results were somewhat underwhelming. The query which was by far the most complex, slowest, and most expensive to plan took 2% longer to plan with default_statistics_target = 1000 vs. default_statistics_target = 10 (284 ms vs. 279 ms). The average of the remaining 6 queries was a 12% increase in planning time (17.42 ms vs. 15.47 ms). The ANALYZE times as a function of default_statistics_target were: 10 13.030161 20 22.523386 30 32.38686 40 42.200557 50 51.280172 60 60.510998 70 69.319333 80 77.942732 90 85.96144 100 93.235432 150 120.251883 200 131.297581 250 142.410084 300 152.763004 350 164.222845 400 175.989113 450 186.762032 500 199.075595 550 210.241334 600 224.366766 650 233.036997 700 240.685552 750 249.516471 800 259.522957 850 268.19841 900 277.227745 950 290.639858 1000 297.099143 I'm attaching the test harness in case anyone wants to try this out with their own queries (disclaimer: this is pretty quick and dirty - it expects the input to be in files called q1.txt through q7.txt and you have to edit the code to change that, or, uh, anything else). Obviously these queries aren't very interesting in terms of d_s_t; maybe someone has some where it makes more of a difference. ...Robert
Вложения
On Fri, 5 Dec 2008, Robert Haas wrote: > OK, I did this. I actually tried 10 .. 100 in increments of 10 and > then 100 ... 1000 in increments of 50, for 7 different queries of > varying complexity Great bit of research. Was this against CVS HEAD or an 8.3 database? -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Looks like Robert accidentally answered my question about what version his results were from off-list. Here's his update: --- Unfortunately it was 8.2.9, as I realized halfway into the run. Here are the results from a CVS HEAD checkout last night. *** Query planning times q1 (the complex one): 367 ms with default_statistics_target = 10, 379 ms with default_statistics_target = 1000 (3% slowdown) other 6 queries: 18.48 ms with default_statistics_target = 10, 21.75 ms with default_statistics_target = 1000 (18% slowdown) comparing to 8.2.9 results: - q1 was 32% slower with dst = 10, 33% slower with dst = 1000 - other queries were 19% slower with dst = 10, 25% slower with dst = 1000 *** ANALYZE times 10 4.283007 20 6.741417 30 9.157209 40 11.445861 50 13.466374 60 15.402897 70 17.365484 80 19.243842 90 20.905635 100 22.643275 150 29.056517 200 31.878688 250 34.240638 300 36.996304 350 39.921178 400 42.443728 450 44.69833 500 47.553414 550 50.305143 600 52.799597 650 54.961691 700 57.117117 750 58.847345 800 61.60112 850 63.020136 900 65.858213 950 68.072528 1000 69.861321 comparing to 8.2.9 results: - 3x faster with default_statistics_target = 10 - 4.25x faster with default_statistics_target = 1000 ...Robert
Re: default statistics target testing (was: Simple postgresql.conf wizard)
От
"Guillaume Smet"
Дата:
Robert, On Fri, Dec 5, 2008 at 7:24 PM, Greg Smith <gsmith@gregsmith.com> wrote: > comparing to 8.2.9 results: > - q1 was 32% slower with dst = 10, 33% slower with dst = 1000 > - other queries were 19% slower with dst = 10, 25% slower with dst = 1000 You mean that HEAD is slower than 8.2.9 or I don't understand correctly what you wrote? -- Guillaume
> On Fri, Dec 5, 2008 at 7:24 PM, Greg Smith <gsmith@gregsmith.com> wrote: >> comparing to 8.2.9 results: >> - q1 was 32% slower with dst = 10, 33% slower with dst = 1000 >> - other queries were 19% slower with dst = 10, 25% slower with dst = 1000 > > You mean that HEAD is slower than 8.2.9 or I don't understand > correctly what you wrote? Correct. As compared with 8.2.9, ANALYZE was substantially faster, but query planning was significantly slower. Thanks also to Greg for reposting my emails. I didn't even realize I hadn't sent them to the list. ...Robert
Re: default statistics target testing (was: Simple postgresql.conf wizard)
От
"Guillaume Smet"
Дата:
On Fri, Dec 5, 2008 at 8:11 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Correct. As compared with 8.2.9, ANALYZE was substantially faster, > but query planning was significantly slower. > > Thanks also to Greg for reposting my emails. I didn't even realize I > hadn't sent them to the list. Any chance you could do the same test with a 8.3? It could be interesting to see if it's a HEAD thing or if the slow down was introduced in 8.3. -- Guillaume
>> Thanks also to Greg for reposting my emails. I didn't even realize I >> hadn't sent them to the list. > > Any chance you could do the same test with a 8.3? It could be > interesting to see if it's a HEAD thing or if the slow down was > introduced in 8.3. Somehow I knew you were going to ask that. :-) Yeah, I'll try to do it tonight, or over the weekend some time. ...Robert
On Fri, Dec 5, 2008 at 2:16 PM, Guillaume Smet <guillaume.smet@gmail.com> wrote: > On Fri, Dec 5, 2008 at 8:11 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> Correct. As compared with 8.2.9, ANALYZE was substantially faster, >> but query planning was significantly slower. >> >> Thanks also to Greg for reposting my emails. I didn't even realize I >> hadn't sent them to the list. > > Any chance you could do the same test with a 8.3? It could be > interesting to see if it's a HEAD thing or if the slow down was > introduced in 8.3. OK, here are the results from 8.2.9, 8.3.5, and CVS HEAD. A couple of these numbers are slightly different than the ones I posted before due to, uh, me having typo'd my awk script last time. However, the basic picture remains unchanged. 8.3.5 is very similar to CVS HEAD; 8.2.9 plans faster but analyzes more slowly. Q1 CVS HEAD 12/5/2008 default_statistics_target=10 367 ms CVS HEAD 12/5/2008 default_statistics_target=1000 380 ms 8.3.5 default_statistics_target=10 367 ms 8.3.5 default_statistics_target=1000 379 ms 8.2.9 default_statistics_target=10 279 ms 8.2.9 default_statistics_target=1000 285 ms Average of Q2-Q7 CVS HEAD 12/5/2008 default_statistics_target=10 18.14 ms CVS HEAD 12/5/2008 default_statistics_target=1000 21.75 ms 8.3.5 default_statistics_target=10 18.26 ms 8.3.5 default_statistics_target=1000 21.35 ms 8.2.9 default_statistics_target=10 15.47 ms 8.2.9 default_statistics_target=1000 17.42 ms ANALYZE CVS HEAD 12/5/2008 default_statistics_target=10 4.283007 CVS HEAD 12/5/2008 default_statistics_target=1000 69.861321 8.3.5 default_statistics_target=10 4.052358 8.3.5 default_statistics_target=1000 65.12919 8.2.9 default_statistics_target=10 13.030161 8.2.9 default_statistics_target=1000 297.099143 ....Robert
On Friday 05 December 2008 00:05:34 Robert Haas wrote: > On Thu, Nov 27, 2008 at 6:46 PM, Gregory Stark <stark@enterprisedb.com> wrote: > >> ANALYZE with default_statistics_target set to 10 takes 13 s. With > >> 100, 92 s. With 1000, 289 s. > > > > That is interesting. It would also be interesting to total up the time it > > takes to run EXPLAIN (without ANALYZE) for a large number of queries. > I wonder if we'd see anything dramatically different using PREPARE... > OK, I did this. I actually tried 10 .. 100 in increments of 10 and > then 100 ... 1000 in increments of 50, for 7 different queries of > varying complexity (but all generally similar, including all of them > having LIMIT 100 as is typical for this database). I planned each > query 100 times with each default_statistics_target. The results were > somewhat underwhelming. > The one thing this test seems to overlook is at what point do we see diminshing returns from increasing dst. I think the way to do this would be to plot dst setting vs. query time; Robert, do you think you could modify your test to measure prepare time and then execute time over a series of runs? -- Robert Treat Conjecture: http://www.xzilla.net Consulting: http://www.omniti.com
>> > That is interesting. It would also be interesting to total up the time it >> > takes to run EXPLAIN (without ANALYZE) for a large number of queries. > I wonder if we'd see anything dramatically different using PREPARE... Well... the point here is to measure planning time. I would think that EXPLAIN would be the best way to get that information without confounding factors. >> OK, I did this. I actually tried 10 .. 100 in increments of 10 and >> then 100 ... 1000 in increments of 50, for 7 different queries of >> varying complexity (but all generally similar, including all of them >> having LIMIT 100 as is typical for this database). I planned each >> query 100 times with each default_statistics_target. The results were >> somewhat underwhelming. > The one thing this test seems to overlook is at what point do we see > diminshing returns from increasing dst. I think the way to do this would be > to plot dst setting vs. query time; Robert, do you think you could modify > your test to measure prepare time and then execute time over a series of > runs? I did some previous testing on query #1 where I determined that it runs just as fast with default_statistics_target=1 (no, that's not a typo) as default_statistics_target=1000. The plan is stable down to values in the 5-7 range; below that it changes but not appreciably for the worse. I could test the other queries but I suspect the results are similar because the tables are small and should be well-modelled even when the MCV and histogram sizes are small. The point here is to figure out how much we're paying in additional planning time in the worst-case scenario where the statistics aren't helping. ...Robert