Обсуждение: Re: [SQL] ORDER BY Optimization

Поиск
Список
Период
Сортировка

Re: [SQL] ORDER BY Optimization

От
Rosser Schwarz
Дата:
while you weren't looking, Derek Buttineau|Compu-SOLVE wrote:

> I'm hoping this is the right place to send this.

The PostgreSQL Performance list, pgsql-performance@postgresql.org
would be more appropriate. I'm copying my followup there, as well.

As for your query, almost all the time is actually spent in the
nestloop, not the sort.  Compare:

>   ->  Sort  (cost=31402.85..31405.06 rows=886 width=306) (actual
> time=87454.187..87454.240 rows=10 loops=1)

vs.

>          ->  Nested Loop  (cost=0.00..31359.47 rows=886 width=306)
> (actual time=4.740..86430.468 rows=26308 loops=1)

That's 50-ish ms versus 80-odd seconds.

It seems to me a merge join might be more appropriate here than a
nestloop. What's your work_mem set at?  Off-the-cuff numbers show the
dataset weighing in the sub-ten mbyte range.

Provided it's not already at least that big, and you don't want to up
it permanently, try saying:

SET work_mem = 10240; -- 10 mbytes

immediately before running this query (uncached, of course) and see
what happens.

Also, your row-count estimates look pretty off-base.  When were these
tables last VACUUMed or ANALYZEd?

/rls

--
:wq

Re: [SQL] ORDER BY Optimization

От
Derek Buttineau|Compu-SOLVE
Дата:
Thanks for the response :)

>That's 50-ish ms versus 80-odd seconds.
>
>It seems to me a merge join might be more appropriate here than a
>nestloop. What's your work_mem set at?  Off-the-cuff numbers show the
>dataset weighing in the sub-ten mbyte range.
>
>Provided it's not already at least that big, and you don't want to up
>it permanently, try saying:
>
>SET work_mem = 10240; -- 10 mbytes
>
>
It's currently set at 16mb, I've also tried upping sort_mem as well
without any noticible impact on the uncached query. :(

>immediately before running this query (uncached, of course) and see
>what happens.
>
>Also, your row-count estimates look pretty off-base.  When were these
>tables last VACUUMed or ANALYZEd?
>
>
I'm not entirely sure what's up with the row-count estimates, the tables
are updated quite frequently (and VACUUM is also run quite frequently),
however I had just run a VACUUM ANALYZE on both databases before running
the explain.

I'm also still baffled at the differences in the plans between the two
servers, on the one that uses the index to sort, I get for comparison a
nestloop of:

Nested Loop  (cost=0.00..1175943.99 rows=1814 width=311) (actual
time=25.337..26.867 rows=10 loops=1)

The plan that the "live" server seems to be using seems fairly inefficient.

Derek

Re: [SQL] ORDER BY Optimization

От
Tom Lane
Дата:
[ cc list limited to -performance ]

Derek Buttineau|Compu-SOLVE <derek@csolve.net> writes:
>> It seems to me a merge join might be more appropriate here than a
>> nestloop.

After some experimentation, I don't seem to be able to get the planner
to generate a mergejoin based on a backwards index scan.  I suspect
it's not considering the idea of a merge using descending order at all.
Might be a good enhancement, although we'd need to figure out how to
keep this from just uselessly doubling the number of mergejoin paths
considered :-(

In the meantime, the nestloop is the only hope for avoiding a
full-scan-and-sort.

> I'm not entirely sure what's up with the row-count estimates, the tables
> are updated quite frequently (and VACUUM is also run quite frequently),

They're probably not as bad as they look.  The estimates for the lower
nodes are made without regard to the LIMIT, but the actuals of course
reflect the fact that the LIMIT stopped execution of the plan early.

The problem with this query is that the "fast" plan depends on the
assumption that as we scan in backwards m.msg_date order, it won't take
very long to find 10 rows that join to mr rows with mr.subscription=89.
If that's true then the plan wins, if it's not true then the plan can
lose big.  That requires a pretty good density of rows with
subscription=89, and in particular a good density near the end of the
msg_date order.  The planner effectively assumes that the proportion of
rows with subscription=89 isn't changing over time, but perhaps it is
--- if there's been a lot recently that could explain why the "fast"
plan is fast.  In any case I suppose that the reason the larger server
doesn't want to try that plan is that its statistics show a much lower
density of rows with subscription=89, and so the plan doesn't look
promising compared to something that wins if there are few rows with
subscription=89 ... which the other plan does.

You could probably get your larger server to try the no-sort plan if
you said "set enable_sort = 0" first.  It would be interesting to
compare the EXPLAIN ANALYZE results for that case with the other
server.

The contents of the pg_stats row for mr.subscription in each server
would be informative, too.  One rowcount estimate that does look
wrong is

               ->  Index Scan using maillog_received_subscription_idx on
maillog_received mr  (cost=0.00..17789.73 rows=4479 width=43) (actual
time=0.030..33554.061 rows=65508 loops=1)
                     Index Cond: (subscription = 89)

so the stats row is suggesting there are only 4479 rows with
subscription = 89 when really there are 65508.  (The preceding
discussion hopefully makes it clear why this is a potentially critical
mistake.)  This suggests that you may need to raise your statistics
targets.

            regards, tom lane

Re: ORDER BY Optimization

От
Derek Buttineau|Compu-SOLVE
Дата:
Thanks for the response :)

>You could probably get your larger server to try the no-sort plan if
>you said "set enable_sort = 0" first.  It would be interesting to
>compare the EXPLAIN ANALYZE results for that case with the other
>server.
>
>
>

Odd, I went to investigate this switch on the larger server, but the
query planner is now using the reverse index sort for this particular
subscription.  I'm guessing it's now accumulated enough rows for the
planner to justify the reverse sort?

 Limit  (cost=0.00..14808.49 rows=10 width=299) (actual
time=3.760..11.689 rows=10 loops=1)
   ->  Nested Loop  (cost=0.00..15594816.65 rows=10531 width=299)
(actual time=3.750..11.600 rows=10 loops=1)
         ->  Index Scan Backward using maillog_msg_date_idx on maillog
m  (cost=0.00..805268.22 rows=2454190 width=256) (actual
time=0.132..5.548 rows=194 loops=1)
               Filter: (spam = 1)
         ->  Index Scan using
maillog_received_subscription_maillog_id_idx on maillog_received mr
(cost=0.00..6.01 rows=1 width=43) (actual time=0.020..0.021 rows=0
loops=194)
               Index Cond: ((mr.subscription = 89) AND (mr.maillog_id =
"outer".id))
 Total runtime: 11.878 ms


I decided to try the same query with enable_sort on and off to see what
sort of a difference it made roughly:

With enable_sort = 1:

 Limit  (cost=7515.77..7515.79 rows=10 width=299) (actual
time=13153.300..13153.412 rows=10 loops=1)
   ->  Sort  (cost=7515.77..7516.26 rows=196 width=299) (actual
time=13153.288..13153.324 rows=10 loops=1)
         Sort Key: m.msg_date
         ->  Nested Loop  (cost=0.00..7508.30 rows=196 width=299)
(actual time=0.171..13141.099 rows=853 loops=1)
               ->  Index Scan using maillog_received_subscription_idx on
maillog_received mr  (cost=0.00..4266.90 rows=1069 width=43) (actual
time=0.095..5240.645 rows=993 loops=1)
                     Index Cond: (subscription = 15245)
               ->  Index Scan using maillog_pkey on maillog m
(cost=0.00..3.02 rows=1 width=256) (actual time=7.893..7.902 rows=1
loops=993)
                     Index Cond: ("outer".maillog_id = m.id)
                     Filter: (spam = 1)
 Total runtime: 13153.812 ms

With enable_sort = 0;

 Limit  (cost=0.00..795580.99 rows=10 width=299) (actual
time=108.345..3801.446 rows=10 loops=1)
   ->  Nested Loop  (cost=0.00..15593387.49 rows=196 width=299) (actual
time=108.335..3801.352 rows=10 loops=1)
         ->  Index Scan Backward using maillog_msg_date_idx on maillog
m  (cost=0.00..805194.97 rows=2453965 width=256) (actual
time=0.338..3338.096 rows=15594 loops=1)
               Filter: (spam = 1)
         ->  Index Scan using
maillog_received_subscription_maillog_id_idx on maillog_received mr
(cost=0.00..6.01 rows=1 width=43) (actual time=0.020..0.020 rows=0
loops=15594)
               Index Cond: ((mr.subscription = 15245) AND (mr.maillog_id
= "outer".id))
 Total runtime: 3801.676 ms

In comparsion, query plan on the smaller server (it used a sort for this
subscription vs a reverse scan):

 Limit  (cost=197.37..197.38 rows=6 width=313) (actual
time=883.576..883.597 rows=10 loops=1)
   ->  Sort  (cost=197.37..197.38 rows=6 width=313) (actual
time=883.571..883.577 rows=10 loops=1)
         Sort Key: m.msg_date
         ->  Nested Loop  (cost=0.00..197.29 rows=6 width=313) (actual
time=106.334..873.928 rows=47 loops=1)
               ->  Index Scan using maillog_received_subscription_idx on
maillog_received mr  (cost=0.00..109.17 rows=28 width=41) (actual
time=47.289..389.775 rows=58 loops=1)
                     Index Cond: (subscription = 15245)
               ->  Index Scan using maillog_pkey on maillog m
(cost=0.00..3.13 rows=1 width=272) (actual time=8.319..8.322 rows=1
loops=58)
                     Index Cond: ("outer".maillog_id = m.id)
                     Filter: (spam = 1)
 Total runtime: 883.820 ms

>The contents of the pg_stats row for mr.subscription in each server
>would be informative, too.
>
I've increased the statistics targets to 300, so these rows are pretty
bulky, however I've included the rows as text files to this message
(pg_stats_large.txt and pg_stats_small.txt).

>One rowcount estimate that does look
>wrong is
>
>               ->  Index Scan using maillog_received_subscription_idx on
>maillog_received mr  (cost=0.00..17789.73 rows=4479 width=43) (actual
>time=0.030..33554.061 rows=65508 loops=1)
>                     Index Cond: (subscription = 89)
>
>so the stats row is suggesting there are only 4479 rows with
>subscription = 89 when really there are 65508.  (The preceding
>discussion hopefully makes it clear why this is a potentially critical
>mistake.)
>
This could potentially make sense on the larger server (if my
understanding of the vacuum process is correct).  The regular
maintenance of the large server (which is currently the only one being
updated regularily), does a vacuum analyze once per day, a scheduled
vacuum once / hour, and autovacuum for the remainder of the time (which
might be overkill).  With the function of these tables, it is easily
possible for the maillog_received table to gain 60,000 rows for a single
subscription in 24 hours.  With that information, maybe it might be in
my best interest to run the analyze hourly?

>  This suggests that you may need to raise your statistics
>targets.
>
>
>
I've raised the statistics targets to 300 on these columns, it seems to
have helped some.  Interesting enough, when I query on the
most_common_vals for the subscription argument, they tend to use the
reverse scan method vs the sort.

Thanks again,

Derek
 public     | maillog_received | subscription |         0 |         4 |       4003 |
{1,2131,16039,1496,1821,15033,1167,15055,710,982,1433,1608,606,14214,1765,1873,15790,89,15265,2861,15040,1366,1270,14131,1792,505,13717,15267,1076,1176,780,656,1839,676,15168,12227,10156,6051,820,10791,11527,1825,5563,8675,2335,8031,14962,14059,4965,13917,1134,1186,1977,1982,4973,6540,10666,3385,10005,1969,8503,13395,6154,12568,8284,8551,8630,13272,2150,5108,11166,13186,497,2533,6648,7470,1045,2846,3356,11378,12673,15264,15456,5447,8594,13530,7353,13687,688,1759,5473,7804,10533,2437,3097,5227,6691,13130,4445,9142,9276,10212,12619,605,3470,4614,7550,7659,7772,8533,8586,13727,5727,6595,7817,9128,9806,10170,8743,8801,12631,13359,1881,2389,10091,10394,11540,11643,11760,478,3870,8941,8979,9156,9201,9743,10114,13872,15078,2531,2706,3641,9272,9514,9794,12527,463,3011,5193,5501,7114,7338,8819,9203,9291,9777,10342,13559,3449,7173,8093,8475,8701,10205,13353,13685,1824,3078,3188,3781,7078,7700,7848,8410,8654,8891,8989,11261,12339,2659,2844,4326,6297,8404,8628,8703,9404,9638,10361,12311,2149,2877,4087,4325,5411,7018,7054,7818,9211,9359,9542,9795,10454,12217,12782,2287,4801,5338,7361,7625,7701,8836,9496,9677,9700,10118,10467,10725,10876,11259,11287,13409,13495,13829,2987,4482,4680,4718,4786,5286,5386,5543,5751,5798,6305,6537,7539,8098,8769,9015,10146,10445,11369,12236,12755,13094,13874}
|
{0.122233,0.0794,0.0607889,0.0560444,0.0551222,0.0339333,0.0316556,0.0296667,0.0235889,0.0130667,0.0103,0.0092,0.00691111,0.00658889,0.0064,0.00533333,0.00505556,0.00442222,0.00414444,0.00404444,0.00396667,0.00317778,0.00275556,0.00265556,0.00255556,0.00253333,0.0016,0.00142222,0.00141111,0.00134444,0.00126667,0.00122222,0.00122222,0.00107778,0.00106667,0.00102222,0.00101111,0.000988889,0.000888889,0.000866667,0.000788889,0.000755556,0.000733333,0.000733333,0.000711111,0.000711111,0.000711111,0.000688889,0.000677778,0.000677778,0.000666667,0.000666667,0.000666667,0.000666667,0.000655556,0.000655556,0.000655556,0.000644444,0.000644444,0.000633333,0.000633333,0.000622222,0.000611111,0.000611111,0.0006,0.0006,0.0006,0.0006,0.000566667,0.000566667,0.000566667,0.000555556,0.000544444,0.000544444,0.000544444,0.000544444,0.000533333,0.000533333,0.000533333,0.000533333,0.000533333,0.000533333,0.000533333,0.000522222,0.000522222,0.000522222,0.000511111,0.000511111,0.0005,0.000488889,0.000488889,0.000488889,0.000488889,0.000477778,0.000477778,0.000477778,0.000477778,0.000477778,0.000466667,0.000466667,0.000466667,0.000466667,0.000466667,0.000455556,0.000455556,0.000455556,0.000455556,0.000455556,0.000455556,0.000455556,0.000455556,0.000455556,0.000444444,0.000444444,0.000444444,0.000444444,0.000444444,0.000444444,0.000433333,0.000433333,0.000433333,0.000433333,0.000422222,0.000422222,0.000422222,0.000422222,0.000422222,0.000422222,0.000422222,0.000411111,0.000411111,0.000411111,0.000411111,0.000411111,0.000411111,0.000411111,0.000411111,0.000411111,0.000411111,0.0004,0.0004,0.0004,0.0004,0.0004,0.0004,0.0004,0.000388889,0.000388889,0.000388889,0.000388889,0.000388889,0.000388889,0.000388889,0.000388889,0.000388889,0.000388889,0.000388889,0.000388889,0.000377778,0.000377778,0.000377778,0.000377778,0.000377778,0.000377778,0.000377778,0.000377778,0.000366667,0.000366667,0.000366667,0.000366667,0.000366667,0.000366667,0.000366667,0.000366667,0.000366667,0.000366667,0.000366667,0.000366667,0.000366667,0.000355556,0.000355556,0.000355556,0.000355556,0.000355556,0.000355556,0.000355556,0.000355556,0.000355556,0.000355556,0.000355556,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000344444,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000333333,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222,0.000322222}
|
{436,551,590,633,692,708,853,954,998,1061,1135,1178,1269,1343,1396,1443,1502,1605,1659,1723,1791,1845,1980,2085,2153,2192,2231,2245,2265,2303,2337,2369,2391,2414,2444,2484,2511,2546,2566,2597,2637,2697,2728,2758,2792,2832,2964,2995,3037,3052,3080,3112,3139,3171,3237,3275,3294,3344,3376,3404,3490,3527,3564,3584,3639,3713,3762,3800,3850,3898,3921,3994,4067,4085,4117,4156,4179,4209,4253,4274,4327,4400,4427,4495,4587,4628,4668,4697,4756,4789,4833,4877,4915,4969,5010,5069,5119,5150,5224,5257,5303,5353,5379,5405,5452,5491,5598,5613,5658,5713,5739,5775,5849,5905,5939,5973,6022,6061,6083,6156,6176,6198,6232,6252,6283,6290,6331,6377,6424,6476,6541,6570,6598,6624,6656,6696,6717,6791,6834,6860,6927,6995,7059,7101,7149,7204,7260,7292,7346,7370,7416,7455,7512,7575,7616,7673,7728,7765,7809,7864,7947,7996,8052,8081,8106,8153,8222,8246,8318,8357,8424,8439,8504,8557,8596,8664,8692,8730,8768,8792,8850,8900,8930,9013,9058,9106,9171,9224,9280,9314,9350,9393,9443,9482,9539,9560,9593,9658,9772,9854,9902,9941,9984,10030,10060,10095,10120,10145,10175,10231,10267,10319,10362,10464,10486,10535,10563,10614,10653,10689,10738,10775,10801,10845,10874,10923,10969,11053,11088,11107,11159,11207,11268,11330,11366,11420,11474,11524,11577,11610,11686,11722,11799,11834,11883,11913,11947,12049,12090,12145,12191,12243,12294,12336,12377,12428,12478,12530,12599,12618,12669,12714,12768,12828,12863,12898,12959,13010,13048,13093,13117,13154,13212,13243,13312,13369,13388,13475,13536,13593,13640,13688,13759,13798,13823,13849,13901,13955,13999,14049,14095,14184,14882,15031,15072,15126,15370,15522,15817,15946,16211}
| -0.0530052 
 public     | maillog_received | subscription |         0 |         4 |       3462 |
{1,2131,1821,1496,89,15033,16039,1167,15055,1765,710,982,1608,1433,606,2861,1792,15265,1873,14214,15040,505,10156,1366,2846,13717,1270,2532,10605,15267,780,9156,15790,8654,1186,1176,1839,5793,686,14131,15168,9593,676,7817,12081,7114,1134,8093,7804,15072,3356,10104,3687,3746,8801,688,1825,7659,13272,13296,1922,12227,4614,8941,1759,1982,5501,8743,11527,8503,3282,5727,2335,3641,13530,13783,9795,10005,5447,5659,5798,7338,2811,4877,8675,10362,10467,5286,6297,7054,8031,11378,3364,5775,7772,8081,8551,4754,11209}
|
{0.248344,0.0744,0.0433222,0.0391111,0.0318778,0.0298111,0.0296556,0.0237444,0.0214333,0.0213333,0.0158333,0.0103,0.0089,0.00763333,0.00646667,0.00574444,0.00566667,0.00402222,0.00368889,0.00346667,0.00345556,0.00287778,0.00223333,0.00215556,0.00215556,0.00203333,0.00184444,0.00157778,0.00143333,0.00135556,0.0013,0.00125556,0.00123333,0.0012,0.00116667,0.00113333,0.00113333,0.00108889,0.001,0.001,0.000911111,0.0009,0.000877778,0.000866667,0.000866667,0.000844444,0.000777778,0.000777778,0.000722222,0.0007,0.000677778,0.0006,0.000588889,0.000577778,0.000566667,0.000555556,0.000533333,0.000533333,0.000533333,0.000533333,0.000522222,0.000511111,0.0005,0.0005,0.000488889,0.000488889,0.000488889,0.000488889,0.000488889,0.000477778,0.000466667,0.000466667,0.000455556,0.000455556,0.000455556,0.000444444,0.000433333,0.000433333,0.000422222,0.000422222,0.000422222,0.000411111,0.0004,0.0004,0.0004,0.0004,0.0004,0.000388889,0.000388889,0.000388889,0.000388889,0.000388889,0.000377778,0.000377778,0.000377778,0.000377778,0.000377778,0.000366667,0.000366667}
|
{3,497,584,649,803,927,1038,1070,1175,1365,1417,1468,1577,1636,1693,1791,1836,1956,2040,2128,2158,2192,2231,2245,2265,2287,2328,2363,2390,2413,2433,2467,2503,2533,2565,2609,2659,2697,2723,2748,2802,2840,2901,2987,3021,3039,3074,3084,3112,3142,3188,3246,3291,3344,3376,3401,3470,3509,3563,3592,3659,3739,3781,3833,3866,3898,3951,4020,4073,4099,4143,4179,4209,4250,4274,4325,4381,4420,4452,4547,4603,4662,4690,4756,4794,4838,4880,4917,4969,5010,5044,5108,5155,5227,5256,5299,5355,5386,5411,5452,5488,5575,5600,5626,5701,5739,5771,5881,5930,5973,6031,6067,6144,6162,6197,6232,6256,6285,6304,6339,6399,6461,6524,6545,6595,6620,6664,6696,6715,6769,6834,6866,6930,7019,7059,7100,7149,7201,7261,7293,7353,7370,7411,7455,7506,7550,7594,7625,7677,7707,7762,7789,7828,7862,7955,8019,8062,8098,8124,8186,8246,8314,8355,8404,8429,8475,8524,8574,8620,8672,8703,8744,8770,8813,8850,8900,8932,8979,9020,9063,9093,9142,9201,9224,9280,9302,9319,9356,9393,9423,9482,9507,9539,9560,9623,9658,9718,9777,9831,9887,9908,9973,10032,10064,10098,10120,10146,10172,10192,10231,10267,10319,10361,10403,10466,10525,10548,10596,10651,10680,10721,10768,10791,10810,10857,10912,10958,11046,11091,11113,11165,11196,11261,11330,11368,11398,11446,11517,11556,11596,11636,11684,11746,11799,11835,11891,11927,11969,12030,12123,12145,12191,12242,12274,12317,12354,12416,12485,12530,12576,12610,12631,12676,12721,12780,12828,12863,12903,12962,13026,13054,13104,13130,13185,13215,13305,13346,13369,13395,13477,13524,13570,13624,13685,13727,13787,13819,13837,13874,13917,13977,14028,14049,14168,14962,15078,15245,15416,15616,15861,16090}
|  -0.206239