Re: Sorting performance vs. MySQL

Поиск
Список
Период
Сортировка
От Yang Zhang
Тема Re: Sorting performance vs. MySQL
Дата
Msg-id 9066fa251002222151u87ba137lb7785e42d66e59d6@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Sorting performance vs. MySQL  (Scott Marlowe <scott.marlowe@gmail.com>)
Ответы Re: Sorting performance vs. MySQL  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Sorting performance vs. MySQL  (Scott Marlowe <scott.marlowe@gmail.com>)
Re: Sorting performance vs. MySQL  (Alex Hunsaker <badalex@gmail.com>)
Re: Sorting performance vs. MySQL  (John Gage <jsmgage@numericable.fr>)
Re: Sorting performance vs. MySQL  (Baron Schwartz <baron@xaprb.com>)
Список pgsql-general
nnnnnOn Mon, Feb 22, 2010 at 3:45 PM, Scott Marlowe
<scott.marlowe@gmail.com> wrote:
> On Mon, Feb 22, 2010 at 12:53 PM, Yang Zhang <yanghatespam@gmail.com> wrote:
>> On Mon, Feb 22, 2010 at 2:52 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>> On Mon, Feb 22, 2010 at 11:10 AM, Yang Zhang <yanghatespam@gmail.com> wrote:
>>>> I have the exact same table of data in both MySQL and Postgresql. In Postgresql:
>>>
>>> Just wondering, are these on the same exact machine?
>>>
>>
>> Yes, on the same disk.
>
> I'm wondering how much of this could be caching effects.  Is the MySQL
> database "warmed up" before you started, and the pgsql database is
> "cold" and no caching has taken place?
>
> What do things like vmstat 10 say while the query is running on each
> db?  First time, second time, things like that.

Awesome -- this actually led me to discover the problem.

When running the query in MySQL InnoDB:

$ vmstat 10
procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0 13 13733604  83020   5648 2193884    3    3   936   168    2    1
4  2 89  5  0
 1 12 13749952  80164   5600 2178032    0 4354   908  4379 3586 2638
0  1 38 60  0
 0 19 13762228  80576   5556 2145220  208 3527  1280  3690 3668 2635
1  1 39 59  0
 0 19 13778632  79420   5560 2135228   52 4186  1046  4191 3682 2418
0  1 37 62  0
 0 19 13792964  77336   5592 2082520   41 3731  1698  3804 4102 2686
1  1 53 45  0
 0 14 13810356  84036   5556 2049836   36 4241   797  4246 3913 2603
0  1 68 31  0
 1 14 13825640  81336   5520 2001920    0 4212   958  4220 3848 2736
1  1 73 25  0
 0 17 13844952  78036   5476 1976956    8 4685   923  4689 3832 2547
0  1 69 29  0
 2 13 13863828  79812   5448 1954952    3 4627   692  4634 3744 2505
0  1 70 28  0
 0 15 13883828  77764   5440 1920528  249 4544   972  4548 4345 2506
0  1 70 28  0
 1 20 13898900  79132   5456 1890192   28 4341   723  4438 4982 3030
0  3 64 33  0
 0 11 13915252  85184   5624 1865260   79 3668   752  3764 4472 2765
0  3 57 40  0
 0 12 13933964  78448   5700 1832640  120 4327  1066  4434 4484 2777
1  3 52 45  0
 0 19 13951748  77640   5816 1795720   94 4005  1159  4091 4580 2762
1  3 48 49  0
 0 16 13972748  79884   5780 1753676    0 4737   787  4746 4385 2766
1  3 51 45  0
 0 25 13988108  78936   5884 1726068  547 3954  1468  4116 4976 3502
0  4 44 52  0
 1 20 14011500  77676   5868 1689136  161 4980   843  5506 5218 3131
0  3 34 62  0
 0 22 14032472  81348   5816 1647884  270 4198   943  4369 4521 2826
1  3 40 56  0
 0 23 14055220  81712   5804 1626872  193 4774  1408  4856 4617 2754
1  3 38 58  0
 0 21 14075848  81844   5696 1576836    0 4738   974  4742 4528 2704
1  3 40 56  0
 0 25 14097260  79788   5628 1536336  213 4512   922  4639 4726 2929
1  3 27 69  0
 0 24 14123900  80820   5616 1488460  319 5033  1059  5128 4895 2780
2  3 17 78  0
 1 26 14142632  77276   5660 1445592  445 4605  1434  4727 5401 3364
1  3 16 79  0
 0 31 14165668  83736   5976 1387048  362 4288  1444  4428 4739 2963
2  3 17 78  0
 1 28 14180104  77564   6324 1369232  387 4526  2222  4677 5748 3559
1  3 16 80  0

I'm guessing the swap numbers are because MySQL uses mmap?

Anyway, when running the query in Postgresql:

$ vmstat 10
procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  2 13866336 1574540  25024 7878980    3    3   936   168    2    1
4  2 89  5  0
 1  3 13861520 1163596  25328 8128360 1046    0 25429   433 4368 4175
4  2 80 14  0
 0  3 13856484 803024  25600 8343220 1117    0 22062   688 4492 4590
4  2 73 20  0
 0  6 13855304 678868  26052 8435540  160    0  9239   598 5195 7141
1  5 70 24  0
 0  6 13853644 513568  26332 8563680  401    0 12480  7100 4775 4248
3  3 68 26  0
 2  2 13851804 166228  26624 8775304  634    0 21466  1497 4680 4550
6  2 64 28  0
 0  5 13861556  81896  26740 8825360  860 3547  6100  3847 5142 3386
6  2 57 35  0
 0  6 13867688  91368  26808 8832712  653 3326  1835  3604 4738 2762
5  2 61 32  0
 0  5 13870676  88524  26872 8849392  638 3272  2578  3517 4864 2909
4  2 55 39  0
 0  5 13872748  79512  27004 8864456  629 1788  2086  2949 4337 2921
1  3 55 41  0
 0  7 13876760  83124  27136 8867272 1018 2253  1713  2409 4321 2889
0  3 63 33  0
 0  6 13878964  82876  27240 8874540  792 2119  1854  2314 4288 2813
2  2 72 24  0
 3  4 13883204  81224  27280 8887068  661 3067  2995  3385 4558 2899
4  2 72 22  0
 0  6 13886636  82036  27352 8905628  594 3726  2628  4013 4744 2765
4  2 69 25  0
 0  8 13899852  85604  27400 8925800  638 4423  2689  4658 4903 2808
4  2 55 40  0
 1  4 13905184  80544  27484 8940040  676 3501  3006  3799 4805 2932
4  2 66 28  0
 0  9 13908480  80100  27516 8948476  668 2996  1720  3192 4594 2799
4  2 60 35  0

vmstat showed no swapping-out for a while, and then suddenly it
started spilling a lot. Checking psql's memory stats showed that it
was huge -- apparently, it's trying to store its full result set in
memory. As soon as I added a LIMIT 10000, everything worked
beautifully and finished in 4m (I verified that the planner was still
issuing a Sort).

I'm relieved that Postgresql itself does not, in fact, suck, but
slightly disappointed in the behavior of psql. I suppose it needs to
buffer everything in memory to properly format its tabular output,
among other possible reasons I could imagine.

Now it all almost seems obvious. My only solace is that it wasn't obv.
to the list either (thanks to everyone for their feedback!). Should've
just started with system profiling -- saves the day again.

>
> Also, just curios, what's shared_buffers set to on the pgsql instance?

 shared_buffers
----------------
 320000kB
(1 row)
--
Yang Zhang
http://www.mit.edu/~y_z/

В списке pgsql-general по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: comment on constraint
Следующее
От: Yang Zhang
Дата:
Сообщение: Re: Sorting performance vs. MySQL