Interesting group by behavior
От | Culberson, Philip |
---|---|
Тема | Interesting group by behavior |
Дата | |
Msg-id | 83CC9259BB25D31186F400A0C9975D9C9CE216@datmail02.dat.com обсуждение исходный текст |
Список | pgsql-general |
Greetings: I am somewhat baffled by the horrendous time difference between doing two separate queries versus a single query with a group by clause. The table in question has ~2.5 million rows, of which 847,324 are in the date range selected. An explain plan on all three queries show that the cost is the same for each, although the select .. group by takes 50 times longer. Below I've included output from both PSQL and the postmaster log. Note that the select .. group by query had a 100% buffer hit rate but a huge jump in the "Direct blocks" reads and writes. I have two question's: 1) What is the difference between the "Shared/Local/Direct" block stats? 2) Is the long running "select .. group by" an undocumented feature (bug)? Regards, Phil Culberson test=> \d posting Table = posting +----------------------------------+----------------------------------+----- --+ | Field | Type | Length| +----------------------------------+----------------------------------+----- --+ | date | date | 4 | | comp_id | int4 | 4 | | posting_type | char() | 1 | | equip_type | char() | 2 | | partial_flag | bool | 1 | | origin_id | int4 | 4 | | dest_id | int4 | 4 | +----------------------------------+----------------------------------+----- --+ Indices: date_idx dest_idx orig_idx test=> \d date_idx Table = date_idx +----------------------------------+----------------------------------+----- --+ | Field | Type | Length| +----------------------------------+----------------------------------+----- --+ | date | date | 4 | +----------------------------------+----------------------------------+----- --+ Log output ========== 991006.12:36:15.659 [1321] StartTransactionCommand 991006.12:36:15.660 [1321] query: SELECT COUNT(posting_type) FROM posting WHERE date BETWEEN 19990801::date AND 19990831::date AND posting_type = 'L'; 991006.12:36:15.667 [1321] ProcessQuery ! system usage stats: ! 41.747049 elapsed 39.280000 user 0.240000 system sec ! [243.440000 user 2.450000 sys total] ! 1/0 [19/6] filesystem blocks in/out ! 474/0 [7425/0] page faults/reclaims, 0 [0] swaps ! 0 [1] signals rcvd, 0/0 [11/13] messages rcvd/sent ! 475/503 [7472/3142] voluntary/involuntary context switches ! postgres usage stats: ! Shared blocks: 1800 read, 0 written, buffer hit rate = 99.79% ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00% ! Direct blocks: 0 read, 0 written 991006.12:36:57.408 [1321] CommitTransactionCommand Elapsed time: ~42 seconds PSQL output =========== SELECT COUNT(posting_type) FROM posting WHERE date BETWEEN 19990801::date AND 19990831::date AND posting_type = 'L'; count ------ 805124 (1 row) Log output ========== 991006.12:36:57.418 [1321] StartTransactionCommand 991006.12:36:57.418 [1321] query: SELECT COUNT(posting_type) FROM posting WHERE date BETWEEN 19990801::date AND 19990831::date AND posting_type = 'E'; 991006.12:36:57.424 [1321] ProcessQuery ! system usage stats: ! 27.584268 elapsed 27.550000 user 0.000000 system sec ! [271.000000 user 2.450000 sys total] ! 0/0 [19/6] filesystem blocks in/out ! 0/0 [7425/0] page faults/reclaims, 0 [0] swaps ! 0 [1] signals rcvd, 0/0 [12/14] messages rcvd/sent ! 0/528 [7473/3670] voluntary/involuntary context switches ! postgres usage stats: ! Shared blocks: 0 read, 0 written, buffer hit rate = 100.00% ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00% ! Direct blocks: 0 read, 0 written 991006.12:37:25.004 [1321] CommitTransactionCommand Elapsed time: ~28 seconds PSQL output =========== SELECT COUNT(posting_type) FROM posting WHERE date BETWEEN 19990801::date AND 19990831::date AND posting_type = 'E'; count ----- 42200 (1 row) Log output ========== 991006.12:37:25.014 [1321] StartTransactionCommand 991006.12:37:25.015 [1321] query: SELECT posting_type, COUNT(posting_type) FROM posting WHERE date BETWEEN 19990801::date AND 19990831::date GROUP BY posting_type; 991006.12:37:25.019 [1321] ProcessQuery ! system usage stats: ! 3515.408886 elapsed 3508.790000 user 2.050000 system sec ! [3779.800000 user 4.510000 sys total] ! 4/19 [23/25] filesystem blocks in/out ! 0/0 [7425/0] page faults/reclaims, 0 [0] swaps ! 0 [1] signals rcvd, 0/0 [13/15] messages rcvd/sent ! 41/34863 [7515/38533] voluntary/involuntary context switches ! postgres usage stats: ! Shared blocks: 0 read, 0 written, buffer hit rate = 100.00% ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00% ! Direct blocks: 10137 read, 10964 written 991006.13:36:00.425 [1321] CommitTransactionCommand Elapsed time: ~58 MINUTES 35 SECONDS! PSQL output =========== SELECT posting_type, COUNT(posting_type) FROM posting WHERE date BETWEEN 19990801::date AND 19990831::date GROUP BY posting_type; posting_type| count ------------+------ E | 42200 L |805124 (2 rows)
В списке pgsql-general по дате отправления:
Следующее
От: Bruce MomjianДата:
Сообщение: Re: [GENERAL] what does this mean - pg_psort.1922.0 in database directory?]