Discussion: sort operation leads planner to different number of rows?
I'm in the process of upgrading one of my servers from 7.3 to 8.1, and have run across a query that is slower on the new 8.1 box. FWIW, the data is all freshly loaded and freshly analyzed, and this is 8.1.1 to be precise.

The part that I am really curious about right now is this snippet of the explain plan:

  Sort (cost=616.64..620.56 rows=1568 width=12) (actual time=46.579..54.641 rows=6407 loops=1)
    Sort Key: latest_download.host_id
    -> Subquery Scan latest_download (cost=498.14..533.42 rows=1568 width=12) (actual time=43.657..45.594 rows=472 loops=1)

I am wondering why it would end up with a different number of rows after the sort operation. If you want to see the full explain analyze, it's at http://rafb.net/paste/results/D8lq9v79.html; these are lines 29-31.

TIA

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Robert Treat <xzilla@users.sourceforge.net> writes:
> Sort (cost=616.64..620.56 rows=1568 width=12) (actual time=46.579..54.641 rows=6407 loops=1)
> Sort Key: latest_download.host_id
> -> Subquery Scan latest_download (cost=498.14..533.42 rows=1568 width=12) (actual time=43.657..45.594 rows=472 loops=1)
> I am wondering why it would end up with a different number of rows after
> the sort operation.
The planner's estimate didn't change: 1568 at both steps. The "actual"
is the number of rows actually pulled from the node at runtime, and the
discrepancy here occurs because this is the inner side of a mergejoin.
A mergejoin has to rescan duplicate inner rows to join them to duplicate
outer rows. It looks like you have a pretty fair number of
duplicates...
regards, tom lane
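
The explanation above can be illustrated with a toy merge join. This is only a minimal sketch in Python, not PostgreSQL's executor code: the function name `merge_join`, the list-based inputs, and the `fetched` counter are all assumptions made for illustration. It counts how many rows are consumed from the sorted inner input, rescans included, which is roughly what the "actual rows" figure of the inner Sort node reflects.

```python
def merge_join(outer, inner):
    """Toy merge join over two ascending lists of join keys.

    Returns (pairs, fetched), where `fetched` counts every row consumed
    from the inner input, including rows re-read after a restore.
    Hypothetical model for illustration only.
    """
    pairs = []
    fetched = 0      # rows pulled from the inner side, rescans included
    i = 0            # current position in the inner input
    mark = 0         # start of the current run of equal inner keys
    prev = None      # previous outer key

    for o in outer:
        if prev is not None and o == prev:
            # Duplicate outer key: go back to the marked position and
            # read the matching inner rows again.
            i = mark
        else:
            # New outer key: skip inner rows that sort before it.
            while i < len(inner) and inner[i] < o:
                fetched += 1
                i += 1
            mark = i
        # Join the outer row to every inner row with the same key.
        while i < len(inner) and inner[i] == o:
            fetched += 1
            pairs.append((o, inner[i]))
            i += 1
        prev = o
    return pairs, fetched


inner = [0, 1, 1, 2, 3]              # 5 rows in the sorted inner input
pairs, fetched = merge_join([1, 1, 1, 2], inner)
print(len(inner), fetched)           # prints "5 8"
```

With three outer rows sharing key 1, the two matching inner rows are read three times, so the inner side reports 8 rows consumed even though it holds only 5. The same effect, at a larger scale, would account for a Sort node returning 6407 rows when its input produced 472.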