Обсуждение: BUG #6587: Limit on a query is mis-documented
The following bug has been logged on the website: Bug reference: 6587 Logged by: Mike Boldi Email address: mboldi@prospectiv.com PostgreSQL version: 9.0.3 Operating system: Linux/Redhat Description:=20=20=20=20=20=20=20=20 in the manual it says If a limit count is given, no more than that many rows will be returned (but possibly less, if the query itself yields less rows) BUT !! limit caps the largest internal spool. not the rows returned. I am doing a large query so I limit the query to 5500000 but I get 88 rows returned. I increase this to 9000000 and get 136 rows returned .. explain Plan ... Aggregate (cost=3D1689951.65..1689951.66 rows=3D1 width=3D0) -> Hash Join (cost=3D644.62..1687656.17 rows=3D918194 width=3D0) Hash Cond: (a.email_a =3D o.email) -> Subquery Scan on a (cost=3D0.00..1654854.18 rows=3D9149 width= =3D64) Filter: ((a.memberid_a > a.memberid_b) AND (a.email_a =3D a.email_b) AND (length(a.email_a) > 0)) -> Limit (cost=3D0.00..1325501.10 rows=3D16467654 width=3D= 690) -> Nested Loop (cost=3D0.00..1325501.10 rows=3D16467= 654 width=3D690) -> Seq Scan on lgen_fraud_members12282=20 (cost=3D0.00..1597.14 rows=3D5381 width=3D345) Filter: ((regcomplete <> (-8)) AND (dateid >=3D 5938)) -> Index Scan using lgah11_12282 on lgen_fraud_members12282 (cost=3D0.00..123.62 rows=3D3060 width=3D345) Index Cond: ((staging.lgen_fraud_members12282.dateid >=3D staging.lgen_fraud_members12282.dateid) AND (staging.lgen_fraud_members12282.memberid > staging.lgen_fraud_members12282.memberid)) -> Hash (cost=3D393.72..393.72 rows=3D20072 width=3D32) -> Seq Scan on lgen_fraud_email o (cost=3D0.00..393.72 rows=3D20072 width=3D32)
On Fri, Apr 13, 2012 at 8:40 PM, <mboldi@prospectiv.com> wrote: > I am doing a large query so I limit the query to 5500000 but I get 88 rows > returned. I increase this to 9000000 and get 136 rows returned .. > explain Plan ... Please send the query with and without the LIMIT. It might also help to send more such as the schema and possibly even the data if that's possible. But at least send the queries so we can see what you're doing. The plan does not look correct for what you've described. -- greg
On 04/13/12 12:40 PM, mboldi@prospectiv.com wrote: > Aggregate (cost=1689951.65..1689951.66 rows=1 width=0) > -> Hash Join (cost=644.62..1687656.17 rows=918194 width=0) > Hash Cond: (a.email_a = o.email) > -> Subquery Scan on a (cost=0.00..1654854.18 rows=9149 width=64) > Filter: ((a.memberid_a> a.memberid_b) AND (a.email_a = > a.email_b) AND (length(a.email_a)> 0)) > -> Limit (cost=0.00..1325501.10 rows=16467654 width=690) > -> Nested Loop (cost=0.00..1325501.10 rows=16467654 > width=690) > -> Seq Scan on lgen_fraud_members12282 > (cost=0.00..1597.14 rows=5381 width=345) > Filter: ((regcomplete<> (-8)) AND (dateid >> >= 5938)) > -> Index Scan using lgah11_12282 on > lgen_fraud_members12282 (cost=0.00..123.62 rows=3060 width=345) > Index Cond: > ((staging.lgen_fraud_members12282.dateid>= > staging.lgen_fraud_members12282.dateid) AND > (staging.lgen_fraud_members12282.memberid> > staging.lgen_fraud_members12282.memberid)) > -> Hash (cost=393.72..393.72 rows=20072 width=32) > -> Seq Scan on lgen_fraud_email o (cost=0.00..393.72 > rows=20072 width=32) without seeing the query, I can only suggest that it appears the LIMIT is inside of a nested query or something, as it looks like it was applied prior to additional filtering and joining. -- john r pierce N 37, W 122 santa cruz ca mid-left coast