Обсуждение: BUG #6587: Limit on a query is mis-documented

Поиск
Список
Период
Сортировка

BUG #6587: Limit on a query is mis-documented

От
mboldi@prospectiv.com
Дата:
The following bug has been logged on the website:

Bug reference:      6587
Logged by:          Mike Boldi
Email address:      mboldi@prospectiv.com
PostgreSQL version: 9.0.3
Operating system:   Linux/Redhat
Description:=20=20=20=20=20=20=20=20

in the manual it says
If a limit count is given, no more than that many rows will be returned (but
possibly less, if the query itself yields less rows)
BUT !!  limit caps the largest internal spool. not the rows returned.
I am doing a large query so I limit the query to 5500000 but I get 88 rows
returned. I increase this to 9000000 and get 136 rows returned ..
explain Plan ...
Aggregate  (cost=3D1689951.65..1689951.66 rows=3D1 width=3D0)
   ->  Hash Join  (cost=3D644.62..1687656.17 rows=3D918194 width=3D0)
         Hash Cond: (a.email_a =3D o.email)
         ->  Subquery Scan on a  (cost=3D0.00..1654854.18 rows=3D9149 width=
=3D64)
               Filter: ((a.memberid_a > a.memberid_b) AND (a.email_a =3D
a.email_b) AND (length(a.email_a) > 0))
               ->  Limit  (cost=3D0.00..1325501.10 rows=3D16467654 width=3D=
690)
                     ->  Nested Loop  (cost=3D0.00..1325501.10 rows=3D16467=
654
width=3D690)
                           ->  Seq Scan on lgen_fraud_members12282=20
(cost=3D0.00..1597.14 rows=3D5381 width=3D345)
                                 Filter: ((regcomplete <> (-8)) AND (dateid
>=3D 5938))
                           ->  Index Scan using lgah11_12282 on
lgen_fraud_members12282  (cost=3D0.00..123.62 rows=3D3060 width=3D345)
                                 Index Cond:
((staging.lgen_fraud_members12282.dateid >=3D
staging.lgen_fraud_members12282.dateid) AND
(staging.lgen_fraud_members12282.memberid >
staging.lgen_fraud_members12282.memberid))
         ->  Hash  (cost=3D393.72..393.72 rows=3D20072 width=3D32)
               ->  Seq Scan on lgen_fraud_email o  (cost=3D0.00..393.72
rows=3D20072 width=3D32)

Re: BUG #6587: Limit on a query is mis-documented

От
Greg Stark
Дата:
On Fri, Apr 13, 2012 at 8:40 PM,  <mboldi@prospectiv.com> wrote:
> I am doing a large query so I limit the query to 5500000 but I get 88 rows
> returned. I increase this to 9000000 and get 136 rows returned ..
> explain Plan ...

Please send the query with and without the LIMIT.

It might also help to send more such as the schema and possibly even
the data if that's possible. But at least send the queries so we can
see what you're doing. The plan does not look correct for what you've
described.

--
greg

Re: BUG #6587: Limit on a query is mis-documented

От
John R Pierce
Дата:
On 04/13/12 12:40 PM, mboldi@prospectiv.com wrote:
> Aggregate  (cost=1689951.65..1689951.66 rows=1 width=0)
>     ->   Hash Join  (cost=644.62..1687656.17 rows=918194 width=0)
>           Hash Cond: (a.email_a = o.email)
>           ->   Subquery Scan on a  (cost=0.00..1654854.18 rows=9149 width=64)
>                 Filter: ((a.memberid_a>  a.memberid_b) AND (a.email_a =
> a.email_b) AND (length(a.email_a)>  0))
>                 ->   Limit  (cost=0.00..1325501.10 rows=16467654 width=690)
>                       ->   Nested Loop  (cost=0.00..1325501.10 rows=16467654
> width=690)
>                             ->   Seq Scan on lgen_fraud_members12282
> (cost=0.00..1597.14 rows=5381 width=345)
>                                   Filter: ((regcomplete<>  (-8)) AND (dateid
>> >= 5938))
>                             ->   Index Scan using lgah11_12282 on
> lgen_fraud_members12282  (cost=0.00..123.62 rows=3060 width=345)
>                                   Index Cond:
> ((staging.lgen_fraud_members12282.dateid>=
> staging.lgen_fraud_members12282.dateid) AND
> (staging.lgen_fraud_members12282.memberid>
> staging.lgen_fraud_members12282.memberid))
>           ->   Hash  (cost=393.72..393.72 rows=20072 width=32)
>                 ->   Seq Scan on lgen_fraud_email o  (cost=0.00..393.72
> rows=20072 width=32)

without seeing the query, I can only suggest that it appears the LIMIT
is inside of a nested query or something, as it looks like it was
applied prior to additional filtering and joining.







--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast