Обсуждение: Query plan good in 8.4, bad in 9.2 and better in 9.3
OK so we have a query that does OK in 8.4, goes to absolute crap in 9.2 and then works great in 9.3. Thing is we've spent several months regression testing 9.2 and no time testing 9.3, so we can't just "go to 9.3" in an afternoon. But we might have to. 9.2 seems hopelessly broken here. The query looks something like this: SELECT COUNT(*) FROM u, ug WHERE u.ugid = ug.id AND NOT u.d AND ug.somefield IN (SELECT somefunction(12345)); In 8.4 we get this plan http://explain.depesz.com/s/r3hF which takes ~5ms In 9.2 we get this plan http://explain.depesz.com/s/vM7 which takes ~10s In 9.3 we get this plan http://explain.depesz.com/s/Wub which takes ~0.35ms The data sets are identical, the schemas are identical. Making changes to random_page_cost, sequential_page_cost and various other tuning parameters don't make it any better. PG versions: 8.4.20, 9.2.8, 9.3.4 Adding a limit to the function DOES make 9.2 better, ala: SELECT COUNT(*) FROM u, ug WHERE u.ugid = ug.id AND NOT u.d AND ug.somefield IN (SELECT somefunction(12345) limit 199); If the limit is 200 the bad plan shows up again. Question, is this a known issue with 9.2? If so is it something that will one day be fixed or are we stuck with it? Is there a workaround to make it better? Note: I'd rather not have to compile 9.2 from source with a patch, but at this point that would be acceptable over "you're stuck with it". -- To understand recursion, one must first understand recursion.
Scott Marlowe <scott.marlowe@gmail.com> writes: > OK so we have a query that does OK in 8.4, goes to absolute crap in > 9.2 and then works great in 9.3. Thing is we've spent several months > regression testing 9.2 and no time testing 9.3, so we can't just "go > to 9.3" in an afternoon. But we might have to. 9.2 seems hopelessly > broken here. > The query looks something like this: > SELECT COUNT(*) FROM u, ug > WHERE u.ugid = ug.id > AND NOT u.d > AND ug.somefield IN (SELECT somefunction(12345)); You really should show us somefunction's definition if you want useful comments. I gather however that it returns a set. 8.4 seems to be planning on the assumption that the set contains only one row, which is completely unjustified in general though it happens to be true in your example. 9.2 is assuming 1000 rows in the set, and getting a sucky plan because that's wrong. 9.3 is still assuming that; and I rather doubt that you are really testing 9.3 on the same data, because 9.2 is finding millions of rows in a seqscan of u while 9.3 is finding none in the exact same seqscan. I'd suggest affixing a ROWS estimate to somefunction, or better declaring it to return singleton not set if that's actually always the case. regards, tom lane
On Thu, May 15, 2014 at 9:35 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
OK so we have a query that does OK in 8.4, goes to absolute crap in
9.2 and then works great in 9.3. Thing is we've spent several months
regression testing 9.2 and no time testing 9.3, so we can't just "go
to 9.3" in an afternoon. But we might have to. 9.2 seems hopelessly
broken here.
The query looks something like this:
SELECT COUNT(*) FROM u, ug
WHERE u.ugid = ug.id
AND NOT u.d
AND ug.somefield IN (SELECT somefunction(12345));
In 8.4 we get this plan http://explain.depesz.com/s/r3hF which takes ~5ms
In 9.2 we get this plan http://explain.depesz.com/s/vM7 which takes ~10s
In 9.3 we get this plan http://explain.depesz.com/s/Wub which takes ~0.35ms
Based on the actual row counts given in the seq scan on u, , in 9.2, u contains millions of rows. In 9.3, it contains zero rows.
The data sets are identical, the schemas are identical.
Please double check that.
Cheers,
Jeff
On Thu, May 15, 2014 at 10:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Scott Marlowe <scott.marlowe@gmail.com> writes: >> OK so we have a query that does OK in 8.4, goes to absolute crap in >> 9.2 and then works great in 9.3. Thing is we've spent several months >> regression testing 9.2 and no time testing 9.3, so we can't just "go >> to 9.3" in an afternoon. But we might have to. 9.2 seems hopelessly >> broken here. > >> The query looks something like this: > >> SELECT COUNT(*) FROM u, ug >> WHERE u.ugid = ug.id >> AND NOT u.d >> AND ug.somefield IN (SELECT somefunction(12345)); > > You really should show us somefunction's definition if you want > useful comments. I gather however that it returns a set. 8.4 > seems to be planning on the assumption that the set contains > only one row, which is completely unjustified in general though > it happens to be true in your example. 9.2 is assuming 1000 rows > in the set, and getting a sucky plan because that's wrong. 9.3 > is still assuming that; and I rather doubt that you are really > testing 9.3 on the same data, because 9.2 is finding millions of > rows in a seqscan of u while 9.3 is finding none in the exact > same seqscan. > > I'd suggest affixing a ROWS estimate to somefunction, or better > declaring it to return singleton not set if that's actually > always the case. Well great, now I look like an idiot. Last time I trust someone else to set up my test servers. Anyway, yeah, affixing a rows estimate fixes this for us 100%. So thanks!