Re: anti-join chosen even when slower than old plan

От: Kevin Grittner
Тема: Re: anti-join chosen even when slower than old plan
Дата: ,
Msg-id: 4CDA62C60200002500037512@gw.wicourts.gov
(см: обсуждение, исходный текст)
Ответ на: Re: anti-join chosen even when slower than old plan  (Tom Lane)
Ответы: Re: anti-join chosen even when slower than old plan  (Tom Lane)
Re: anti-join chosen even when slower than old plan  (Robert Haas)
Список: pgsql-performance

Скрыть дерево обсуждения

anti-join chosen even when slower than old plan  ("Kevin Grittner", )
 Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
  Re: anti-join chosen even when slower than old plan  (Tom Lane, )
   Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
 Re: anti-join chosen even when slower than old plan  (Tom Lane, )
  Re: anti-join chosen even when slower than old plan  (Grzegorz Jaśkiewicz, )
  Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
   Re: anti-join chosen even when slower than old plan  (Tom Lane, )
   Re: anti-join chosen even when slower than old plan  (Robert Haas, )
    Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
     Re: anti-join chosen even when slower than old plan  (Tom Lane, )
      Re: anti-join chosen even when slower than old plan  (Robert Haas, )
       Re: anti-join chosen even when slower than old plan  (Kenneth Marshall, )
        Re: anti-join chosen even when slower than old plan  (Mladen Gogala, )
         Re: anti-join chosen even when slower than old plan  (Kenneth Marshall, )
         Re: anti-join chosen even when slower than old plan  (Bob Lunney, )
     Re: anti-join chosen even when slower than old plan  (Mladen Gogala, )
      Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
       Re: anti-join chosen even when slower than old plan  (Mladen Gogala, )
        Re: anti-join chosen even when slower than old plan  (Tom Lane, )
         Re: anti-join chosen even when slower than old plan  (Mladen Gogala, )
         Re: anti-join chosen even when slower than old plan  (Bruce Momjian, )
        Re: anti-join chosen even when slower than old plan  (Craig James, )
       Re: anti-join chosen even when slower than old plan  (Robert Haas, )
        Re: anti-join chosen even when slower than old plan  (Tom Lane, )
         Re: anti-join chosen even when slower than old plan  (Tom Lane, )
          Re: anti-join chosen even when slower than old plan  (Robert Haas, )
           Re: anti-join chosen even when slower than old plan  (Tom Lane, )
            Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
            Re: anti-join chosen even when slower than old plan  (Robert Haas, )
             Re: anti-join chosen even when slower than old plan  (<>, )
              Re: anti-join chosen even when slower than old plan  (Kenneth Marshall, )
             Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
              Re: anti-join chosen even when slower than old plan  (Robert Haas, )
               Re: anti-join chosen even when slower than old plan  (Tom Lane, )
                Re: anti-join chosen even when slower than old plan  (Robert Haas, )
                Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
                 Re: anti-join chosen even when slower than old plan  ("Marc Mamin", )
                  Re: anti-join chosen even when slower than old plan  (bricklen, )
                  Re: anti-join chosen even when slower than old plan  (Robert Haas, )
                Re: anti-join chosen even when slower than old plan  (Bruce Momjian, )
                 Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
                  Re: anti-join chosen even when slower than old plan  (Robert Haas, )
                   Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
                   Re: anti-join chosen even when slower than old plan  (Bruce Momjian, )
             Re: anti-join chosen even when slower than old plan  (Bruce Momjian, )
              Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
            Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
             Re: anti-join chosen even when slower than old plan  (Vitalii Tymchyshyn, )
              Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
               Re: anti-join chosen even when slower than old plan  (Vitalii Tymchyshyn, )
                Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
          Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
           Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
           Re: anti-join chosen even when slower than old plan  (Tom Lane, )
            Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
            Re: anti-join chosen even when slower than old plan  (Jon Nelson, )
          Re: anti-join chosen even when slower than old plan  (Andres Freund, )
         Re: anti-join chosen even when slower than old plan  (Robert Haas, )
          Re: anti-join chosen even when slower than old plan  (Tom Lane, )
    Re: anti-join chosen even when slower than old plan  (Віталій Тимчишин, )
 Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )

Tom Lane <> wrote:
> "Kevin Grittner" <> writes:
>> The semi-join and anti-join have helped us quite a bit, but we
>> have seen a situation where anti-join is chosen even though it is
>> slower than the "old fashioned" plan.  I know there have been
>> other reports of this, but I just wanted to go on record with my
>> details.
>
> In principle, the old-style plan ought to be equivalent to a
> nestloop antijoin with a seqscan of DbTranLogRecord on the outside
> and an indexscan of DbTranRepository on the inside.  Can you force
> it to choose such a plan by setting enable_mergejoin off (and
> maybe enable_hashjoin too)?

Well, I got what I think is the equivalent plan by adding OFFSET 0
to the subquery:

 Delete  (cost=0.00..1239005015.67 rows=337702752 width=6)
   ->  Seq Scan on "DbTranLogRecord"  (cost=0.00..1239005015.67
rows=337702752 width=6)
         Filter: (NOT (SubPlan 1))
         SubPlan 1
           ->  Limit  (cost=0.00..1.82 rows=1 width=974)
                 ->  Index Scan using "DbTranRepositoryPK" on
"DbTranRepository" r  (cost=0.00..1.82 rows=1 width=974)
                       Index Cond: ((("countyNo")::smallint =
($0)::smallint) AND (("tranImageSeqNo")::numeric = ($1)::numeric))

> If so, it'd be interesting to see the estimated costs and actual
> runtime on 9.0 for that plan.

Unfortunately, based on the oprofile information I decided to check
out the plan I would get by boosting cpu_index_tuple_cost by a
factor of 20.  The resulting plan was:

 Delete  (cost=132623778.83..139506491.18 rows=1 width=12)
   ->  Merge Anti Join  (cost=132623778.83..139506491.18 rows=1
width=12)
         Merge Cond: ((("DbTranLogRecord"."tranImageSeqNo")::numeric
= (r."tranImageSeqNo")::numeric) AND
(("DbTranLogRecord"."countyNo")::smallint =
(r."countyNo")::smallint))
         ->  Sort  (cost=107941675.79..109630189.55 rows=675405504
width=20)
               Sort Key: "DbTranLogRecord"."tranImageSeqNo",
"DbTranLogRecord"."countyNo"
               ->  Seq Scan on "DbTranLogRecord"
(cost=0.00..7306496.14 rows=675405504 width=20)
         ->  Materialize  (cost=24682103.04..25443983.12
rows=152376016 width=20)
               ->  Sort  (cost=24682103.04..25063043.08
rows=152376016 width=20)
                     Sort Key: r."tranImageSeqNo", r."countyNo"
                     ->  Seq Scan on "DbTranRepository" r
(cost=0.00..3793304.86 rows=152376016 width=20)

That looked like it had potential, so I started that off and went
home before I got your post.  It finished in 3 hours and 31 minutes
-- more than twice as fast as the nestloop plan used under 8.3.

But wait -- it turns out that this pain was self-inflicted.  Based
on heavy testing of the interactive queries which users run against
this database we tuned the database for "fully-cached" settings,
with both random_page_cost and _seq_page_cost at 0.1.  In a
practical sense, the users are almost always running these queries
against very recent data which is, in fact, heavily cached -- so
it's no surprise that the queries they run perform best with plans
based on such costing.  The problem is that these weekly maintenance
runs need to pass the entire database, so caching effects are far
less pronounced.  If I set seq_page_cost = 1 and random_page_cost =
2 I get exactly the same (fast) plan as above.

I guess the lesson here is not to use the same costing for
database-wide off-hours maintenance queries as for ad hoc queries
against a smaller set of recent data by users who expect quick
response time.  I'm fine with tweaking the costs in our maintenance
scripts, but it does tend to make me daydream about how the
optimizer might possibly auto-tweak such things....

I assume there's now no need to get timings for the old plan?

-Kevin


В списке pgsql-performance по дате сообщения:

От: Tom Lane
Дата:
Сообщение: Re: anti-join chosen even when slower than old plan
От: Robert Haas
Дата:
Сообщение: Re: anti-join chosen even when slower than old plan