Re: anti-join chosen even when slower than old plan

От:
Тема: Re: anti-join chosen even when slower than old plan
Дата: ,
Msg-id: 20101111155625.AQO09772@ms14.lnh.mail.rcn.net
(см: обсуждение, исходный текст)
Ответ на: Re: anti-join chosen even when slower than old plan  (Robert Haas)
Ответы: Re: anti-join chosen even when slower than old plan  (Kenneth Marshall)
Список: pgsql-performance

Скрыть дерево обсуждения

anti-join chosen even when slower than old plan  ("Kevin Grittner", )
 Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
  Re: anti-join chosen even when slower than old plan  (Tom Lane, )
   Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
 Re: anti-join chosen even when slower than old plan  (Tom Lane, )
  Re: anti-join chosen even when slower than old plan  (Grzegorz Jaśkiewicz, )
  Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
   Re: anti-join chosen even when slower than old plan  (Tom Lane, )
   Re: anti-join chosen even when slower than old plan  (Robert Haas, )
    Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
     Re: anti-join chosen even when slower than old plan  (Tom Lane, )
      Re: anti-join chosen even when slower than old plan  (Robert Haas, )
       Re: anti-join chosen even when slower than old plan  (Kenneth Marshall, )
        Re: anti-join chosen even when slower than old plan  (Mladen Gogala, )
         Re: anti-join chosen even when slower than old plan  (Kenneth Marshall, )
         Re: anti-join chosen even when slower than old plan  (Bob Lunney, )
     Re: anti-join chosen even when slower than old plan  (Mladen Gogala, )
      Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
       Re: anti-join chosen even when slower than old plan  (Mladen Gogala, )
        Re: anti-join chosen even when slower than old plan  (Tom Lane, )
         Re: anti-join chosen even when slower than old plan  (Mladen Gogala, )
         Re: anti-join chosen even when slower than old plan  (Bruce Momjian, )
        Re: anti-join chosen even when slower than old plan  (Craig James, )
       Re: anti-join chosen even when slower than old plan  (Robert Haas, )
        Re: anti-join chosen even when slower than old plan  (Tom Lane, )
         Re: anti-join chosen even when slower than old plan  (Tom Lane, )
          Re: anti-join chosen even when slower than old plan  (Robert Haas, )
           Re: anti-join chosen even when slower than old plan  (Tom Lane, )
            Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
            Re: anti-join chosen even when slower than old plan  (Robert Haas, )
             Re: anti-join chosen even when slower than old plan  (<>, )
              Re: anti-join chosen even when slower than old plan  (Kenneth Marshall, )
             Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
              Re: anti-join chosen even when slower than old plan  (Robert Haas, )
               Re: anti-join chosen even when slower than old plan  (Tom Lane, )
                Re: anti-join chosen even when slower than old plan  (Robert Haas, )
                Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
                 Re: anti-join chosen even when slower than old plan  ("Marc Mamin", )
                  Re: anti-join chosen even when slower than old plan  (bricklen, )
                  Re: anti-join chosen even when slower than old plan  (Robert Haas, )
                Re: anti-join chosen even when slower than old plan  (Bruce Momjian, )
                 Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
                  Re: anti-join chosen even when slower than old plan  (Robert Haas, )
                   Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
                   Re: anti-join chosen even when slower than old plan  (Bruce Momjian, )
             Re: anti-join chosen even when slower than old plan  (Bruce Momjian, )
              Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
            Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
             Re: anti-join chosen even when slower than old plan  (Vitalii Tymchyshyn, )
              Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
               Re: anti-join chosen even when slower than old plan  (Vitalii Tymchyshyn, )
                Re: anti-join chosen even when slower than old plan  (Cédric Villemain, )
          Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
           Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
           Re: anti-join chosen even when slower than old plan  (Tom Lane, )
            Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )
            Re: anti-join chosen even when slower than old plan  (Jon Nelson, )
          Re: anti-join chosen even when slower than old plan  (Andres Freund, )
         Re: anti-join chosen even when slower than old plan  (Robert Haas, )
          Re: anti-join chosen even when slower than old plan  (Tom Lane, )
    Re: anti-join chosen even when slower than old plan  (Віталій Тимчишин, )
 Re: anti-join chosen even when slower than old plan  ("Kevin Grittner", )

---- Original message ----
>Date: Thu, 11 Nov 2010 15:29:40 -0500
>From:  (on behalf of Robert Haas <>)
>Subject: Re: [PERFORM] anti-join chosen even when slower than old plan
>To: Tom Lane <>
>Cc: Kevin Grittner <>,Mladen Gogala
<>,""<> 
>
>On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane <> wrote:
>> Robert Haas <> writes:
>>> Yeah.  For Kevin's case, it seems like we want the caching percentage
>>> to vary not so much based on which table we're hitting at the moment
>>> but on how much of it we're actually reading.
>>
>> Well, we could certainly take the expected number of pages to read and
>> compare that to effective_cache_size.  The thing that's missing in that
>> equation is how much other stuff is competing for cache space.  I've
>> tried to avoid having the planner need to know the total size of the
>> database cluster, but it's kind of hard to avoid that if you want to
>> model this honestly.
>
>I'm not sure I agree with that.  I mean, you could easily have a
>database that is much larger than effective_cache_size, but only that
>much of it is hot.  Or, the hot portion could move around over time.
>And for reasons of both technical complexity and plan stability, I
>don't think we want to try to model that.  It seems perfectly
>reasonable to say that reading 25% of effective_cache_size will be
>more expensive *per-page* than reading 5% of effective_cache_size,
>independently of what the total cluster size is.
>
>> Would it be at all workable to have an estimate that so many megs of a
>> table are in cache (independently of any other table), and then we could
>> scale the cost based on the expected number of pages to read versus that
>> number?  The trick here is that DBAs really aren't going to want to set
>> such a per-table number (at least, most of the time) so we need a
>> formula to get to a default estimate for that number based on some simple
>> system-wide parameters.  I'm not sure if that's any easier.
>
>That's an interesting idea.  For the sake of argument, suppose we
>assume that a relation which is less than 5% of effective_cache_size
>will be fully cached; and anything larger we'll assume that much of it
>is cached.  Consider a 4GB machine with effective_cache_size set to
>3GB.  Then we'll assume that any relation less than 153MB table is
>100% cached, a 1 GB table is 15% cached, and a 3 GB table is 5%
>cached.  That doesn't seem quite right, though: the caching percentage
>drops off very quickly after you exceed the threshold.
>
>*thinks*
>
>I wondering if we could do something with a formula like 3 *
>amount_of_data_to_read / (3 * amount_of_data_to_read +
>effective_cache_size) = percentage NOT cached.  That is, if we're
>reading an amount of data equal to effective_cache_size, we assume 25%
>caching, and plot a smooth curve through that point.  In the examples
>above, we would assume that a 150MB read is 87% cached, a 1GB read is
>50% cached, and a 3GB read is 25% cached.
>
>> BTW, it seems that all these variants have an implicit assumption that
>> if you're reading a small part of the table it's probably part of the
>> working set; which is an assumption that could be 100% wrong.  I don't
>> see a way around it without trying to characterize the data access at
>> an unworkably fine level, though.
>
>Me neither, but I think it will frequently be true, and I'm not sure
>it will hurt very much when it isn't.  I mean, if you execute the same
>query repeatedly, that data will become hot soon enough.  If you
>execute a lot of different queries that each touch a small portion of
>a big, cold table, we might underestimate the costs of the index
>probes, but so what?  There's probably no better strategy for
>accessing that table anyway.  Perhaps you can construct an example
>where this underestimate affects the join order in an undesirable
>fashion, but I'm having a hard time getting worked up about that as a
>potential problem case.  Our current system - where we essentially
>assume that the caching percentage is uniform across the board - can
>have the same problem in less artificial cases.
>
>--
>Robert Haas
>EnterpriseDB: http://www.enterprisedb.com
>The Enterprise PostgreSQL Company
>
>--
>Sent via pgsql-performance mailing list ()
>To make changes to your subscription:
>http://www.postgresql.org/mailpref/pgsql-performance

On a thread some time ago, on a similar subject, I opined that I missed the ability to assign tables to tablespaces and
buffersto tablespaces, thus having the ability to isolate needed tables (perhaps a One True Lookup Table, for example;
ora Customer table) to memory without fear of eviction. 

I was sounding beaten about the face and breast.  It really is an "Enterprise" way of handling the situation.

regards,
Robert


В списке pgsql-performance по дате сообщения:

От: Jon Nelson
Дата:
Сообщение: Re: postmaster consuming /lots/ of memory with hash aggregate. why?
От: Pavel Stehule
Дата:
Сообщение: Re: postmaster consuming /lots/ of memory with hash aggregate. why?