Re: incorrect row estimates for primary key join

Поиск
Список
Период
Сортировка
От Ben
Тема Re: incorrect row estimates for primary key join
Дата
Msg-id D05BECA9-0015-4B27-B974-B825D35DE1D9@gmail.com
обсуждение исходный текст
Ответ на Re: incorrect row estimates for primary key join  (Kevin Grittner <kgrittn@ymail.com>)
Ответы Re: incorrect row estimates for primary key join
Список pgsql-performance
hello --

thanks kevin for the tuning advice, i will answer your questions below and try different tuning configurations and
reportback.  but first allow me take a step back and ask a couple simple questions : 

it seems to me that an equality join between two relations (call them A and B) using columns in relation B with a
uniqueconstraint should yield row estimates which are at most equal to the row estimates for relation A.  my questions
are

1 - is this correct?

2 - does the postgresql planner implement this when generating row estimates?

it seems like if the answers to 1 and 2 are yes, then the row estimates for my join should always come back less or
equalto the estimates for jointable, regardless of what the query plan is.  indeed this is what i find experimentally
forsmaller examples.  what is perplexing to me is why this is not true for this large table.  (the fact that the table
sizeis greater than 2^31 is probably a red herring but hasn't escaped my attention.)  while i do have a performance
issue(i'd like for it to select the index scan) which might be solved by better configuration, that at the moment is a
secondaryquestion -- right now i'm interested in why the row estimates are off. 

moving on to your remarks :

On Jun 25, 2013, at 6:20 AM, Kevin Grittner wrote:

> Ben <midfield@gmail.com> wrote:
>
>> PostgreSQL 9.1.1 on x86_64-suse-linux-gnu, compiled by gcc (SUSE Linux) 4.6.2, 64-bit
>
> Consider applying the latest bug fixes for 9.1 -- which would leave
> you showing 9.1.9.

i will bring it up with our ops people.  do you have any particular fixes in mind, or is this a (very sensible) blanket
suggestion?

>> default_statistics_target    | 50                | configuration file
>
> Why did you change this from the default of 100?

sorry, i do not know.  it is possible this was copied from the configuration of a different server, which is serving
somevery very large tables with gist indexes, where the statistics do not help the selectivity estimations much if at
all(as far as i can tell gist indexes often use hard-coded selectivity estimates as opposed to using the statistics.)
inthat case it is an oversight and i will correct it.  but i believe the statistics for the tables in question are
closeenough, and certainly do not explain the off row estimates in the query plan. 

>> effective_cache_size        | 5632MB            | configuration file
>
> How much RAM is on this machine?  What else is running on it?
> (Normally people set this to 50% to 75% of total RAM.  Lower values
> discourage index usage in queries like your example.)

24GB.  i can up it to 12 or 16GB and report back.

> Do you get a different plan if you set cpu_tuple_cost = 0.03?  How
> about 0.05?  You can set this just for a single connection and run
> explain on the query to do a quick check.

setting cpu_tuple_cost to 0.03 or 0.05 has no effect on the choice of plan or the row estimates for the un-limited
queryor the limited query. 

best regards, ben



В списке pgsql-performance по дате отправления:

Предыдущее
От: Julien Cigar
Дата:
Сообщение: Re: incorrect row estimates for primary key join
Следующее
От: Kevin Grittner
Дата:
Сообщение: Re: incorrect row estimates for primary key join