Re: Thinking About Correlated Columns (again)

От: Andrew Dunstan
Тема: Re: Thinking About Correlated Columns (again)
Дата: ,
Msg-id: 5193BA52.7040007@dunslane.net
(см: обсуждение, исходный текст)
Ответ на: Re: Thinking About Correlated Columns (again)  (Craig James)
Список: pgsql-performance

Скрыть дерево обсуждения

Thinking About Correlated Columns (again)  (Shaun Thomas, )
 Re: Thinking About Correlated Columns (again)  (Heikki Linnakangas, )
  Re: Thinking About Correlated Columns (again)  (Shaun Thomas, )
  Re: Thinking About Correlated Columns (again)  (Nikolas Everett, )
   Re: Thinking About Correlated Columns (again)  (eggyknap, )
  Re: Thinking About Correlated Columns (again)  (Gavin Flower, )
 Re: Thinking About Correlated Columns (again)  (Craig James, )
  Re: Thinking About Correlated Columns (again)  (Andrew Dunstan, )
  Re: Thinking About Correlated Columns (again)  (Gavin Flower, )
   Re: Thinking About Correlated Columns (again)  (Craig James, )
 Re: Thinking About Correlated Columns (again)  (Thomas Kellerer, )
  Re: Thinking About Correlated Columns (again)  (Shaun Thomas, )

On 05/15/2013 12:23 PM, Craig James wrote:
> On Wed, May 15, 2013 at 8:31 AM, Shaun Thomas
> < <mailto:>> wrote:
>
>     [Inefficient plans for correlated columns] has been a pain point
>     for quite a while. While we've had several discussions in the
>     area, it always seems to just kinda trail off and eventually
>     vanish every time it comes up.
>
>     ...
>     Since there really is no fix for this aside from completely
>     rewriting the query or horribly misusing CTEs (to force sequence
>     scans instead of bloated nested loops, for example)...
>     I'm worried that without an easy answer for cases like this, more
>     DBAs will abuse optimization fences to get what they want and
>     we'll end up in a scenario that's actually worse than query hints.
>     Theoretically, query hints can be deprecated or have GUCs to
>     remove their influence, but CTEs are forever, or until the next
>     code refactor.
>
>     I've seen conversations on this since at least 2005. There were
>     even proposed patches every once in a while, but never any
>     consensus. Anyone care to comment?
>
>
> It's a very hard problem.  There's no way you can keep statistics
> about all possible correlations since the number of possibilities is
> O(N^2) with the number of columns.
>

I don't see why we couldn't allow the DBA to specify some small subset
of the combinations of columns for which correlation stats would be
needed. I suspect in most cases no more than a handful for any given
table would be required.

cheers

andrew



В списке pgsql-performance по дате сообщения:

От: Craig James
Дата:
Сообщение: Re: Thinking About Correlated Columns (again)
От: Andrea Suisani
Дата:
Сообщение: Re: [OT] linux 3.10 kernel will improve ipc,sysv semaphore scalability