Re: Why is GIN index slowing down my query?

From: Tom Lane
Subject: Re: Why is GIN index slowing down my query?
Msg-id: 12830.1422846945@sss.pgh.pa.us
In reply to: Re: Why is GIN index slowing down my query?  (AlexK987 <alex.cue.987@gmail.com>)
List: pgsql-performance
AlexK987 <alex.cue.987@gmail.com> writes:
> This is a realistic case: everyone have Python and Java skills, but PostGis
> and Haskell and Closure are rare. If we are looking for a person that has
> all the skills required for a task (array[1, 15]), that is "skills <@
> array[1, 15] " and not the opposite, right?

One of us has this backwards.  It might be me, but I don't think so.
Consider a person who has the two desired skills plus skill #42:

regression=# select array[1,15,42] <@ array[1,15];
 ?column?
----------
 f
(1 row)

regression=# select array[1,15,42] @> array[1,15];
 ?column?
----------
 t
(1 row)
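
The same distinction can be seen against a table. The following is a minimal sketch, assuming a throwaway table named talent_demo (hypothetical; only the column name skills is taken from the thread) with three made-up rows:

```sql
-- Hypothetical miniature table; the rows are invented for illustration.
CREATE TEMP TABLE talent_demo (id int, skills int[]);
INSERT INTO talent_demo VALUES
  (1, array[1,15]),      -- exactly the two desired skills
  (2, array[1,15,42]),   -- the two desired skills plus skill #42
  (3, array[1]);         -- only one of the desired skills

-- "skills contains the required list": matches ids 1 and 2.
SELECT id FROM talent_demo WHERE skills @> array[1,15] ORDER BY id;

-- "skills is contained in the required list": matches ids 1 and 3.
SELECT id FROM talent_demo WHERE skills <@ array[1,15] ORDER BY id;
```

Under these assumptions, the person with the extra skill #42 is found only by the @> form, which is the one that answers "has all the skills required for a task".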

> Also can you explain why " entries for "0" and "1" swamp everything else so
> that the planner
> doesn't know that eg "15" is really rare. " I thought that if a value is not
> found in the histogram, than clearly that value is rare, correct? What am I
> missing here?

The problem is *how* rare.  The planner will take the lowest frequency
seen among the most common elements as an upper bound for the frequency of
unlisted elements --- but if all you have in the stats array is 0 and 1,
and they both have frequency 1.0, that doesn't tell you anything.  And
that's what I see for this example:

regression=# select most_common_elems,most_common_elem_freqs from pg_stats where tablename = 'talent' and attname = 'skills';
 most_common_elems | most_common_elem_freqs
-------------------+------------------------
 {0,1}             | {1,1,1,1,0}
(1 row)

With a less skewed distribution, that rule of thumb would work better :-(
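
To make the numbers above easier to read, here is a small sketch that splits the frequency array into its parts. It assumes the layout described in the pg_stats documentation, where the per-element frequencies are followed by three summary values (minimum, maximum, and null-element frequency); the array literal is copied from the output above, and the alias names are made up for illustration.

```sql
-- f is the most_common_elem_freqs value shown above;
-- n is the number of entries in most_common_elems ({0,1} has 2).
WITH s AS (
  SELECT '{1,1,1,1,0}'::real[] AS f, 2 AS n
)
SELECT f[1:n] AS per_element_freqs,  -- frequencies of elements 0 and 1
       f[n+1] AS min_freq,           -- lowest frequency among listed elements
       f[n+2] AS max_freq,           -- highest frequency among listed elements
       f[n+3] AS null_elem_freq      -- frequency of null elements
FROM s;
```

If that reading is right, min_freq comes out as 1, so the upper bound applied to an unlisted element such as 15 is a frequency of 1.0, which rules nothing out.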

            regards, tom lane

