Re: [HACKERS] Surjective functional indexes

Поиск
Список
Период
Сортировка
От Konstantin Knizhnik
Тема Re: [HACKERS] Surjective functional indexes
Дата
Msg-id 188663b1-3392-2e2a-0f41-cf1e1e3200ae@postgrespro.ru
обсуждение исходный текст
Ответ на Re: [HACKERS] Surjective functional indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers

On 09.11.2018 2:27, Tom Lane wrote:
> I wrote:
>> The bigger picture here, and the reason for my skepticism about having
>> any intelligence in the enabling logic, is that there is no scenario
>> in which this code can be smarter than the user about what to do.
>> We have no insight today, and are unlikely to have any in future, about
>> whether a specific index expression is many-to-one or not.
> Hmm ... actually, I take that back.  Since we're only interested in this
> for expression indexes, we can expect that statistics will be available
> for the expression index, at least for tables that have been around
> long enough that UPDATE performance is really an exciting topic.
> So you could imagine pulling up the stadistinct numbers for the index
> column(s) and the underlying column(s), and enabling the optimization
> when their ratio is less than $something.  Figuring out how to merge
> numbers for multiple columns might be tricky, but it's not going to be
> completely fact-free.  (I still think that the cost-estimate logic is
> quite bogus, however.)
>
> Another issue in all this is the cost of doing this work over again
> after any relcache flush.  Maybe we could move the responsibility
> into ANALYZE?
>
> BTW, the existing code appears to be prepared to enable this logic
> if *any* index column is an expression, but surely we should do so
> only if they are *all* expressions?
>
>             regards, tom lane

 From my point of view "auto" value should be default, otherwise it has 
not so much sense.
If somebody decides to switch on this optimization for some particular 
index, then it will set it to "on", not "auto".
So I agree with your previous opinion, that if this optimization is 
disabled by default, then it is enough to have boolean parameter.

Concerning muticolumn indexes: why we should apply this optimization 
only if *all* of index columns are expressions?
Assume very simple example: we have some kind of document storage 
represented by the following table:

      create table document(owner integer, name text, last_updated 
timestamp, description json);

So there are some static document attributes (name, date,...) and some 
dynamic, stored in json field.
Consider that most frequently users will search among their own documents.
So we may create index like this:

      create index by_title on documents(owner,(description->>'title'));

Document description may include many attributes which are updated quite 
frequently, like "comments", "keywords",...
But "title" is rarely changed, so this optimization will be very useful 
for such index.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: BUG #15212: Default values in partition tables don't work asexpected and allow NOT NULL violation
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: Alternative to \copy in psql modelled after \g