Re: [EXT] Re: Why using a partial index is doing slightly more logical I/O than a normal index

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: [EXT] Re: Why using a partial index is doing slightly more logical I/O than a normal index
Дата
Msg-id CAMkU=1zBz4pRLiZaxMXrQPwD9Ee-XZLGHt00+QxiFf3_PamArQ@mail.gmail.com
обсуждение исходный текст
Ответ на RE: [EXT] Re: Why using a partial index is doing slightly more logical I/O than a normal index  ("Dirschel, Steve" <steve.dirschel@thomsonreuters.com>)
Список pgsql-general
On Wed, May 3, 2023 at 2:00 PM Dirschel, Steve <steve.dirschel@thomsonreuters.com> wrote:

Thanks for the reply Jeff.  Yes-  more of an academic question.  Regarding this part:

 

   Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))

   Filter: (deleted_millis <= 0)

   Buffers: shared hit=24

 

For this usage, the =ANY is applied as an "in-index filter".  It only descends the index once, to where workflow_id=1070, and then scans forward applying the =ANY to each index-tuple until it exhausts the =1070 condition.  As long as all the =1070 entries fit into just a few buffers, the count of buffers accessed by doing this is fewer than doing the re-descents.  (Stepping from tuple to tuple in the same index page doesn't count as a new access.  While a re-descent releases and reacquires the buffer)

 

There are 2,981,425 rows where workflow_id = 1070.  Does that change your theory of using an “in-index filter” for that plan? 


Yes.  There is no way that that many index tuples will fit in just 24 index leaf pages, so I think it must be using the re-descending method for both plans.  Then I don't know why there is a difference in the number of buffer accesses.  But the difference seems trivial, so I wouldn't put much effort into investigating it.

 

When you say there was a bit of speculation on the “boundard condition” vs “in-index filter” is the speculation on if Postgres has 2 different ways of processing a =ANY filter or is the speculation that one is being used by one plan and the other is being used by the other plan?


The speculation was that this applied to your query.  But going back to some of my original tests, I see that I remembered some of the details wrong on the broader topic as well.  When it voluntarily doesn't use the =ANY as a boundary condition, that shows up in the plan as having the condition evicted from "Index Cond" line and instead show up in a "Filter" line, at least in my one test case (which means it is no longer an in-index filter, as it jumps to the table and verifies visibility before applying the filter).  So the thing that the plans really don't distinguish is between when it just chooses not to use the extra index column for cost reasons, from when it thinks it is unable to use it for correctness/safety reasons.

Cheers,

Jeff

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Invoking SQL function while doing CREATE OR REPLACE on it
Следующее
От: Evan Rempel
Дата:
Сообщение: Re: RHEL repo package crc mismatches