Re: [EXT] Re: Why using a partial index is doing slightly more logical I/O than a normal index

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [EXT] Re: Why using a partial index is doing slightly more logical I/O than a normal index
Дата
Msg-id 3002223.1683140400@sss.pgh.pa.us
обсуждение исходный текст
Ответ на RE: [EXT] Re: Why using a partial index is doing slightly more logical I/O than a normal index  ("Dirschel, Steve" <steve.dirschel@thomsonreuters.com>)
Список pgsql-general
"Dirschel, Steve" <steve.dirschel@thomsonreuters.com> writes:
> There are 2,981,425 rows where workflow_id = 1070.  Does that change your theory of using an “in-index filter” for
thatplan?  When you say there was a bit of speculation on the “boundard condition” vs “in-index filter” is the
speculationon if Postgres has 2 different ways of processing a =ANY filter or is the speculation that one is being used
byone plan and the other is being used by the other plan? 

I don't believe the intelligence Jeff is postulating actually exists.
I see only one code path for this in nbtree.c, and what it's doing is
what he called the "boundary condition" implementation.  That is, it runs
one index search for "workflow_id = 1070 AND status = 'NOT_STARTED'",
then one for "workflow_id = 1070 AND status = 'PAUSED'", etc,
re-descending the index tree at the start of each of those four scans.

I'm not inclined to ascribe any great significance to the varying numbers
of buffer hits you observed.  I think that's likely explained by chance
layout of the two indexes' contents, so that some of these searches cross
different numbers of index pages even though they visit the same number of
index entries overall.  In particular, it doesn't look like the partial
index is buying anything for this specific test case.  The index's
constraint on "status" matters not at all, because in neither index will
we ever visit any regions of the index where other values of "status"
appear (save the one following entry where we detect that the status value
no longer matches in each scan; but it doesn't really matter what that
entry is).  The constraint on "deleted_millis" could help, but your second
EXPLAIN result shows that it didn't actually eliminate any rows:

>> Index Scan using test_workflow_execution_initial_ui_tabs on workflow_execution_test  (cost=0.56..15820.19 rows=4335
width=1309)(actual time=0.049..0.106 rows=56 loops=1) 
>>    Index Cond: ((workflow_id = 1070) AND ((status)::text = ANY ('{NOT_STARTED,PAUSED,PENDING,RUNNING}'::text[])))
>>    Filter: (deleted_millis <= 0)

(note the lack of any "Rows Removed by Filter" line).  We can therefore
conclude that the index regions satisfying the workflow_id+status
conditions had no entries with deleted_millis <= 0 either.  So these
two test cases visited exactly the same number of index entries, and
any differences in "buffers hit" had to be chance layout effects, or
possibly the result of different index entry sizes.  How large is
that "result" column in reality?

            regards, tom lane



В списке pgsql-general по дате отправления:

Предыдущее
От: "Nagendra Mahesh (namahesh)"
Дата:
Сообщение: Invoking SQL function while doing CREATE OR REPLACE on it
Следующее
От: Erik Wienhold
Дата:
Сообщение: Re: Invoking SQL function while doing CREATE OR REPLACE on it