Re: Index Skip Scan

Поиск
Список
Период
Сортировка
От Alexander Kuzmenkov
Тема Re: Index Skip Scan
Дата
Msg-id aa0e6d35-5772-d5cb-04b7-65166eaa399f@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Index Skip Scan  (Jesper Pedersen <jesper.pedersen@redhat.com>)
Ответы Re: Index Skip Scan  (James Coleman <jtc331@gmail.com>)
Re: Index Skip Scan  (Jesper Pedersen <jesper.pedersen@redhat.com>)
Список pgsql-hackers
On 9/27/18 16:59, Jesper Pedersen wrote:
> Hi Dmitry,
>
> On 9/15/18 3:52 PM, Dmitry Dolgov wrote:
>>> On Thu, 13 Sep 2018 at 21:36, Alexander Kuzmenkov 
>>> <a.kuzmenkov@postgrespro.ru> wrote:
>>> El 13/09/18 a las 18:39, Jesper Pedersen escribió:
>>>> I think we can improve this,
>>>> and the skip scan can be strictly faster than index scan regardless of
>>>> the data.
>>>> <...>
>>>>
>>>
>>> This is something to look at -- maybe there is a way to use
>>> btpo_next/btpo_prev instead/too in order to speed things up. Atm we 
>>> just
>>> have the scan key in BTScanOpaqueData. I'll take a look after my
>>> upcoming vacation; feel free to contribute those changes in the 
>>> meantime
>>> of course.
>>
>> But having this logic inside _bt_next means that it will make a 
>> non-skip index
>> only scan a bit slower, am I right?
>
> Correct.

Well, it depends on how the skip scan is implemented. We don't have to 
make normal scans slower, because skip scan is just a separate thing.

My main point was that current implementation is good as a proof of 
concept, but it is inefficient for some data and needs some unreliable 
planner logic to work around this inefficiency. And now we also have 
execution-time fallback because planning-time fallback isn't good 
enough. This looks overly complicated. Let's try to design an algorithm 
that works well regardless of the particular data and doesn't need these 
heuristics. It should be possible to do so for btree.

Of course, I understand the reluctance to implement an entire new type 
of btree traversal. Anastasia Lubennikova suggested a tweak for the 
current method that may improve the performance for small groups of 
equal values. When we search for the next unique key, first check if it 
is contained on the current btree page using its 'high key'. If it is, 
find it on the current page. If not, search from the root of the tree 
like we do now. This should improve the performance for small equal 
groups, because there are going to be several such groups on the page. 
And this is exactly where we have the regression compared to unique + 
index scan.

By the way, what is the data for which we intend this feature to work? 
Obviously a non-unique btree index, but how wide are the tuples, and how 
big the equal groups? It would be good to have some real-world examples.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Ideriha, Takeshi"
Дата:
Сообщение: RE: Protect syscache from bloating with negative cache entries
Следующее
От: "Daniel Verite"
Дата:
Сообщение: Re: Doc patch on psql output formats