Re: Optimize single tuple fetch from nbtree index

From: Floris Van Nee
Subject: Re: Optimize single tuple fetch from nbtree index
Date:
Msg-id: 1564784330839.3645@Optiver.com
In reply to: Re: Optimize single tuple fetch from nbtree index (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Hi Tom,

Thanks for your quick reply!

> Regardless of whether there's actually a LIMIT 1?  That seems disastrous
> for every other case than the narrow one where the optimization wins.
> Because every other case is now going to have to touch the index page
> twice.  That's more CPU and about double the contention --- if you could
> not measure any degradation from that, you're not measuring the right
> thing.

I thought the same at first. Note that, as mentioned, I did measure a degradation of 2-3% in some cases, but initially I also expected worse. Do you have any ideas about which cases would suffer the most? I thought the tight inner nested loop that I posted in my performance tests would have this index lookup as its bottleneck. I know it is the bottleneck for the LIMIT 1 query (because that improves by a factor of 2-3 with the patch). My theory is that the price paid for this optimization is highest for a LIMIT 3, because it touches the page twice and reads all items from it, while only returning three of them.
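To make the contrast concrete, this is roughly the query shape I have in mind (the prices table and its index are made up purely for illustration):

    -- Hypothetical timeseries table with a multi-column index.
    CREATE TABLE prices (
        item_id int,
        ts      timestamptz,
        price   numeric,
        PRIMARY KEY (item_id, ts)
    );

    -- Best case for the patch: only the first matching index tuple is needed,
    -- so the scan can stop after reading a single item from the leaf page.
    SELECT * FROM prices
    WHERE item_id = 42 AND ts <= '2019-08-01 12:00'
    ORDER BY ts DESC
    LIMIT 1;

    -- Likely worst case: the leaf page is visited twice and all its items are
    -- read, while only three rows are actually returned.
    SELECT * FROM prices
    WHERE item_id = 42 AND ts <= '2019-08-01 12:00'
    ORDER BY ts DESC
    LIMIT 3;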

> In principle, you could pass down knowledge of whether there's a LIMIT,
> using the same mechanism used to enable top-N sorting.  But it'd have to
> also go through the AM interface layer, so I'm not sure how messy this
> would be.

This was an idea I had as well, and I would be willing to implement such a thing if the community deems it interesting enough. However, I didn't want to do this for the first version of the patch, as it would be quite a bit of extra work, which would be useless if the idea of the patch itself gets rejected anyway. :-) I'd appreciate any pointers in the right direction - I can take a look at how top-N sorting pushes the LIMIT down. Given enough interest in the basic idea of this patch, I will implement it.
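For context, the existing behaviour I would try to mirror is the bounded sort: a LIMIT node already tells the Sort node below it how many tuples are needed, so it can do a top-N heapsort instead of a full sort. A minimal example, reusing the hypothetical prices table from above:

    -- The executor passes the bound (10) down to the Sort node, which then
    -- only keeps the top 10 tuples ("top-N heapsort" in EXPLAIN ANALYZE).
    -- The idea would be to pass a similar hint down through the index AM
    -- interface.
    SELECT * FROM prices
    ORDER BY price DESC
    LIMIT 10;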

>> This calls _bt_first in nbtsearch.c, which will, if there are scankeys, descend the tree to a leaf page and read
>> just the first (or possibly two) tuples. It won't touch the rest of the page yet. If indeed just one tuple was
>> required, there won't be a call to _bt_next and we're done. If we do need more than one tuple, _bt_next will resume
>> reading tuples from the index page at the point where we left off.

> How do you know how many index entries you have to fetch to get a tuple
> that's live/visible to the query?

Indeed, we don't know that - that's why this initial patch does not make any assumptions about it and just assumes the good-weather scenario in which everything is visible. I'm not sure whether it's possible to give an estimate of this, or whether that would be useful. Currently, if it turns out that the tuple is not visible, there will just be another call to _bt_next, which resumes reading the page as normal. I'm open to implementing any suggestions that may improve this.

>> - We need to take into account page splits, insertions and vacuums while we do not have the read-lock in between
>> _bt_first and the first call to _bt_next. This made my patch quite a bit more complicated than my initial
>> implementation.

> Meh.  I think the odds that you got this 100% right are small, and the
> odds that it would be maintainable are smaller.  There's too much that
> can happen if you're not holding any lock --- and there's a lot of active
> work on btree indexes, which could break whatever assumptions you might
> make today.

Agreed, which is also why I posted this initial version of the patch here already: to get some input from the experts on this topic about which assumptions can be made now and in the future. If it turns out that an optimization like this is completely infeasible because of other constraints in the btree implementation, then we're done pretty quickly here. :-) For what it's worth, the patch at least passes make check consistently - I caught a lot of these edge cases related to page splits and insertions while running the regression tests, which exercise the modified bits of code quite often and in parallel. There may be plenty of edge cases left, however...

> I'm not unalterably opposed to doing something like this, but my sense
> is that the complexity and probable negative performance impact on other
> cases are not going to look like a good trade-off for optimizing the
> case at hand.

I do think it could be a big win if we could get something like this working. Cases with a LIMIT seem common enough to me to justify adding some extra optimization, especially if it can lead to 2-3x the TPS for these kinds of queries. However, it indeed needs to stay within a reasonable complexity. If it turns out that we need to add a lot of extra complexity in order to optimize this, it may not be worth it.

> BTW, you haven't even really made the case that optimizing a query that
> behaves this way is the right thing to be doing ... maybe some other
> plan shape that isn't a nestloop around a LIMIT query would be a better
> solution.

Unfortunately, it is pretty difficult to come up with any faster plan than this. We have a large database with many tables containing timeseries data, and when tables get large, there's an efficient multi-column index, and you want to do these kinds of time-based or item-based lookups, the nested loop is generally the fastest option. I can elaborate more on this, but I'm not sure this thread is the best place for that.
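To illustrate the plan shape I mean (again using the made-up prices table, plus a hypothetical items table): an "as of" lookup of the latest row per item, which becomes a nested loop with a LIMIT 1 index scan on the inner side:

    -- Latest price per item as of a given time.  For each row of items, the
    -- LATERAL subquery is an index scan on (item_id, ts) that only needs the
    -- first matching tuple - exactly the case the patch tries to speed up.
    SELECT i.item_id, p.ts, p.price
    FROM items i
    CROSS JOIN LATERAL (
        SELECT ts, price
        FROM prices
        WHERE item_id = i.item_id AND ts <= '2019-08-01 12:00'
        ORDER BY ts DESC
        LIMIT 1
    ) p;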

I appreciate you taking the time to take a look at this. I'd be happy to look more into any suggestions you come up with. Working on this has taught me a lot about the internals of Postgres already - I find it really interesting!

-Floris


