Re: Index Skip Scan

From: Floris Van Nee
Subject: Re: Index Skip Scan
Date:
Msg-id: 1559361698401.22864@Optiver.com
In reply to: Re: Index Skip Scan  (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses: Re: Index Skip Scan  (Floris Van Nee <florisvannee@Optiver.com>)
List: pgsql-hackers

After some talks with Jesper at PGCon about the Index Skip Scan, I started testing this patch, because it seems to have great potential in speeding up many of our queries (great conference by the way, really enjoyed my first time being there!). I haven't looked in depth at the code itself, but focused on some testing with real data that we have.

Let me start by sketching our primary use case for this, as it is similar to, but slightly different from, what was discussed earlier in this thread. I think this use case is something a lot of people who handle timeseries data have. Our database has many tables with timeseries data. We don't update rows; we just insert new rows each time. One example of this would be a table with prices for instruments. Instruments are identified by a column called feedcode. The price of an instrument updates with a certain frequency, and each time it does, we insert a new row with the new value and the timestamp at that time. So in its simplest form, you could see it as a table like this:


create table prices (feedcode text, updated_at timestamptz, value float8); -- there'll be some other columns as well, this is just an example

create unique index on prices (feedcode, updated_at desc);


This table perfectly fits the criteria for the index skip scan as there are relatively few distinct feedcodes, but each of them has quite a lot of price inserts (imagine 1000 distinct feedcodes, each of them having one price per second). We normally partition our data by a certain time interval, so let's say we're just looking at one day of prices here. We have other cases with higher update frequencies and/or more distinct values though.


Typical queries on this table involve querying the price at a certain point in time, or simply querying the latest update. If we know the feedcode, this is easy:

select * from prices where feedcode='A' and updated_at <= '2019-06-01 12:00' order by feedcode, updated_at desc limit 1


Unfortunately, this gets hard if you want to know the price of everything at a certain point in time. The query then becomes:

select distinct on (feedcode) * from prices where updated_at <= '2019-06-01 12:00' order by feedcode, updated_at desc

Up until now (even with this patch), this uses a regular index scan plus a Unique node, which scans the full index. That is terribly slow, and it is also not constant: as the table grows, it becomes slower and slower.


Obviously there are currently already ways to speed this up using the recursive loose index scan, but I think everybody agrees that those are pretty unreadable. However, since they're also several orders of magnitude faster, we actually use them everywhere. E.g.:

-- certain point in time

-- first query *all* distinct feedcodes (disregarding time), then do an index scan for every feedcode found to see if it has an update in the time window we're interested in

-- this essentially means we're doing 2*N index scans

with recursive t as (
   select feedcode from prices order by feedcode, updated_at desc limit 1
   union all 
   select n.feedcode from t
   cross join lateral (select feedcode from prices where feedcode > t.feedcode order by feedcode, updated_at desc limit 1) n
) select n.* from t
  cross join lateral (select * from prices where feedcode=t.feedcode and updated_at <= '2019-06-01 12:00' order by feedcode, updated_at desc limit 1) n
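To make the 2*N pattern concrete, here is a toy simulation (plain Python, not PostgreSQL internals; the sorted-list-plus-bisect model and all names and numbers are illustrative assumptions). The index on (feedcode, updated_at desc) is modeled as a list sorted on (feedcode, -updated_at), and each bisect call stands in for one B-tree descent:

```python
import bisect

# Toy model of the unique index on (feedcode, updated_at desc): a list sorted
# on (feedcode, -updated_at), so each bisect call stands in for one B-tree
# descent.  All names and numbers are illustrative.
rows = sorted((fc, -ts) for fc in "ABCD" for ts in range(100))

def recursive_cte_lookups(rows, cutoff):
    lookups, feedcodes = 0, []
    key = ""                      # start before the smallest feedcode
    while True:
        # first round: one descent per feedcode to find the next distinct key
        i = bisect.bisect_right(rows, (key, float("inf")))
        lookups += 1
        if i == len(rows):
            break
        key = rows[i][0]
        feedcodes.append(key)
    results = []
    for fc in feedcodes:
        # second round: descend to the first entry with updated_at <= cutoff
        i = bisect.bisect_left(rows, (fc, -cutoff))
        lookups += 1
        if i < len(rows) and rows[i][0] == fc:
            results.append((fc, -rows[i][1]))
    return results, lookups

results, lookups = recursive_cte_lookups(rows, 50)
print(lookups)  # 2*N + 1 descents for N = 4 distinct feedcodes -> 9
```

Note how each second-round descent lands directly on the correct tuple, which is exactly the guarantee the 2*N method buys at the cost of the extra round of scans.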

-- just latest value
-- if we're interested in just the latest value, it can actually be optimized to just N index scans like this.
-- to make it even more confusing, there's a tradeoff here: if you're querying a timestamp close to the latest available timestamp, it is often faster to use this method anyway and just put the filter on updated_at inside this query. this avoids the overhead of 2*N index scans, at the expense that the LIMIT 1 may have to scan several tuples before finding one that matches the timestamp criteria. with the 2*N method above we're guaranteed that the first tuple it sees is the correct tuple, but we're doing many more scans...
with recursive t as (
   select * from prices order by feedcode, updated_at desc limit 1
   union all 
   select n.* from t
   cross join lateral (select * from prices where feedcode > t.feedcode order by feedcode, updated_at desc limit 1) _
) select * from t
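The latest-value variant can be sketched the same way (again a plain-Python toy model, with a sorted list standing in for the B-tree; everything here is illustrative, not the patch's implementation). Because the first entry for each feedcode is already the latest row, the second round of descents disappears and only N+1 lookups remain:

```python
import bisect

# Toy index on (feedcode, updated_at desc), modeled as a sorted list of
# (feedcode, -updated_at); each bisect call stands in for one B-tree descent.
rows = sorted((fc, -ts) for fc in "ABCD" for ts in range(100))

def latest_per_feedcode(rows):
    lookups, out = 0, []
    key = ""                      # start before the smallest feedcode
    while True:
        # jump past all entries of the previous feedcode in one descent
        i = bisect.bisect_right(rows, (key, float("inf")))
        lookups += 1
        if i == len(rows):
            break
        key = rows[i][0]
        out.append((key, -rows[i][1]))  # first entry per feedcode = latest
    return out, lookups

out, lookups = latest_per_feedcode(rows)
print(lookups)  # N + 1 = 5 descents for N = 4 distinct feedcodes
```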


I hope this makes our current situation clear. Please do ask if I need to elaborate on something here.


So what changes with this patch? The great thing is that the recursive CTE is not required anymore! This is a big win for readability and it helps performance as well. It makes everything much better. I am really happy with these results. If the index skip scan triggers, it is easily over 100x faster than the naive 'distinct on' query in earlier versions of Postgres. It is also quite a bit faster than the recursive CTE version of the query.


I have a few remarks though. I tested some of our queries with the patch and found that the following query would (with the patch) work best now for arbitrary timestamps:

-- first query all distinct values using the index skip scan, then do an index scan for each of these

select r.* from (
   select distinct feedcode from prices
) k
cross join lateral (
   select *
   from prices
   where feedcode=k.feedcode and updated_at <= '2019-06-01 12:00'
   order by feedcode, updated_at desc
   limit 1
) r


While certainly a big improvement over the recursive CTE, it would be nice if the even simpler form with the 'distinct on' would work out of the box using an index skip scan.

select distinct on (feedcode) * from prices where updated_at <= '2019-06-01 12:00' order by feedcode, updated_at desc


As far as I can see, there are two main problems with that at the moment.

1) There is only support for index-only scans at the moment, not for regular index scans. This was already mentioned upthread, and I can understand that it was left out until now to constrain the scope. However, if we want to support 'distinct on' plus selecting columns that are not part of the index, we need a regular index scan instead of an index-only scan.

2) The complicating part is that we're interested in the value 'at a specific point in time'. This comparison on updated_at ruins the efficiency: the index scan first looks at the *latest* updated_at for a certain feedcode and then walks the tree until it finds a tuple that matches the updated_at criterion (which possibly never happens, in which case it will happily walk over the full index). I'm actually unsure if there is anything we can do about this (without adding a lot of complexity), aside from rewriting the query itself the way I did: doing the skip scan for all distinct items, followed by a second round of index scans for the specific point in time. I'm struggling a bit to explain this part clearly - I hope it comes across. Please let me know if I should elaborate. Perhaps it's easiest to see from the difference in speed between the following two queries:

select distinct feedcode from prices -- approx 10ms

select distinct feedcode from prices where updated_at <= '1999-01-01 00:00' -- approx 200ms

Both use the index skip scan, but the first one is very fast, because it can skip large parts of the index. The second one scans the full index, because it never finds any row that matches the where condition so it can never skip anything.
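The behaviour of those two queries can be mimicked with another toy model (plain Python; the sorted-list index and all numbers are illustrative assumptions, not the patch's actual implementation): after skipping to the latest entry of a feedcode, the scan walks forward until the updated_at filter matches, and if it never matches it ends up visiting every tuple:

```python
import bisect

# Toy index on (feedcode, updated_at desc), modeled as a sorted list of
# (feedcode, -updated_at).  For each feedcode the scan lands on the *latest*
# entry first, then walks forward until updated_at <= cutoff holds.
rows = sorted((fc, -ts) for fc in "ABCD" for ts in range(1000, 1100))

def distinct_with_filter(rows, cutoff):
    visited, out, i = 0, [], 0
    while i < len(rows):
        fc = rows[i][0]
        j = i
        # walk forward past tuples that are newer than the cutoff
        while j < len(rows) and rows[j][0] == fc and -rows[j][1] > cutoff:
            j += 1
            visited += 1
        if j < len(rows) and rows[j][0] == fc:
            out.append(fc)        # found a matching tuple for this feedcode
            visited += 1
        # skip to the next feedcode with a single "descent"
        i = bisect.bisect_right(rows, (fc, float("inf")))
    return out, visited

# cutoff inside the data: one tuple visited per feedcode -> fast
fast, v_fast = distinct_with_filter(rows, 1099)
# cutoff before all data: every tuple visited, nothing returned -> full scan
slow, v_slow = distinct_with_filter(rows, 0)
print(v_fast, v_slow)  # 4 vs 400 tuples visited
```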


Thanks for this great patch. It is already very useful and fills a gap that has existed for a long time. It is going to make our queries so much more readable and performant now that we won't have to resort to recursive CTEs anymore!


-Floris


In the pgsql-hackers list, by date:

Previous
From: Jeremy Finzel
Date:
Message: Docs for refresh materialized view concurrently
Next
From: Floris Van Nee
Date:
Message: Re: Index Skip Scan