Обсуждение: BRIN indexes and ORDER BY

Поиск
Список
Период
Сортировка

BRIN indexes and ORDER BY

От
Darren Lafreniere
Дата:
Hello, 

We're curious about the current behavior in 9.5.4, and possible future enhancements, of BRIN indexes with respect to ordering.

In the docs, section 11.4. "Indexes and ORDER BY" (https://www.postgresql.org/docs/9.5/static/indexes-ordering.html) is clear that anything other than B-tree indexes have unspecified ordering:

"In addition to simply finding the rows to be returned by a query, an index may be able to deliver them in a specific sorted order. This allows a query's ORDER BY specification to be honored without a separate sorting step. Of the index types currently supported by PostgreSQL, only B-tree can produce sorted output — the other index types return matching rows in an unspecified, implementation-dependent order."

We found a pgsql-hackers thread from about a year ago about optimizing ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it: https://www.postgresql.org/message-id/11881.1443393360%40sss.pgh.pa.us


Our current test shows that ordering by a BRIN indexed column still performs an unoptimized sort:

SELECT generate_series(1, 10000000) AS id INTO test;
CREATE INDEX idx_test_id ON test USING BRIN (id);
EXPLAIN SELECT id FROM test ORDER BY id DESC LIMIT 20;

Limit  (cost=410344.40..410344.45 rows=20 width=4)
  ->  Sort  (cost=410344.40..435344.40 rows=1000000 width=4)"
        Sort Key: id DESC
        ->  Seq Scan on test  (cost=0.00..144248.00 rows=10000000 width=4)

Is there anything we're missing to speed this up? Or is it still a future feature?

Thank you,
Darren Lafreniere

Re: BRIN indexes and ORDER BY

От
Alvaro Herrera
Дата:
Darren Lafreniere wrote:

> "In addition to simply finding the rows to be returned by a query, an index
> may be able to deliver them in a specific sorted order. This allows a
> query's ORDER BY specification to be honored without a separate sorting
> step. Of the index types currently supported by PostgreSQL, only B-tree can
> produce sorted output — the other index types return matching rows in an
> unspecified, implementation-dependent order."
>
> We found a pgsql-hackers thread from about a year ago about optimizing
> ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
> https://www.postgresql.org/message-id/11881.1443393360%40sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes.  As far as I know, nobody has worked on that.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: BRIN indexes and ORDER BY

От
Darren Lafreniere
Дата:
Ahh, yes. I misread that. Thank you for the clarification.

On Wed, Oct 5, 2016 at 2:27 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Darren Lafreniere wrote:

> "In addition to simply finding the rows to be returned by a query, an index
> may be able to deliver them in a specific sorted order. This allows a
> query's ORDER BY specification to be honored without a separate sorting
> step. Of the index types currently supported by PostgreSQL, only B-tree can
> produce sorted output — the other index types return matching rows in an
> unspecified, implementation-dependent order."
>
> We found a pgsql-hackers thread from about a year ago about optimizing
> ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
> https://www.postgresql.org/message-id/11881.1443393360%40sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes.  As far as I know, nobody has worked on that.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: BRIN indexes and ORDER BY

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Darren Lafreniere wrote:
>> We found a pgsql-hackers thread from about a year ago about optimizing
>> ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
>> https://www.postgresql.org/message-id/11881.1443393360%40sss.pgh.pa.us

> Tom said he was working on some infrastructure planner changes
> ("upper-planner path-ification"), not that he was working on improving
> usage of BRIN indexes.  As far as I know, nobody has worked on that.

Alvaro's reading is correct; I wasn't planning to work on any such thing,
and still am not.

Looking again at the original thread:

> Gavin Wahl wrote:
>> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
>> just find the page range with the largest/smallest value, and then only
>> scan that one. Would that be hard to implement? I'm interested in working
>> on it if someone can give me some pointers.

I think this proposal is fairly broken anyway.  The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does.  It might even be completely
empty.  You could imagine an algorithm like this:

1. Find page-range with largest max.  Scan it to identify live row with
largest value.  If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at.  It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

            regards, tom lane


Re: BRIN indexes and ORDER BY

От
Darren Lafreniere
Дата:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Gavin Wahl wrote:
>> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
>> just find the page range with the largest/smallest value, and then only
>> scan that one. Would that be hard to implement? I'm interested in working
>> on it if someone can give me some pointers.

I think this proposal is fairly broken anyway.  The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does.  It might even be completely
empty.  You could imagine an algorithm like this:

1. Find page-range with largest max.  Scan it to identify live row with
largest value.  If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at.  It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

                        regards, tom lane


Thanks Tom,

A b-tree index would certainly be faster for ordering. But in scenarios where you have huge datasets that can't afford the space or update time required for b-tree, could such a BRIN-accelerated ordering algorithm at least be faster than ordering with no index?

Darren Lafreniere

Re: BRIN indexes and ORDER BY

От
Stephen Frost
Дата:
Darren,

* Darren Lafreniere (dlafreniere@onezero.com) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Gavin Wahl wrote:
> > >> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
> > >> just find the page range with the largest/smallest value, and then only
> > >> scan that one. Would that be hard to implement? I'm interested in
> > working
> > >> on it if someone can give me some pointers.
> >
> > I think this proposal is fairly broken anyway.  The page range with the
> > largest max-value may once have contained the largest live row, but
> > there's no guarantee that it still does.  It might even be completely
> > empty.  You could imagine an algorithm like this:
> >
> > 1. Find page-range with largest max.  Scan it to identify live row with
> > largest value.  If *no* live values, find page-range with next largest
> > max, repeat until no page ranges remain (whereupon return NULL).
> >
> > 2. For each remaining page-range whose indexed max exceeds the value
> > currently in hand, scan that page-range to see if any value exceeds
> > the one in hand, replacing the value if so.
> >
> > This'd probably allow you to omit scanning some of the page-ranges
> > in the table, but in a lot of cases you'd end up scanning many of them;
> > and you'd need a lot of working state to remember which ranges you'd
> > already looked at.  It'd certainly always be a lot more expensive than
> > answering the same question with a btree index, because in no case do
> > you get to avoid scanning the entire contents of the index.
[...]
> A b-tree index would certainly be faster for ordering. But in scenarios
> where you have huge datasets that can't afford the space or update time
> required for b-tree, could such a BRIN-accelerated ordering algorithm at
> least be faster than ordering with no index?

For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.

Certainly, using this would be much cheaper than a seqscan/top-N sort,
for small values of 'N', relative to the number of rows in the table,
in those cases.

In general, I like the idea of supporting this as BRIN indexes strike me
as very good for very large tables which have highly clumped data in
them and being able to do a top-N query on those can be very useful at
times.

Thanks!

Stephen

Вложения

Re: BRIN indexes and ORDER BY

От
Darren Lafreniere
Дата:
Stephen Frost <sfrost@snowman.net> wrote:
For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.
 
Thanks Stephen, this is exactly our use case. 

Re: BRIN indexes and ORDER BY

От
Merlin Moncure
Дата:
On Wed, Oct 5, 2016 at 3:27 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Darren,
>
> * Darren Lafreniere (dlafreniere@onezero.com) wrote:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > > Gavin Wahl wrote:
>> > >> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
>> > >> just find the page range with the largest/smallest value, and then only
>> > >> scan that one. Would that be hard to implement? I'm interested in
>> > working
>> > >> on it if someone can give me some pointers.
>> >
>> > I think this proposal is fairly broken anyway.  The page range with the
>> > largest max-value may once have contained the largest live row, but
>> > there's no guarantee that it still does.  It might even be completely
>> > empty.  You could imagine an algorithm like this:
>> >
>> > 1. Find page-range with largest max.  Scan it to identify live row with
>> > largest value.  If *no* live values, find page-range with next largest
>> > max, repeat until no page ranges remain (whereupon return NULL).
>> >
>> > 2. For each remaining page-range whose indexed max exceeds the value
>> > currently in hand, scan that page-range to see if any value exceeds
>> > the one in hand, replacing the value if so.
>> >
>> > This'd probably allow you to omit scanning some of the page-ranges
>> > in the table, but in a lot of cases you'd end up scanning many of them;
>> > and you'd need a lot of working state to remember which ranges you'd
>> > already looked at.  It'd certainly always be a lot more expensive than
>> > answering the same question with a btree index, because in no case do
>> > you get to avoid scanning the entire contents of the index.
> [...]
>> A b-tree index would certainly be faster for ordering. But in scenarios
>> where you have huge datasets that can't afford the space or update time
>> required for b-tree, could such a BRIN-accelerated ordering algorithm at
>> least be faster than ordering with no index?
>
> For at least some of the common BRIN use-cases, where the rows are
> inserted in-order and never/very-rarely modified or deleted, this
> approach would work very well.
>
> Certainly, using this would be much cheaper than a seqscan/top-N sort,
> for small values of 'N', relative to the number of rows in the table,
> in those cases.
>
> In general, I like the idea of supporting this as BRIN indexes strike me
> as very good for very large tables which have highly clumped data in
> them and being able to do a top-N query on those can be very useful at
> times.

Yeah.  If the brin average page overlap and % dead tuple coefficients
are low it absolutely makes sense to drive top N with brin.  It will
never beat a btree but typically brin is used when the btree index is
no good for various reasons.

brin indexes are pretty neat; they can provide stupefying amounts of
optimization in many common warehousing workloads.   They even beat
out index only scans for a tiny fraction of the storage.  Of course,
you have to work around the limitations... :-)

merlin