Обсуждение: BRIN indexes and ORDER BY

Поиск

Список

Период

Сортировка

BRIN indexes and ORDER BY

От

Darren Lafreniere

Дата:

05 октября 2016 г., 18:20:12

Hello,

We're curious about the current behavior in 9.5.4, and possible future enhancements, of BRIN indexes with respect to ordering.

In the docs, section 11.4. "Indexes and ORDER BY" (https://www.postgresql.org/docs/9.5/static/indexes-ordering.html) is clear that anything other than B-tree indexes have unspecified ordering:

"In addition to simply finding the rows to be returned by a query, an index may be able to deliver them in a specific sorted order. This allows a query's ORDER BY specification to be honored without a separate sorting step. Of the index types currently supported by PostgreSQL, only B-tree can produce sorted output — the other index types return matching rows in an unspecified, implementation-dependent order."

We found a pgsql-hackers thread from about a year ago about optimizing ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it: https://www.postgresql.org/message-id/11881.1443393360%40sss.pgh.pa.us

Our current test shows that ordering by a BRIN indexed column still performs an unoptimized sort:

SELECT generate_series(1, 10000000) AS id INTO test;

CREATE INDEX idx_test_id ON test USING BRIN (id);

EXPLAIN SELECT id FROM test ORDER BY id DESC LIMIT 20;

Limit (cost=410344.40..410344.45 rows=20 width=4)

-> Sort (cost=410344.40..435344.40 rows=1000000 width=4)"

Sort Key: id DESC

-> Seq Scan on test (cost=0.00..144248.00 rows=10000000 width=4)

Is there anything we're missing to speed this up? Or is it still a future feature?

Thank you,

Darren Lafreniere

Re: BRIN indexes and ORDER BY

От

Alvaro Herrera

Дата:

05 октября 2016 г., 18:28:00

Darren Lafreniere wrote:

> "In addition to simply finding the rows to be returned by a query, an index
> may be able to deliver them in a specific sorted order. This allows a
> query's ORDER BY specification to be honored without a separate sorting
> step. Of the index types currently supported by PostgreSQL, only B-tree can
> produce sorted output — the other index types return matching rows in an
> unspecified, implementation-dependent order."
>
> We found a pgsql-hackers thread from about a year ago about optimizing
> ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
> https://www.postgresql.org/message-id/11881.1443393360%40sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes.  As far as I know, nobody has worked on that.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: BRIN indexes and ORDER BY

От

Darren Lafreniere

Дата:

05 октября 2016 г., 18:30:39

Ahh, yes. I misread that. Thank you for the clarification.

On Wed, Oct 5, 2016 at 2:27 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Darren Lafreniere wrote:

> "In addition to simply finding the rows to be returned by a query, an index
> may be able to deliver them in a specific sorted order. This allows a
> query's ORDER BY specification to be honored without a separate sorting
> step. Of the index types currently supported by PostgreSQL, only B-tree can
> produce sorted output — the other index types return matching rows in an
> unspecified, implementation-dependent order."
>
> We found a pgsql-hackers thread from about a year ago about optimizing
> ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
> https://www.postgresql.org/message-id/11881.1443393360%40sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes. As far as I know, nobody has worked on that.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: BRIN indexes and ORDER BY

От

Tom Lane

Дата:

05 октября 2016 г., 19:01:07

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Darren Lafreniere wrote:
>> We found a pgsql-hackers thread from about a year ago about optimizing
>> ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
>> https://www.postgresql.org/message-id/11881.1443393360%40sss.pgh.pa.us

> Tom said he was working on some infrastructure planner changes
> ("upper-planner path-ification"), not that he was working on improving
> usage of BRIN indexes.  As far as I know, nobody has worked on that.

Alvaro's reading is correct; I wasn't planning to work on any such thing,
and still am not.

Looking again at the original thread:

> Gavin Wahl wrote:
>> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
>> just find the page range with the largest/smallest value, and then only
>> scan that one. Would that be hard to implement? I'm interested in working
>> on it if someone can give me some pointers.

I think this proposal is fairly broken anyway.  The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does.  It might even be completely
empty.  You could imagine an algorithm like this:

1. Find page-range with largest max.  Scan it to identify live row with
largest value.  If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at.  It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

            regards, tom lane

Re: BRIN indexes and ORDER BY

От

Darren Lafreniere

Дата:

05 октября 2016 г., 19:51:38

Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Gavin Wahl wrote:
>> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
>> just find the page range with the largest/smallest value, and then only
>> scan that one. Would that be hard to implement? I'm interested in working
>> on it if someone can give me some pointers.

I think this proposal is fairly broken anyway. The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does. It might even be completely
empty. You could imagine an algorithm like this:

1. Find page-range with largest max. Scan it to identify live row with
largest value. If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at. It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

regards, tom lane

Thanks Tom,

A b-tree index would certainly be faster for ordering. But in scenarios where you have huge datasets that can't afford the space or update time required for b-tree, could such a BRIN-accelerated ordering algorithm at least be faster than ordering with no index?

Darren Lafreniere

Re: BRIN indexes and ORDER BY

От

Stephen Frost

Дата:

05 октября 2016 г., 20:27:58

Darren,

* Darren Lafreniere (dlafreniere@onezero.com) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Gavin Wahl wrote:
> > >> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
> > >> just find the page range with the largest/smallest value, and then only
> > >> scan that one. Would that be hard to implement? I'm interested in
> > working
> > >> on it if someone can give me some pointers.
> >
> > I think this proposal is fairly broken anyway.  The page range with the
> > largest max-value may once have contained the largest live row, but
> > there's no guarantee that it still does.  It might even be completely
> > empty.  You could imagine an algorithm like this:
> >
> > 1. Find page-range with largest max.  Scan it to identify live row with
> > largest value.  If *no* live values, find page-range with next largest
> > max, repeat until no page ranges remain (whereupon return NULL).
> >
> > 2. For each remaining page-range whose indexed max exceeds the value
> > currently in hand, scan that page-range to see if any value exceeds
> > the one in hand, replacing the value if so.
> >
> > This'd probably allow you to omit scanning some of the page-ranges
> > in the table, but in a lot of cases you'd end up scanning many of them;
> > and you'd need a lot of working state to remember which ranges you'd
> > already looked at.  It'd certainly always be a lot more expensive than
> > answering the same question with a btree index, because in no case do
> > you get to avoid scanning the entire contents of the index.
[...]
> A b-tree index would certainly be faster for ordering. But in scenarios
> where you have huge datasets that can't afford the space or update time
> required for b-tree, could such a BRIN-accelerated ordering algorithm at
> least be faster than ordering with no index?

For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.

Certainly, using this would be much cheaper than a seqscan/top-N sort,
for small values of 'N', relative to the number of rows in the table,
in those cases.

In general, I like the idea of supporting this as BRIN indexes strike me
as very good for very large tables which have highly clumped data in
them and being able to do a top-N query on those can be very useful at
times.

Thanks!

Stephen

Вложения

signature.asc

Re: BRIN indexes and ORDER BY

От

Darren Lafreniere

Дата:

05 октября 2016 г., 20:44:55

Stephen Frost <sfrost@snowman.net> wrote:

For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.

Thanks Stephen, this is exactly our use case.

Re: BRIN indexes and ORDER BY

От

Merlin Moncure

Дата:

06 октября 2016 г., 20:12:12

On Wed, Oct 5, 2016 at 3:27 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Darren,
>
> * Darren Lafreniere (dlafreniere@onezero.com) wrote:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > > Gavin Wahl wrote:
>> > >> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
>> > >> just find the page range with the largest/smallest value, and then only
>> > >> scan that one. Would that be hard to implement? I'm interested in
>> > working
>> > >> on it if someone can give me some pointers.
>> >
>> > I think this proposal is fairly broken anyway.  The page range with the
>> > largest max-value may once have contained the largest live row, but
>> > there's no guarantee that it still does.  It might even be completely
>> > empty.  You could imagine an algorithm like this:
>> >
>> > 1. Find page-range with largest max.  Scan it to identify live row with
>> > largest value.  If *no* live values, find page-range with next largest
>> > max, repeat until no page ranges remain (whereupon return NULL).
>> >
>> > 2. For each remaining page-range whose indexed max exceeds the value
>> > currently in hand, scan that page-range to see if any value exceeds
>> > the one in hand, replacing the value if so.
>> >
>> > This'd probably allow you to omit scanning some of the page-ranges
>> > in the table, but in a lot of cases you'd end up scanning many of them;
>> > and you'd need a lot of working state to remember which ranges you'd
>> > already looked at.  It'd certainly always be a lot more expensive than
>> > answering the same question with a btree index, because in no case do
>> > you get to avoid scanning the entire contents of the index.
> [...]
>> A b-tree index would certainly be faster for ordering. But in scenarios
>> where you have huge datasets that can't afford the space or update time
>> required for b-tree, could such a BRIN-accelerated ordering algorithm at
>> least be faster than ordering with no index?
>
> For at least some of the common BRIN use-cases, where the rows are
> inserted in-order and never/very-rarely modified or deleted, this
> approach would work very well.
>
> Certainly, using this would be much cheaper than a seqscan/top-N sort,
> for small values of 'N', relative to the number of rows in the table,
> in those cases.
>
> In general, I like the idea of supporting this as BRIN indexes strike me
> as very good for very large tables which have highly clumped data in
> them and being able to do a top-N query on those can be very useful at
> times.

Yeah.  If the brin average page overlap and % dead tuple coefficients
are low it absolutely makes sense to drive top N with brin.  It will
never beat a btree but typically brin is used when the btree index is
no good for various reasons.

brin indexes are pretty neat; they can provide stupefying amounts of
optimization in many common warehousing workloads.   They even beat
out index only scans for a tiny fraction of the storage.  Of course,
you have to work around the limitations... :-)

merlin

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: BRIN indexes and ORDER BY

Вложения