Re: BRIN indexes and ORDER BY

Поиск

Список

Период

Сортировка

От	Merlin Moncure
Тема	Re: BRIN indexes and ORDER BY
Дата	6 октября 2016 г. 23:12:12
Msg-id	CAHyXU0xba4bnShOzTWMxdT=ZkN6dGB+K3iVKtSCB8q_erYJtpg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: BRIN indexes and ORDER BY (Stephen Frost <sfrost@snowman.net>)
Список	pgsql-general

Дерево обсуждения

On Wed, Oct 5, 2016 at 3:27 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Darren,
>
> * Darren Lafreniere (dlafreniere@onezero.com) wrote:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > > Gavin Wahl wrote:
>> > >> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
>> > >> just find the page range with the largest/smallest value, and then only
>> > >> scan that one. Would that be hard to implement? I'm interested in
>> > working
>> > >> on it if someone can give me some pointers.
>> >
>> > I think this proposal is fairly broken anyway.  The page range with the
>> > largest max-value may once have contained the largest live row, but
>> > there's no guarantee that it still does.  It might even be completely
>> > empty.  You could imagine an algorithm like this:
>> >
>> > 1. Find page-range with largest max.  Scan it to identify live row with
>> > largest value.  If *no* live values, find page-range with next largest
>> > max, repeat until no page ranges remain (whereupon return NULL).
>> >
>> > 2. For each remaining page-range whose indexed max exceeds the value
>> > currently in hand, scan that page-range to see if any value exceeds
>> > the one in hand, replacing the value if so.
>> >
>> > This'd probably allow you to omit scanning some of the page-ranges
>> > in the table, but in a lot of cases you'd end up scanning many of them;
>> > and you'd need a lot of working state to remember which ranges you'd
>> > already looked at.  It'd certainly always be a lot more expensive than
>> > answering the same question with a btree index, because in no case do
>> > you get to avoid scanning the entire contents of the index.
> [...]
>> A b-tree index would certainly be faster for ordering. But in scenarios
>> where you have huge datasets that can't afford the space or update time
>> required for b-tree, could such a BRIN-accelerated ordering algorithm at
>> least be faster than ordering with no index?
>
> For at least some of the common BRIN use-cases, where the rows are
> inserted in-order and never/very-rarely modified or deleted, this
> approach would work very well.
>
> Certainly, using this would be much cheaper than a seqscan/top-N sort,
> for small values of 'N', relative to the number of rows in the table,
> in those cases.
>
> In general, I like the idea of supporting this as BRIN indexes strike me
> as very good for very large tables which have highly clumped data in
> them and being able to do a top-N query on those can be very useful at
> times.

Yeah.  If the brin average page overlap and % dead tuple coefficients
are low it absolutely makes sense to drive top N with brin.  It will
never beat a btree but typically brin is used when the btree index is
no good for various reasons.

brin indexes are pretty neat; they can provide stupefying amounts of
optimization in many common warehousing workloads.   They even beat
out index only scans for a tiny fraction of the storage.  Of course,
you have to work around the limitations... :-)

merlin

В списке pgsql-general по дате отправления:

Предыдущее

От: Andrew Borodin
Дата: 06 октября 2016 г., 21:30:22
Сообщение: Re: Online course for those who want tot contribute

Следующее

От: Yogesh Sharma
Дата: 07 октября 2016 г., 08:05:36
Сообщение: postgresql-8.1.18-2.1 information

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: BRIN indexes and ORDER BY

Предыдущее

Следующее