Re: [RFC] Minmax indexes

From Simon Riggs
Subject Re: [RFC] Minmax indexes
Date
Msg-id CA+U5nMKL2h6-fXHTJix_YEktFKjDOXOTnD5=UtDF8qSoVpqmzQ@mail.gmail.com
In reply to Re: [RFC] Minmax indexes  (Josh Berkus <josh@agliodbs.com>)
Replies Re: [RFC] Minmax indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 15 June 2013 00:01, Josh Berkus <josh@agliodbs.com> wrote:
> Alvaro,
>
> This sounds really interesting, and I can see the possibilities.
> However ...
>
>> Value changes in columns that are part of a minmax index, and tuple insertion
>> in summarized pages, would invalidate the stored min/max values.  To support
>> this, each minmax index has a validity map; a range can only be considered in a
>> scan if it hasn't been invalidated by such changes (A range "not considered" in
>> the scan needs to be returned in whole regardless of the stored min/max values,
>> that is, it cannot be pruned per query quals).  The validity map is very
>> similar to the visibility map in terms of performance characteristics: quick
>> enough that it's not contentious, allowing updates and insertions to proceed
>> even when data values violate the minmax index conditions.  An invalidated
>> range can be made valid by re-summarization (see below).
>
> This begins to sound like these indexes are only useful on append-only
> tables.  Not that there aren't plenty of those, but ...

The index is basically using the same mechanism as index-only scans.
The "only useful on append-only tables" comment would apply equally to
index-only scans, so I can't see a reason to raise it specifically for
this index type.
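
To make the quoted validity-map rule concrete, here is a minimal sketch of range pruning. It is only an illustration of the description above, not code from any patch; `RangeSummary` and `ranges_to_scan` are hypothetical names, and the column is assumed to be a plain integer.

```python
from dataclasses import dataclass


@dataclass
class RangeSummary:
    lo: int      # stored minimum for the page range
    hi: int      # stored maximum for the page range
    valid: bool  # validity-map bit; False once an insert/update invalidated it


def ranges_to_scan(summaries, qlo, qhi):
    """Return the indexes of page ranges that must be read for qlo <= x <= qhi."""
    selected = []
    for i, s in enumerate(summaries):
        if not s.valid:
            # Invalidated range: the stored min/max cannot be trusted,
            # so the range is returned in whole and never pruned.
            selected.append(i)
        elif s.lo <= qhi and qlo <= s.hi:
            # Valid range whose [lo, hi] interval overlaps the query interval.
            selected.append(i)
    return selected


summaries = [
    RangeSummary(1, 100, True),
    RangeSummary(101, 200, True),
    RangeSummary(201, 300, False),
]
```

With these summaries, a query for values between 50 and 60 still has to read the third range, because its validity bit is cleared even though its stored bounds do not overlap the query.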


>> Re-summarization is relatively expensive, because the complete page range has
>> to be scanned.
>
> Why?  Why can't we just update the affected pages in the index?

Again, same thing as index-only scans. For IOS, we reset the
visibility info at vacuum. The route proposed here follows exactly the
same timing, same mechanism. I can't see a reason for any difference
between the two.
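
A rough sketch of what re-summarizing one range involves, assuming a page range is modelled as a list of pages and each page as a list of integer column values. `resummarize` is a hypothetical helper, not a function from the proposal; the point is only that every page in the range is read to recompute the bounds.

```python
def resummarize(pages):
    """Recompute (min, max) for one page range by reading every page in it."""
    values = [v for page in pages for v in page]
    if not values:
        return None  # empty range: nothing to summarize
    return (min(values), max(values))


# Three pages forming one page range.
page_range = [[5, 9, 7], [12, 3], [8]]
summary = resummarize(page_range)
```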


>>  To avoid this, a table having a minmax index would be
>> configured so that inserts only go to the page(s) at the end of the table; this
>> avoids frequent invalidation of ranges in the middle of the table.  We provide
>> a table reloption that tweaks the FSM behavior, so that summarized pages are
>> not candidates for insertion.
>
> We haven't had an index type which modifies table insertion behavior
> before, and I'm not keen to start now; imagine having two indexes on the
> same table each with their own, conflicting, requirements.  This is
> sounding a lot more like a candidate for our prospective pluggable
> storage manager.  Also, the above doesn't help us at all with UPDATEs.
>
> If we're going to start adding reloptions for specific table behavior,
> I'd rather think of all of the optimizations we might have for a
> prospective "append-only table" and bundle those, rather than tying it
> to whether a certain index exists or not.

I agree that the FSM behaviour shouldn't be linked to index existence.

IMHO that should be a separate table parameter, WITH (fsm_mode = append).

Index only scans would also benefit from that.
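
As an illustration of what such an append mode could mean for page selection, here is a small sketch. Everything in it is an assumption: `fsm_mode` is only the parameter name suggested above, and `choose_insert_page`, its arguments, and the notion of a single "summarized up to" boundary are hypothetical simplifications.

```python
def choose_insert_page(free_space, summarized_upto, needed, append_mode):
    """Pick a page number for a new tuple, or None to extend the table.

    free_space      -- free bytes per page, indexed by page number
    summarized_upto -- number of leading pages already covered by summaries
    needed          -- bytes required by the new tuple
    append_mode     -- if True, summarized pages are not insertion candidates
    """
    start = summarized_upto if append_mode else 0
    for page in range(start, len(free_space)):
        if free_space[page] >= needed:
            return page
    return None  # no suitable page: caller extends the table at the end


free = [500, 0, 300, 50]
```

In normal mode the first page with enough room is reused, even in the middle of the table. In append mode the same request skips the summarized pages and falls through to extending the table.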


> Also, I hate the name ... if this feature goes ahead, I'm going to be
> lobbying to change it.  But that's pretty minor compared to the update
> issues.

This feature has already had 3 different names. I don't think the name
is crucial, but it makes sense to give it a name up front. So if you
want to lobby for that then you'd need to come up with a name soon, so
poor Alvaro can cope with name #4.

(There's no consistency in naming from any other implementation either).

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


