Re: Minmax indexes

From: Amit Kapila
Subject: Re: Minmax indexes
Date:
Msg-id: CAA4eK1+YFF0wskPQhYs2kLW0abJMmFUhOotTcS9VgCkLQ_WgBA@mail.gmail.com
In reply to: Re: Minmax indexes  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Fri, Sep 27, 2013 at 11:49 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Sep 26, 2013 at 1:46 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> Amit Kapila wrote:
>>> On Sun, Sep 15, 2013 at 5:44 AM, Alvaro Herrera
>>> <alvherre@2ndquadrant.com> wrote:
>>
>
>>
>>>   On Windows, patch gives below compilation errors:
>>>   src\backend\access\minmax\mmtuple.c(96): error C2057: expected
>>> constant expression
>>
>> I have fixed all these compile errors (fix attached).  Thanks for
>> reporting them.  I'll post a new version shortly.
>
>    Thanks for fixing it. Over the last few days I have spent some time
> reading about minmax (or equivalent) indexes in other databases (Netezza
> and Oracle) and going through parts of your proposal. It's a fairly big
> patch and needs much more time, but I would like to share the
> findings/thoughts I have developed so far.
>
> Firstly, about the interface and use case: as far as I could tell,
> other databases provide this index automatically rather than through a
> separate CREATE INDEX command, perhaps because such an index is mainly
> useful when the data is ordered, or when it's distributed in such a
> way that it helps repeatedly executed queries.
> You have proposed it as a command, which means the user needs to take
> care of it; I find that okay for a first version, and later maybe we
> can add some optimisations so that it gets created automatically.
> For the page range, if I read correctly, you currently use a #define;
> do you want to expose it to the user in some way, like a GUC, or
> maintain it internally and assign the right value based on the
> performance of different queries?
>
> Operations on this index seem to be very fast: Oracle has this as an
> in-memory structure, and I read that in Netezza write operations don't
> carry any significant overhead for zone maps compared to other
> indexes, so shouldn't we consider leaving it un-WAL-logged?
> OTOH, because these structures get created automatically in those
> databases it might be okay there, but if we provide it as a command,
> the user might be bothered if he doesn't find it automatically after a
> server restart.
>
> Few Questions and observations:
> 1.
> + When a new heap tuple is inserted in a summarized page range, it is
> possible to
> + compare the existing index tuple with the new heap tuple.  If the
> heap tuple is
> + outside the minimum/maximum boundaries given by the index tuple for
> any indexed
> + column (or if the new heap tuple contains null values but the index tuple
> + indicate there are no nulls), it is necessary to create a new index tuple with
> + the new values.  To do this, a new index tuple is inserted, and the
> reverse range
> + map is updated to point to it.  The old index tuple is left in
> place, for later
> + garbage collection.
>
>
> Is there a reason why we can't directly update the value rather than
> inserting a new index tuple? As I understand it, for other indexes
> like btree we do this because we might need to roll back, but here
> even if a rollback happens after updating the min or max value, it
> will not cause any harm (tuple loss).
>
> 2.
> + If the reverse range map points to an invalid TID, the corresponding
> page range
> + is not summarized.
>
> 3.
> It might be better if you mention when the range map will point to an
> invalid TID; it's not explained in your proposal, but you have used it
> in your proposal to explain some other things.
>
> 4.
> "Range reverse map" is good terminology, but isn't "Range translation
> map" better? I don't mind either way; it's just a thought that came to
> mind while understanding the concept of the range reverse map.
>
> 5.
> /*
>  * As above, except that instead of scanning the complete heap, only the given
>  * range is scanned.  Scan to end-of-rel can be signalled by passing
>  * InvalidBlockNumber as end block number.
>  */
> double
> IndexBuildHeapRangeScan(Relation heapRelation,
>                         Relation indexRelation,
>                         IndexInfo *indexInfo,
>                         bool allow_sync,
>                         BlockNumber start_blockno,
>                         BlockNumber numblocks,
>                         IndexBuildCallback callback,
>                         void *callback_state)
>
> In the comments you have used "end block number"; which parameter does
> it refer to? I can see only start_blockno and numblocks.
>
> 6.
> Currently you are passing 0 as the start block and InvalidBlockNumber
> as the number of blocks; what's the logic for that?
> return IndexBuildHeapRangeScan(heapRelation, indexRelation,
>                                indexInfo, allow_sync,
>                                0, InvalidBlockNumber,
>                                callback, callback_state);

I got it; I think it means scan all the pages here.

> 7.
> In mmbuildCallback, a tuple is added to the minmax index only if it
> falls within the current page range; otherwise this can waste a big
> scan in case the page range is large (1280 pages, as you mentioned in
> one of your mails). Why can't we include it at the end of the scan?
>
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com


