Re: Proposal: Global Index

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: Proposal: Global Index
Дата
Msg-id 20191219161207.GM30116@momjian.us
обсуждение исходный текст
Ответ на Re: Proposal: Global Index  (Jose Luis Tallon <jltallon@adv-solutions.net>)
Ответы Re: Proposal: Global Index  (Jeremy Schneider <schnjere@amazon.com>)
Список pgsql-hackers
On Thu, Dec 19, 2019 at 09:48:40AM +0100, Jose Luis Tallon wrote:
> On 19/12/19 4:03, Bruce Momjian wrote:
> > On Mon, Nov 25, 2019 at 03:44:39PM -0800, Jeremy Schneider wrote:
> > > On 11/25/19 15:05, Jeremy Schneider wrote:
> > > > ... the cost of doing the individual index lookups across 180
> > > > partitions (and 180 indexes) was very high, so they stored max and min
> > > > txn id per partition and would generate a query with all the dates that
> > > > a txn id could have been in so that only a small number of partition
> > > > indexes would be accessed.
> > > > 
> > > > .. If we are looking for higher concurrency, we can usually
> > > > add a hack/workaround that filters on a partition key to provide “pretty
> > > > good” pruning.  The net result is that you get 2-3x the IO due to the
> > > > lack of global index (same workaround as first story above).
> > > Is that basically like a global BRIN index with granularity at the
> > > partition level?
> > Exactly!  :-)
> 
> Actually, one "kind of" BRIN index *per partitioned table* mapping (key
> range) -> (partition oid)... and so concurrency doesn't need to be very
> affected.
> 
> (we don't need to do things just like other RDBMS do, ya know... ;)
> 
> 
> IIRC, this precise approach was suggested around 2016 when initially
> discussing the "declarative partitioning" which originated Postgres' current
> partitioning scheme, in order to optimize partition pruning.

Robert Haas identified two needs for global indexes:

    https://www.postgresql.org/message-id/CA+Tgmob_J2M2+QKWrhg2NjQEkMEwZNTfd7a6Ubg34fJuZPkN2g@mail.gmail.com
    
    One of the biggest reasons why people want it is to enforce uniqueness
    for secondary keys - e.g. the employees table is partitioned by
    employee ID, but SSN should also be unique, at least among employees
    for whom it's not NULL.
    
    But people also want it for faster data retrieval: if you're looking
    for a commonly-occurring value, an index per partition is fine. But if
    you're looking for values that occur only once or a few times across
    the whole hierarchy, an index scan per partition is very costly.

I don't see lossy BRIN indexes helping with the uniqueness use-case, and
I am not sure they would help with the rare case either.  They would
help for range-based partitions, but I thought our existing facilities
worked in that case.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Read Uncommitted
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: [PATCH] Remove twice assignment with var pageop (nbtree.c).