Re: PG 13 release notes, first draft

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: PG 13 release notes, first draft
Дата
Msg-id 20200512001401.GG4666@momjian.us
обсуждение исходный текст
Ответ на Re: PG 13 release notes, first draft  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: PG 13 release notes, first draft  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On Mon, May 11, 2020 at 05:05:29PM -0700, Peter Geoghegan wrote:
> On Mon, May 11, 2020 at 4:10 PM Bruce Momjian <bruce@momjian.us> wrote:
> > > think that you should point out that deduplication works by storing
> > > the duplicates in the obvious way: Only storing the key once per
> > > distinct value (or once per distinct combination of values in the case
> > > of multi-column indexes), followed by an array of TIDs (i.e. a posting
> > > list). Each TID points to a separate row in the table.
> >
> > These are not details that should be in the release notes since the
> > internal representation is not important for its use.
> 
> I am not concerned about describing the specifics of the on-disk
> representation, and I don't feel too strongly about the storage
> parameter (leave it out). I only ask that the wording convey the fact
> that the deduplication feature is not just a quantitative improvement
> -- it's a qualitative behavioral change, that will help data
> warehousing in particular. This wasn't the case with the v12 work on
> B-Tree duplicates (as I said last year, I thought of the v12 stuff as
> fixing a problem more than an enhancement).
> 
> With the deduplication feature added to Postgres v13, the B-Tree code
> can now gracefully deal with low cardinality data by compressing the
> duplicates as needed. This is comparable to bitmap indexes in
> proprietary database systems, but without most of their disadvantages
> (in particular around heavyweight locking, deadlocks that abort
> transactions, etc). It's easy to imagine this making a big difference
> with analytics workloads. The v12 work made indexes with lots of
> duplicates 15%-30% smaller (compared to v11), but the v13 work can
> make them 60% - 80% smaller in many common cases (compared to v12). In
> extreme cases indexes might even be ~12x smaller (though that will be
> rare).

Agreed.  How is this?

    This allows efficient btree indexing of low cardinality columns.
    Users upgrading with pg_upgrade will need to use REINDEX to make use of
    this feature.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "David G. Johnston"
Дата:
Сообщение: Event trigger code comment duplication
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: PG 13 release notes, first draft