Re: PG 13 release notes, first draft

From	Bruce Momjian
Subject	Re: PG 13 release notes, first draft
Date	12 May 2020, 03:14:01
Msg-id	20200512001401.GG4666@momjian.us
In reply to	Re: PG 13 release notes, first draft (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: PG 13 release notes, first draft
List	pgsql-hackers

On Mon, May 11, 2020 at 05:05:29PM -0700, Peter Geoghegan wrote:
> On Mon, May 11, 2020 at 4:10 PM Bruce Momjian <bruce@momjian.us> wrote:
> > > think that you should point out that deduplication works by storing
> > > the duplicates in the obvious way: Only storing the key once per
> > > distinct value (or once per distinct combination of values in the case
> > > of multi-column indexes), followed by an array of TIDs (i.e. a posting
> > > list). Each TID points to a separate row in the table.
> >
> > These are not details that should be in the release notes since the
> > internal representation is not important for its use.
> 
> I am not concerned about describing the specifics of the on-disk
> representation, and I don't feel too strongly about the storage
> parameter (leave it out). I only ask that the wording convey the fact
> that the deduplication feature is not just a quantitative improvement
> -- it's a qualitative behavioral change, that will help data
> warehousing in particular. This wasn't the case with the v12 work on
> B-Tree duplicates (as I said last year, I thought of the v12 stuff as
> fixing a problem more than an enhancement).
> 
> With the deduplication feature added to Postgres v13, the B-Tree code
> can now gracefully deal with low cardinality data by compressing the
> duplicates as needed. This is comparable to bitmap indexes in
> proprietary database systems, but without most of their disadvantages
> (in particular around heavyweight locking, deadlocks that abort
> transactions, etc). It's easy to imagine this making a big difference
> with analytics workloads. The v12 work made indexes with lots of
> duplicates 15%-30% smaller (compared to v11), but the v13 work can
> make them 60% - 80% smaller in many common cases (compared to v12). In
> extreme cases indexes might even be ~12x smaller (though that will be
> rare).

Agreed.  How is this?

    This allows efficient btree indexing of low cardinality columns.
    Users upgrading with pg_upgrade will need to use REINDEX to make use of
    this feature.
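
For readers who want a picture of the posting-list idea quoted above (store each key once, followed by an array of TIDs), here is a minimal sketch. It is only an illustration: the function name, the sample data, and the (block, offset) tuples are made up, and it does not reflect the actual nbtree on-disk format.

```python
from collections import defaultdict


def deduplicate(entries):
    """Group (key, tid) index entries into (key, [tids]) posting lists.

    Hypothetical helper for illustration only; not PostgreSQL code.
    """
    posting = defaultdict(list)
    for key, tid in entries:
        posting[key].append(tid)
    # Keep keys in sorted order, as a B-Tree leaf page would.
    return [(key, sorted(tids)) for key, tids in sorted(posting.items())]


# Made-up low-cardinality column: three distinct values across six rows.
# Each TID is modelled as a (block, offset) pair pointing at one table row.
entries = [
    ("red", (0, 1)), ("blue", (0, 2)), ("red", (0, 3)),
    ("red", (1, 1)), ("green", (1, 2)), ("blue", (1, 3)),
]

posting_lists = deduplicate(entries)
keys_before = len(entries)        # one key copy per row without deduplication
keys_after = len(posting_lists)   # one key copy per distinct value with it
```

In this toy input the key is stored three times instead of six, while all six TIDs are kept, which is why the savings grow as the number of duplicates per distinct value grows.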

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
