Re: pg_class.reltuples of brin indexes

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: pg_class.reltuples of brin indexes
Дата
Msg-id 48c2d25d-984a-4875-8a72-d23119b6461d@2ndquadrant.com
обсуждение исходный текст
Ответ на pg_class.reltuples of brin indexes  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: pg_class.reltuples of brin indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On 03/27/2018 01:58 PM, Masahiko Sawada wrote:
> Hi,
> 
> I found that pg_class.reltuples of brin indexes can be either the
> number of index tuples or the number of heap tuples.
> 
> =# create table test as select generate_series(1,100000) as c;
> =# create index test_brin on test using brin (c);
> =# analyze test;
> =# select relname, reltuples, relpages from pg_class where relname in
> ('test', 'test_brin');
>   relname  | reltuples | relpages
> -----------+-----------+----------
>  test      |     100000 |       443
>  test_brin |     100000 |        3
> (2 rows)
> 
> =# vacuum test;
> =# select relname, reltuples, relpages from pg_class where relname in
> ('test', 'test_brin');
>   relname  | reltuples | relpages
> -----------+-----------+----------
>  test      |     100000 |       443
>  test_brin |         3 |        3
> (2 rows)
> 

Ouch!

> If I understand correctly pg_class.reltuples of indexes should have 
> the number of index tuples but especially for brin indexes it would
> be hard to estimate it in the analyze code.
I'm not sure it's that clear, unfortunately - it's probably more a
question of how the value is used for costing, etc.

> I thought that we can change brinvacuumcleanup so that it returns the
> estimated number of index tuples and do vac_update_relstats using
> that value but it would break API contract. Better ideas?
> 

I think number of index tuples makes sense, as long as that's what the
costing needs. That is, it's up to the index AM to define it. But it
clearly should not flap like this ...

And it's not just BRIN. This is what I get with a GIN index:

archie=# create index on messages using gin(subject_tsvector);

archie=# select relname, reltuples from pg_class
         where relname = 'messages_subject_tsvector_idx';

            relname            |  reltuples
-------------------------------+-------------
 messages_subject_tsvector_idx | 6.58566e+06
(1 row)

archie=# vacuum messages;

archie=# select relname, reltuples from pg_class
         where relname = 'messages_subject_tsvector_idx';

            relname            |  reltuples
-------------------------------+-------------
 messages_subject_tsvector_idx | 6.58566e+06
(1 row)

archie=# analyze messages;

archie=# select relname, reltuples from pg_class
         where relname = 'messages_subject_tsvector_idx';

            relname            |  reltuples
-------------------------------+-------------
 messages_subject_tsvector_idx | 1.23463e+06
(1 row)


And it's even worse with a partial index:

archie=# create index on messages using gin(subject_tsvector)
         where lower(substr(subject, 0, 4)) <> 're:'::text;

archie=# select relname, reltuples from pg_class
         where relname = 'messages_subject_tsvector_idx';

            relname            | reltuples
-------------------------------+------------
 messages_subject_tsvector_idx | 1.4397e+06
(1 row)

archie=# vacuum messages;

archie=# select relname, reltuples from pg_class
         where relname = 'messages_subject_tsvector_idx';

            relname            | reltuples
-------------------------------+------------
 messages_subject_tsvector_idx | 1.4397e+06
(1 row)

archie=# analyze messages;

archie=# select relname, reltuples from pg_class
         where relname = 'messages_subject_tsvector_idx';

            relname            | reltuples
-------------------------------+-----------
 messages_subject_tsvector_idx |    295107
(1 row)


The good thing is that in this case VACUUM/ANALYZE don't flap, it's just
the initial reltuples estimate set by CREATE INDEX.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Teodor Sigaev
Дата:
Сообщение: Re: [HACKERS] GSoC 2017 : Patch for predicate locking in Gist index
Следующее
От: Etsuro Fujita
Дата:
Сообщение: Oddity in COPY FROM handling of check constraints on partition tables