Обсуждение: [GENERAL] bloat indexes - opinion

Поиск
Список
Период
Сортировка

[GENERAL] bloat indexes - opinion

От
Patrick B
Дата:
Hi guys,

I've got a lot of bloat indexes on my 4TB database.

Let's take this example:
Table: seg
Index: ix_filter_by_tree
Times_used: 1018082183
Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TB
Index_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)


What is the real impact of a bloat index? If I reindex it, queries will be faster?

Thanks
Patrick

Re: [GENERAL] bloat indexes - opinion

От
Adrian Klaver
Дата:
On 02/21/2017 01:44 PM, Patrick B wrote:
> Hi guys,
>
> I've got a lot of bloat indexes on my 4TB database.
>
> Let's take this example:
>
>     Table: seg
>     Index: ix_filter_by_tree
>     Times_used: 1018082183
>     Table_size: 18 GB -- wrong. The table is mostly on pg_toast table.
>     Its real size is 2TB

How do you know one number is right and the other is wrong?

Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT

>     Index_size: 17 GB
>     Num_writes 16245023
>     Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
>     (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
>
>
>
> What is the real impact of a bloat index? If I reindex it, queries will
> be faster?
>
> Thanks
> Patrick


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] bloat indexes - opinion

От
Patrick B
Дата:
2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 02/21/2017 01:44 PM, Patrick B wrote:
> Hi guys,
>
> I've got a lot of bloat indexes on my 4TB database.
>
> Let's take this example:
>
>     Table: seg
>     Index: ix_filter_by_tree
>     Times_used: 1018082183
>     Table_size: 18 GB -- wrong. The table is mostly on pg_toast table.
>     Its real size is 2TB

How do you know one number is right and the other is wrong?


1. on that table (seg) i store binary data. It is impossible to have only 18GB of it.
2. 
SELECT schema_name, 
       pg_size_pretty(sum(table_size)::bigint),
       (sum(table_size) / pg_database_size(current_database())) * 100
FROM (
  SELECT pg_catalog.pg_namespace.nspname as schema_name,
         pg_relation_size(pg_catalog.pg_class.oid) as table_size
  FROM   pg_catalog.pg_class
     JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid
) t
GROUP BY schema_name
ORDER BY schema_name

pg_toast 2706 GB 82.62112838877240860000 <-- this belongs to the seg table.

 

Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT

>     Index_size: 17 GB
>     Num_writes 16245023
>     Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
>     (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
>
>
>
> What is the real impact of a bloat index? If I reindex it, queries will
> be faster?
>
> Thanks
> Patrick


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: [GENERAL] bloat indexes - opinion

От
Patrick B
Дата:
2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com>:
2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 02/21/2017 01:44 PM, Patrick B wrote:
> Hi guys,
>
> I've got a lot of bloat indexes on my 4TB database.
>
> Let's take this example:
>
>     Table: seg
>     Index: ix_filter_by_tree
>     Times_used: 1018082183
>     Table_size: 18 GB -- wrong. The table is mostly on pg_toast table.
>     Its real size is 2TB

How do you know one number is right and the other is wrong?


1. on that table (seg) i store binary data. It is impossible to have only 18GB of it.
2. 
SELECT schema_name, 
       pg_size_pretty(sum(table_size)::bigint),
       (sum(table_size) / pg_database_size(current_database())) * 100
FROM (
  SELECT pg_catalog.pg_namespace.nspname as schema_name,
         pg_relation_size(pg_catalog.pg_class.oid) as table_size
  FROM   pg_catalog.pg_class
     JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid
) t
GROUP BY schema_name
ORDER BY schema_name

pg_toast 2706 GB 82.62112838877240860000 <-- this belongs to the seg table.

 

Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT

>     Index_size: 17 GB
>     Num_writes 16245023
>     Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
>     (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
>
>
>
> What is the real impact of a bloat index? If I reindex it, queries will
> be faster?
>
> Thanks
> Patrick



I ran the query before and after the reindex, and it seems it did not help on performance.

The query I used:
explain analyze select * from seg where full_path = '/userfile/123';

Before reindex:
Index Scan using ix_filter_by_tree on seg  (cost=0.00..144.87 rows=215 width=8) (actual time=0.047..0.047 rows=1 loops=1)
  Index Cond: (full_path = '/userfile/123')
Total runtime: 0.059 ms
(3 rows)

After reindex:
Index Scan using ix_filter_by_tree on seg  (cost=0.00..141.83 rows=220 width=8) (actual time=0.021..0.021 rows=1 loops=1)
  Index Cond: (full_path = '/userfile/123')
Total runtime: 0.036 ms
(3 rows)

Note that the 'cost' is pretty much the same.

My question is:
If I have a bloat index. Why do I need to reindex it if I got none performance improvements? 

Cheers
Patrick

Re: [GENERAL] bloat indexes - opinion

От
Adrian Klaver
Дата:
On 02/21/2017 03:41 PM, Patrick B wrote:
> 2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com
> <mailto:patrickbakerbr@gmail.com>>:
>
>     2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com
>     <mailto:adrian.klaver@aklaver.com>>:
>
>         On 02/21/2017 01:44 PM, Patrick B wrote:
>         > Hi guys,
>         >
>         > I've got a lot of bloat indexes on my 4TB database.
>         >
>         > Let's take this example:
>         >
>         >     Table: seg
>         >     Index: ix_filter_by_tree
>         >     Times_used: 1018082183
>         >     Table_size: 18 GB -- wrong. The table is mostly on pg_toast table.
>         >     Its real size is 2TB
>
>         How do you know one number is right and the other is wrong?
>
>
>
>     1. on that table (seg) i store binary data. It is impossible to have
>     only 18GB of it.
>     2.
>
>             SELECT schema_name,
>
>                    pg_size_pretty(sum(table_size)::bigint),
>
>                    (sum(table_size) /
>             pg_database_size(current_database())) * 100
>
>             FROM (
>
>               SELECT pg_catalog.pg_namespace.nspname as schema_name,
>
>                      pg_relation_size(pg_catalog.pg_class.oid) as table_size
>
>               FROM   pg_catalog.pg_class
>
>                  JOIN pg_catalog.pg_namespace ON relnamespace =
>             pg_catalog.pg_namespace.oid
>
>             ) t
>
>             GROUP BY schema_name
>
>             ORDER BY schema_name
>
>
>     pg_toast2706 GB82.62112838877240860000 <-- this belongs to the seg
>     table.
>
>
>
>
>         Have you looked at the functions here?:
>         https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT
>         <https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT>
>
>         >     Index_size: 17 GB
>         >     Num_writes 16245023
>         >     Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
>         >     (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
>         >
>         >
>         >
>         > What is the real impact of a bloat index? If I reindex it, queries will
>         > be faster?
>         >
>         > Thanks
>         > Patrick
>
>
>
>
> I ran the query before and after the reindex, and it seems it did not
> help on performance.
>
> *The query I used:*
>
>     explain analyze select * from seg where full_path = '/userfile/123';

The table schema would be useful.

>
>
> *Before reindex:*
>
>     Index Scan using ix_filter_by_tree on seg  (cost=0.00..144.87
>     rows=215 width=8) (actual time=0.047..0.047 rows=1 loops=1)
>       Index Cond: (full_path = '/userfile/123')
>     Total runtime: 0.059 ms
>     (3 rows)
>
>
> *After reindex:*
>
>     Index Scan using ix_filter_by_tree on seg  (cost=0.00..141.83
>     rows=220 width=8) (actual time=0.021..0.021 rows=1 loops=1)
>       Index Cond: (full_path = '/userfile/123')
>     Total runtime: 0.036 ms
>     (3 rows)

Not showing the complete explain analyze makes the above not all that enlightening.

>
>
> Note that the '*/cost/*' is pretty much the same.
>
> *My question is:*
> If I have a bloat index. Why do I need to reindex it if I got none
> performance improvements?

Because it is an indication that you may not have index bloat?

Not sure a runtime of 0.036 to 0.036 ms over a 2TB table is symptomatic of a problem.

Might be worth taking a look at:

https://www.postgresql.org/docs/9.6/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW

>
> Cheers
> Patrick
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] bloat indexes - opinion

От
Jeff Janes
Дата:
On Tue, Feb 21, 2017 at 1:44 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,

I've got a lot of bloat indexes on my 4TB database.

Let's take this example:
Table: seg
Index: ix_filter_by_tree
Times_used: 1018082183
Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TB
Index_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)

What is this from?  If you think the table size reported should include toast, then change it to do that, or request the author of whatever-this-is to make that change.

What indication is there that the index is bloated?  If the meat-and-potatoes of a table is held in toast, then wouldn't you expect the size of the table and the size of the index to be about the same?
 
Cheers,

Jeff

Re: [GENERAL] bloat indexes - opinion

От
Patrick B
Дата:


2017-02-22 13:10 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 02/21/2017 03:41 PM, Patrick B wrote:
> 2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com
> <mailto:patrickbakerbr@gmail.com>>:
>
>     2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com
>     <mailto:adrian.klaver@aklaver.com>>:
>
>         On 02/21/2017 01:44 PM, Patrick B wrote:
>         > Hi guys,
>         >
>         > I've got a lot of bloat indexes on my 4TB database.
>         >
>         > Let's take this example:
>         >
>         >     Table: seg
>         >     Index: ix_filter_by_tree
>         >     Times_used: 1018082183
>         >     Table_size: 18 GB -- wrong. The table is mostly on pg_toast table.
>         >     Its real size is 2TB
>
>         How do you know one number is right and the other is wrong?
>
>
>
>     1. on that table (seg) i store binary data. It is impossible to have
>     only 18GB of it.
>     2.
>
>             SELECT schema_name,
>
>                    pg_size_pretty(sum(table_size)::bigint),
>
>                    (sum(table_size) /
>             pg_database_size(current_database())) * 100
>
>             FROM (
>
>               SELECT pg_catalog.pg_namespace.nspname as schema_name,
>
>                      pg_relation_size(pg_catalog.pg_class.oid) as table_size
>
>               FROM   pg_catalog.pg_class
>
>                  JOIN pg_catalog.pg_namespace ON relnamespace =
>             pg_catalog.pg_namespace.oid
>
>             ) t
>
>             GROUP BY schema_name
>
>             ORDER BY schema_name
>
>
>     pg_toast2706 GB82.62112838877240860000 <-- this belongs to the seg
>     table.
>
>
>
>
>         Have you looked at the functions here?:
>         https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT
>         <https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT>
>
>         >     Index_size: 17 GB
>         >     Num_writes 16245023
>         >     Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
>         >     (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
>         >
>         >
>         >
>         > What is the real impact of a bloat index? If I reindex it, queries will
>         > be faster?
>         >
>         > Thanks
>         > Patrick
>
>
>
>
> I ran the query before and after the reindex, and it seems it did not
> help on performance.
>
> *The query I used:*
>
>     explain analyze select * from seg where full_path = '/userfile/123';

The table schema would be useful.

Why? If i just wanna know how bloat indexes work?

 

>
>
> *Before reindex:*
>
>     Index Scan using ix_filter_by_tree on seg  (cost=0.00..144.87
>     rows=215 width=8) (actual time=0.047..0.047 rows=1 loops=1)
>       Index Cond: (full_path = '/userfile/123')
>     Total runtime: 0.059 ms
>     (3 rows)
>
>
> *After reindex:*
>
>     Index Scan using ix_filter_by_tree on seg  (cost=0.00..141.83
>     rows=220 width=8) (actual time=0.021..0.021 rows=1 loops=1)
>       Index Cond: (full_path = '/userfile/123')
>     Total runtime: 0.036 ms
>     (3 rows)

Not showing the complete explain analyze makes the above not all that enlightening.

I am showing the whole explain analyze mate.
 

>
>
> Note that the '*/cost/*' is pretty much the same.
>
> *My question is:*
> If I have a bloat index. Why do I need to reindex it if I got none
> performance improvements?

Because it is an indication that you may not have index bloat?

Queries for bloat indexes show that I do have bloat indexes. Also, it is really simple to look.

\d tablename

The table is 18GB big and the index is 17GB big.. this clearly shows me bloated index.
 

Not sure a runtime of 0.036 to 0.036 ms over a 2TB table is symptomatic of a problem.

Might be worth taking a look at:

https://www.postgresql.org/docs/9.6/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW



Patrick.
 

Re: [GENERAL] bloat indexes - opinion

От
Patrick B
Дата:


2017-02-23 11:46 GMT+13:00 Jeff Janes <jeff.janes@gmail.com>:
On Tue, Feb 21, 2017 at 1:44 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,

I've got a lot of bloat indexes on my 4TB database.

Let's take this example:
Table: seg
Index: ix_filter_by_tree
Times_used: 1018082183
Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TB
Index_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)

What is this from?  If you think the table size reported should include toast, then change it to do that, or request the author of whatever-this-is to make that change.

What indication is there that the index is bloated?  If the meat-and-potatoes of a table is held in toast, then wouldn't you expect the size of the table and the size of the index to be about the same?
 
Cheers,

Jeff


I am running queries to see bloat indexes [1]. Also i understand an index can't have same size as table.
If you have any other table that can prove the index is indeed bloat, please let me know and i will be happy to post results here.


Patrick 

Re: [GENERAL] bloat indexes - opinion

От
Patrick B
Дата:
2017-02-25 17:53 GMT+13:00 Patrick B <patrickbakerbr@gmail.com>:


2017-02-23 11:46 GMT+13:00 Jeff Janes <jeff.janes@gmail.com>:
On Tue, Feb 21, 2017 at 1:44 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,

I've got a lot of bloat indexes on my 4TB database.

Let's take this example:
Table: seg
Index: ix_filter_by_tree
Times_used: 1018082183
Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TB
Index_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)

What is this from?  If you think the table size reported should include toast, then change it to do that, or request the author of whatever-this-is to make that change.

What indication is there that the index is bloated?  If the meat-and-potatoes of a table is held in toast, then wouldn't you expect the size of the table and the size of the index to be about the same?
 
Cheers,

Jeff


I am running queries to see bloat indexes [1]. Also i understand an index can't have same size as table.
If you have any other table that can prove the index is indeed bloat, please let me know and i will be happy to post results here.


Patrick 



FYI - using this query to see the index size:

SELECT idstat.schemaname AS schema,
       idstat.relname AS table_name,
       indexrelname AS index_name,
       idstat.idx_scan AS times_used,
       pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' || quote_ident(idstat.relname))) AS table_size, 
       pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' || quote_ident(indexrelname))) AS index_size,
       n_tup_upd + n_tup_ins + n_tup_del as num_writes,
       indexdef AS definition
FROM pg_stat_user_indexes AS idstat 
JOIN pg_indexes ON indexrelname = indexname
JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
WHERE indexrelname = 'index_name';