Обсуждение: [GENERAL] bloat indexes - opinion
Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: segIndex: ix_filter_by_treeTimes_used: 1018082183Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TBIndex_size: 17 GBNum_writes 16245023Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
What is the real impact of a bloat index? If I reindex it, queries will be faster?
Thanks
Patrick
On 02/21/2017 01:44 PM, Patrick B wrote: > Hi guys, > > I've got a lot of bloat indexes on my 4TB database. > > Let's take this example: > > Table: seg > Index: ix_filter_by_tree > Times_used: 1018082183 > Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. > Its real size is 2TB How do you know one number is right and the other is wrong? Have you looked at the functions here?: https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT > Index_size: 17 GB > Num_writes 16245023 > Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree > (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL) > > > > What is the real impact of a bloat index? If I reindex it, queries will > be faster? > > Thanks > Patrick -- Adrian Klaver adrian.klaver@aklaver.com
2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 02/21/2017 01:44 PM, Patrick B wrote:
> Hi guys,
>
> I've got a lot of bloat indexes on my 4TB database.
>
> Let's take this example:
>
> Table: seg
> Index: ix_filter_by_tree
> Times_used: 1018082183
> Table_size: 18 GB -- wrong. The table is mostly on pg_toast table.
> Its real size is 2TB
How do you know one number is right and the other is wrong?
1. on that table (seg) i store binary data. It is impossible to have only 18GB of it.
2.
SELECT schema_name,pg_size_pretty(sum(table_size)::bigint), (sum(table_size) / pg_database_size(current_database())) * 100 FROM (SELECT pg_catalog.pg_namespace.nspname as schema_name, pg_relation_size(pg_catalog.pg_class.oid) as table_size FROM pg_catalog.pg_classJOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid) tGROUP BY schema_nameORDER BY schema_name
pg_toast 2706 GB 82.62112838877240860000 <-- this belongs to the seg table.
Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin. html#FUNCTIONS-ADMIN-DBOBJECT --
> Index_size: 17 GB
> Num_writes 16245023
> Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
> (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
>
>
>
> What is the real impact of a bloat index? If I reindex it, queries will
> be faster?
>
> Thanks
> Patrick
Adrian Klaver
adrian.klaver@aklaver.com
2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com>:
2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:On 02/21/2017 01:44 PM, Patrick B wrote:
> Hi guys,
>
> I've got a lot of bloat indexes on my 4TB database.
>
> Let's take this example:
>
> Table: seg
> Index: ix_filter_by_tree
> Times_used: 1018082183
> Table_size: 18 GB -- wrong. The table is mostly on pg_toast table.
> Its real size is 2TB
How do you know one number is right and the other is wrong?1. on that table (seg) i store binary data. It is impossible to have only 18GB of it.2.SELECT schema_name,pg_size_pretty(sum(table_size)::bigint), (sum(table_size) / pg_database_size(current_database())) * 100 FROM (SELECT pg_catalog.pg_namespace.nspname as schema_name, pg_relation_size(pg_catalog.pg_class.oid) as table_size FROM pg_catalog.pg_classJOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid) tGROUP BY schema_nameORDER BY schema_namepg_toast 2706 GB 82.62112838877240860000 <-- this belongs to the seg table.
Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.h tml#FUNCTIONS-ADMIN-DBOBJECT
> Index_size: 17 GB
> Num_writes 16245023
> Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
> (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
>
>
>
> What is the real impact of a bloat index? If I reindex it, queries will
> be faster?
>
> Thanks
> Patrick
I ran the query before and after the reindex, and it seems it did not help on performance.
The query I used:
explain analyze select * from seg where full_path = '/userfile/123';
Before reindex:
Index Scan using ix_filter_by_tree on seg (cost=0.00..144.87 rows=215 width=8) (actual time=0.047..0.047 rows=1 loops=1)Index Cond: (full_path = '/userfile/123')Total runtime: 0.059 ms(3 rows)
After reindex:
Index Scan using ix_filter_by_tree on seg (cost=0.00..141.83 rows=220 width=8) (actual time=0.021..0.021 rows=1 loops=1)Index Cond: (full_path = '/userfile/123')Total runtime: 0.036 ms(3 rows)
Note that the 'cost' is pretty much the same.
My question is:
If I have a bloat index. Why do I need to reindex it if I got none performance improvements?
Cheers
Patrick
On 02/21/2017 03:41 PM, Patrick B wrote: > 2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com > <mailto:patrickbakerbr@gmail.com>>: > > 2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com > <mailto:adrian.klaver@aklaver.com>>: > > On 02/21/2017 01:44 PM, Patrick B wrote: > > Hi guys, > > > > I've got a lot of bloat indexes on my 4TB database. > > > > Let's take this example: > > > > Table: seg > > Index: ix_filter_by_tree > > Times_used: 1018082183 > > Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. > > Its real size is 2TB > > How do you know one number is right and the other is wrong? > > > > 1. on that table (seg) i store binary data. It is impossible to have > only 18GB of it. > 2. > > SELECT schema_name, > > pg_size_pretty(sum(table_size)::bigint), > > (sum(table_size) / > pg_database_size(current_database())) * 100 > > FROM ( > > SELECT pg_catalog.pg_namespace.nspname as schema_name, > > pg_relation_size(pg_catalog.pg_class.oid) as table_size > > FROM pg_catalog.pg_class > > JOIN pg_catalog.pg_namespace ON relnamespace = > pg_catalog.pg_namespace.oid > > ) t > > GROUP BY schema_name > > ORDER BY schema_name > > > pg_toast2706 GB82.62112838877240860000 <-- this belongs to the seg > table. > > > > > Have you looked at the functions here?: > https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT > <https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT> > > > Index_size: 17 GB > > Num_writes 16245023 > > Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree > > (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL) > > > > > > > > What is the real impact of a bloat index? If I reindex it, queries will > > be faster? > > > > Thanks > > Patrick > > > > > I ran the query before and after the reindex, and it seems it did not > help on performance. > > *The query I used:* > > explain analyze select * from seg where full_path = '/userfile/123'; The table schema would be useful. > > > *Before reindex:* > > Index Scan using ix_filter_by_tree on seg (cost=0.00..144.87 > rows=215 width=8) (actual time=0.047..0.047 rows=1 loops=1) > Index Cond: (full_path = '/userfile/123') > Total runtime: 0.059 ms > (3 rows) > > > *After reindex:* > > Index Scan using ix_filter_by_tree on seg (cost=0.00..141.83 > rows=220 width=8) (actual time=0.021..0.021 rows=1 loops=1) > Index Cond: (full_path = '/userfile/123') > Total runtime: 0.036 ms > (3 rows) Not showing the complete explain analyze makes the above not all that enlightening. > > > Note that the '*/cost/*' is pretty much the same. > > *My question is:* > If I have a bloat index. Why do I need to reindex it if I got none > performance improvements? Because it is an indication that you may not have index bloat? Not sure a runtime of 0.036 to 0.036 ms over a 2TB table is symptomatic of a problem. Might be worth taking a look at: https://www.postgresql.org/docs/9.6/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW > > Cheers > Patrick > -- Adrian Klaver adrian.klaver@aklaver.com
On Tue, Feb 21, 2017 at 1:44 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,I've got a lot of bloat indexes on my 4TB database.Let's take this example:Table: segIndex: ix_filter_by_treeTimes_used: 1018082183Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TBIndex_size: 17 GBNum_writes 16245023Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
What is this from? If you think the table size reported should include toast, then change it to do that, or request the author of whatever-this-is to make that change.
What indication is there that the index is bloated? If the meat-and-potatoes of a table is held in toast, then wouldn't you expect the size of the table and the size of the index to be about the same?
Cheers,
Jeff
2017-02-22 13:10 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 02/21/2017 03:41 PM, Patrick B wrote:
> 2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com
> <mailto:patrickbakerbr@gmail.com>>:
>
> 2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>>: > pg_toast2706 GB82.62112838877240860000 <-- this belongs to the seg>
> On 02/21/2017 01:44 PM, Patrick B wrote:
> > Hi guys,
> >
> > I've got a lot of bloat indexes on my 4TB database.
> >
> > Let's take this example:
> >
> > Table: seg
> > Index: ix_filter_by_tree
> > Times_used: 1018082183
> > Table_size: 18 GB -- wrong. The table is mostly on pg_toast table.
> > Its real size is 2TB
>
> How do you know one number is right and the other is wrong?
>
>
>
> 1. on that table (seg) i store binary data. It is impossible to have
> only 18GB of it.
> 2.
>
> SELECT schema_name,
>
> pg_size_pretty(sum(table_size)::bigint),
>
> (sum(table_size) /
> pg_database_size(current_database())) * 100
>
> FROM (
>
> SELECT pg_catalog.pg_namespace.nspname as schema_name,
>
> pg_relation_size(pg_catalog.pg_class.oid) as table_size
>
> FROM pg_catalog.pg_class
>
> JOIN pg_catalog.pg_namespace ON relnamespace =
> pg_catalog.pg_namespace.oid
>
> ) t
>
> GROUP BY schema_name
>
> ORDER BY schema_name
>
>
> table.
>
>
>
>
> Have you looked at the functions here?:
> https://www.postgresql.org/docs/9.6/static/functions- admin.html#FUNCTIONS-ADMIN- DBOBJECT
> <https://www.postgresql.org/docs/9.6/static/functions- admin.html#FUNCTIONS-ADMIN- DBOBJECT>
>
> > Index_size: 17 GB
> > Num_writes 16245023
> > Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
> > (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
> >
> >
> >
> > What is the real impact of a bloat index? If I reindex it, queries will
> > be faster?
> >
> > Thanks
> > Patrick
>
>
>
>
> I ran the query before and after the reindex, and it seems it did not
> help on performance.
>
> *The query I used:*
>
> explain analyze select * from seg where full_path = '/userfile/123';
The table schema would be useful.
Why? If i just wanna know how bloat indexes work?
>
>
> *Before reindex:*
>
> Index Scan using ix_filter_by_tree on seg (cost=0.00..144.87
> rows=215 width=8) (actual time=0.047..0.047 rows=1 loops=1)
> Index Cond: (full_path = '/userfile/123')
> Total runtime: 0.059 ms
> (3 rows)
>
>
> *After reindex:*
>
> Index Scan using ix_filter_by_tree on seg (cost=0.00..141.83
> rows=220 width=8) (actual time=0.021..0.021 rows=1 loops=1)
> Index Cond: (full_path = '/userfile/123')
> Total runtime: 0.036 ms
> (3 rows)
Not showing the complete explain analyze makes the above not all that enlightening.
I am showing the whole explain analyze mate.
>
>
> Note that the '*/cost/*' is pretty much the same.
>
> *My question is:*
> If I have a bloat index. Why do I need to reindex it if I got none
> performance improvements?
Because it is an indication that you may not have index bloat?
Queries for bloat indexes show that I do have bloat indexes. Also, it is really simple to look.
\d tablename
The table is 18GB big and the index is 17GB big.. this clearly shows me bloated index.
Not sure a runtime of 0.036 to 0.036 ms over a 2TB table is symptomatic of a problem.
Might be worth taking a look at:
https://www.postgresql.org/docs/9.6/static/monitoring- stats.html#PG-STAT-ALL-TABLES- VIEW
Patrick.
2017-02-23 11:46 GMT+13:00 Jeff Janes <jeff.janes@gmail.com>:
On Tue, Feb 21, 2017 at 1:44 PM, Patrick B <patrickbakerbr@gmail.com> wrote:Hi guys,I've got a lot of bloat indexes on my 4TB database.Let's take this example:Table: segIndex: ix_filter_by_treeTimes_used: 1018082183Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TBIndex_size: 17 GBNum_writes 16245023Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)What is this from? If you think the table size reported should include toast, then change it to do that, or request the author of whatever-this-is to make that change.What indication is there that the index is bloated? If the meat-and-potatoes of a table is held in toast, then wouldn't you expect the size of the table and the size of the index to be about the same?Cheers,Jeff
I am running queries to see bloat indexes [1]. Also i understand an index can't have same size as table.
If you have any other table that can prove the index is indeed bloat, please let me know and i will be happy to post results here.
Patrick
2017-02-25 17:53 GMT+13:00 Patrick B <patrickbakerbr@gmail.com>:
2017-02-23 11:46 GMT+13:00 Jeff Janes <jeff.janes@gmail.com>:On Tue, Feb 21, 2017 at 1:44 PM, Patrick B <patrickbakerbr@gmail.com> wrote:Hi guys,I've got a lot of bloat indexes on my 4TB database.Let's take this example:Table: segIndex: ix_filter_by_treeTimes_used: 1018082183Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TBIndex_size: 17 GBNum_writes 16245023Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)What is this from? If you think the table size reported should include toast, then change it to do that, or request the author of whatever-this-is to make that change.What indication is there that the index is bloated? If the meat-and-potatoes of a table is held in toast, then wouldn't you expect the size of the table and the size of the index to be about the same?Cheers,JeffI am running queries to see bloat indexes [1]. Also i understand an index can't have same size as table.If you have any other table that can prove the index is indeed bloat, please let me know and i will be happy to post results here.Patrick
FYI - using this query to see the index size:
SELECT idstat.schemaname AS schema,idstat.relname AS table_name,indexrelname AS index_name,idstat.idx_scan AS times_used,pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' || quote_ident(idstat.relname))) AS table_size,pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' || quote_ident(indexrelname))) AS index_size,n_tup_upd + n_tup_ins + n_tup_del as num_writes,indexdef AS definitionFROM pg_stat_user_indexes AS idstatJOIN pg_indexes ON indexrelname = indexnameJOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relnameWHERE indexrelname = 'index_name';