Обсуждение: Index Bloat - how to tell?

Поиск
Список
Период
Сортировка

Index Bloat - how to tell?

От
John W Strange
Дата:
How can you tell when your indexes are starting to get bloated and when you need to rebuild them.  I haven't seen a
quickway to tell and not sure if it's being tracked.
 

_______________________________________________________________________________________________
| John W. Strange | Investment Bank | Global Commodities Technology 
| J.P. Morgan | 700 Louisiana, 11th Floor | T: 713-236-4122 | C: 281-744-6476 | F: 713 236-3333
| john.w.strange@jpmchase.com | jpmorgan.com

This communication is for informational purposes only. It is not
intended as an offer or solicitation for the purchase or sale of
any financial instrument or as an official confirmation of any
transaction. All market prices, data and other information are not
warranted as to completeness or accuracy and are subject to change
without notice. Any comments or statements made herein do not
necessarily reflect those of JPMorgan Chase & Co., its subsidiaries
and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase &
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.

Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to European legal entities.

Re: Index Bloat - how to tell?

От
"Plugge, Joe R."
Дата:
I have used this in the past ... run this against the database that you want to inspect.


SELECT
  current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
  ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
  iname, /*ituples::bigint, ipages::bigint, iotta,*/
  ROUND(CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
FROM (
  SELECT
    schemaname, tablename, cc.reltuples, cc.relpages, bs,
    CEIL((cc.reltuples*((datahdr+ma-
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,
    COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols
  FROM (
    SELECT
      ma,bs,schemaname,tablename,
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM (
      SELECT
        schemaname, tablename, hdr, ma, bs,
        SUM((1-null_frac)*avg_width) AS datawidth,
        MAX(null_frac) AS maxfracsum,
        hdr+(
          SELECT 1+count(*)/8
          FROM pg_stats s2
          WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
        ) AS nullhdr
      FROM pg_stats s, (
        SELECT
          (SELECT current_setting('block_size')::numeric) AS bs,
          CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
          CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
        FROM (SELECT version() AS v) AS foo
      ) AS constants
      GROUP BY 1,2,3,4,5
    ) AS foo
  ) AS rs
  JOIN pg_class cc ON cc.relname = rs.tablename
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <>
'information_schema'
  LEFT JOIN pg_index i ON indrelid = cc.oid
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
) AS sml
ORDER BY wastedbytes DESC

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of John W
Strange
Sent: Tuesday, December 14, 2010 8:48 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Index Bloat - how to tell?

How can you tell when your indexes are starting to get bloated and when you need to rebuild them.  I haven't seen a
quickway to tell and not sure if it's being tracked. 



_______________________________________________________________________________________________

| John W. Strange | Investment Bank | Global Commodities Technology

| J.P. Morgan | 700 Louisiana, 11th Floor | T: 713-236-4122 | C: 281-744-6476 | F: 713 236-3333

| john.w.strange@jpmchase.com | jpmorgan.com



This communication is for informational purposes only. It is not
intended as an offer or solicitation for the purchase or sale of
any financial instrument or as an official confirmation of any
transaction. All market prices, data and other information are not
warranted as to completeness or accuracy and are subject to change
without notice. Any comments or statements made herein do not
necessarily reflect those of JPMorgan Chase & Co., its subsidiaries
and affiliates.



This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase &
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.



Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to European legal entities.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Index Bloat - how to tell?

От
Mladen Gogala
Дата:
Can you explain this query a bit? It isn't at all clear to me.


Plugge, Joe R. wrote:
> I have used this in the past ... run this against the database that you want to inspect.
>
>
> SELECT
>   current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
>   ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,
>   CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
>   iname, /*ituples::bigint, ipages::bigint, iotta,*/
>   ROUND(CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,
>   CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
> FROM (
>   SELECT
>     schemaname, tablename, cc.reltuples, cc.relpages, bs,
>     CEIL((cc.reltuples*((datahdr+ma-
>       (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,
>     COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
>     COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all
cols
>   FROM (
>     SELECT
>       ma,bs,schemaname,tablename,
>       (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
>       (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
>     FROM (
>       SELECT
>         schemaname, tablename, hdr, ma, bs,
>         SUM((1-null_frac)*avg_width) AS datawidth,
>         MAX(null_frac) AS maxfracsum,
>         hdr+(
>           SELECT 1+count(*)/8
>           FROM pg_stats s2
>           WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
>         ) AS nullhdr
>       FROM pg_stats s, (
>         SELECT
>           (SELECT current_setting('block_size')::numeric) AS bs,
>           CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
>           CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
>         FROM (SELECT version() AS v) AS foo
>       ) AS constants
>       GROUP BY 1,2,3,4,5
>     ) AS foo
>   ) AS rs
>   JOIN pg_class cc ON cc.relname = rs.tablename
>   JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <>
'information_schema'
>   LEFT JOIN pg_index i ON indrelid = cc.oid
>   LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
> ) AS sml
> ORDER BY wastedbytes DESC
>
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of John W
Strange
> Sent: Tuesday, December 14, 2010 8:48 AM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] Index Bloat - how to tell?
>
> How can you tell when your indexes are starting to get bloated and when you need to rebuild them.  I haven't seen a
quickway to tell and not sure if it's being tracked. 
>
>
>
> _______________________________________________________________________________________________
>
> | John W. Strange | Investment Bank | Global Commodities Technology
>
> | J.P. Morgan | 700 Louisiana, 11th Floor | T: 713-236-4122 | C: 281-744-6476 | F: 713 236-3333
>
> | john.w.strange@jpmchase.com | jpmorgan.com
>
>
>
> This communication is for informational purposes only. It is not
> intended as an offer or solicitation for the purchase or sale of
> any financial instrument or as an official confirmation of any
> transaction. All market prices, data and other information are not
> warranted as to completeness or accuracy and are subject to change
> without notice. Any comments or statements made herein do not
> necessarily reflect those of JPMorgan Chase & Co., its subsidiaries
> and affiliates.
>
>
>
> This transmission may contain information that is privileged,
> confidential, legally privileged, and/or exempt from disclosure
> under applicable law. If you are not the intended recipient, you
> are hereby notified that any disclosure, copying, distribution, or
> use of the information contained herein (including any reliance
> thereon) is STRICTLY PROHIBITED. Although this transmission and any
> attachments are believed to be free of any virus or other defect
> that might affect any computer system into which it is received and
> opened, it is the responsibility of the recipient to ensure that it
> is virus free and no responsibility is accepted by JPMorgan Chase &
> Co., its subsidiaries and affiliates, as applicable, for any loss
> or damage arising in any way from its use. If you received this
> transmission in error, please immediately contact the sender and
> destroy the material in its entirety, whether in electronic or hard
> copy format. Thank you.
>
>
>
> Please refer to http://www.jpmorgan.com/pages/disclosures for
> disclosures relating to European legal entities.
>
>


--

Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
http://www.vmsinfo.com
The Leader in Integrated Media Intelligence Solutions




Re: Index Bloat - how to tell?

От
Dave Crooke
Дата:
There is a plugin called pgstattuple which can be quite informative .... however, it actually does a full scan of the table / index files, which may be a bit invasive depending on your environment and load.

http://www.postgresql.org/docs/current/static/pgstattuple.html

It's in the contrib (at least for 8.4), and so you have to import its functions into your schema using the script in the contrib directory.

Cheers
Dave

On Tue, Dec 14, 2010 at 8:54 AM, Plugge, Joe R. <JRPlugge@west.com> wrote:
I have used this in the past ... run this against the database that you want to inspect.


SELECT
 current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
 ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,
 CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
 iname, /*ituples::bigint, ipages::bigint, iotta,*/
 ROUND(CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,
 CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
FROM (
 SELECT
   schemaname, tablename, cc.reltuples, cc.relpages, bs,
   CEIL((cc.reltuples*((datahdr+ma-
     (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,
   COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
   COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols
 FROM (
   SELECT
     ma,bs,schemaname,tablename,
     (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
     (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
   FROM (
     SELECT
       schemaname, tablename, hdr, ma, bs,
       SUM((1-null_frac)*avg_width) AS datawidth,
       MAX(null_frac) AS maxfracsum,
       hdr+(
         SELECT 1+count(*)/8
         FROM pg_stats s2
         WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
       ) AS nullhdr
     FROM pg_stats s, (
       SELECT
         (SELECT current_setting('block_size')::numeric) AS bs,
         CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
         CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
       FROM (SELECT version() AS v) AS foo
     ) AS constants
     GROUP BY 1,2,3,4,5
   ) AS foo
 ) AS rs
 JOIN pg_class cc ON cc.relname = rs.tablename
 JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
 LEFT JOIN pg_index i ON indrelid = cc.oid
 LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
) AS sml
ORDER BY wastedbytes DESC

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of John W Strange
Sent: Tuesday, December 14, 2010 8:48 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Index Bloat - how to tell?

How can you tell when your indexes are starting to get bloated and when you need to rebuild them.  I haven't seen a quick way to tell and not sure if it's being tracked.



_______________________________________________________________________________________________

| John W. Strange | Investment Bank | Global Commodities Technology

| J.P. Morgan | 700 Louisiana, 11th Floor | T: 713-236-4122 | C: 281-744-6476 | F: 713 236-3333

| john.w.strange@jpmchase.com | jpmorgan.com



This communication is for informational purposes only. It is not
intended as an offer or solicitation for the purchase or sale of
any financial instrument or as an official confirmation of any
transaction. All market prices, data and other information are not
warranted as to completeness or accuracy and are subject to change
without notice. Any comments or statements made herein do not
necessarily reflect those of JPMorgan Chase & Co., its subsidiaries
and affiliates.



This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase &
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.



Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to European legal entities.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Index Bloat - how to tell?

От
Mark Kirkwood
Дата:
On 15/12/10 09:12, Dave Crooke wrote:
There is a plugin called pgstattuple which can be quite informative .... however, it actually does a full scan of the table / index files, which may be a bit invasive depending on your environment and load.

http://www.postgresql.org/docs/current/static/pgstattuple.html

It's in the contrib (at least for 8.4), and so you have to import its functions into your schema using the script in the contrib directory.


If you are using 8.4 or later, try the Freespacemap module:

http://www.postgresql.org/docs/current/static/pgfreespacemap.html

I tend to run this query:

        SELECT oid::regclass,               pg_relation_size(oid)/(1024*1024) AS mb,              sum(free)/(1024*1024) AS free_mb        FROM            (SELECT oid, (pg_freespace(oid)).avail AS free             FROM pg_class) AS a        GROUP BY a.oid ORDER BY free_mb DESC;

to show up potentially troublesome amounts of bloat.

regards

Mark

Re: Index Bloat - how to tell?

От
Mladen Gogala
Дата:
Dave Crooke wrote:
> There is a plugin called pgstattuple which can be quite informative
> .... however, it actually does a full scan of the table / index files,
> which may be a bit invasive depending on your environment and load.
>
> http://www.postgresql.org/docs/current/static/pgstattuple.html
>
> It's in the contrib (at least for 8.4), and so you have to import its
> functions into your schema using the script in the contrib directory.
>
> Cheers
> Dave
I tried it with one of my databases:


testtrack=# select * from pgstatindex('public.defects_pkey');
 version | tree_level | index_size | root_block_no | internal_pages |
leaf_pages | empty_pages | deleted_pages | avg_leaf_density |
leaf_fragmentation


---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+-------------------
-
       2 |          1 |     827392 |             3 |              0
|        100 |           0 |             0 |            70.12
|                 22
(1 row)


What is "leaf_fragmentation"? How is it defined? I wasn't able to find
out any definition of that number. How is it calculated. I verified that
running reindex makes it 0:


testtrack=# reindex table public.defects;
REINDEX
testtrack=# select * from pgstatindex('public.defects_pkey');
 version | tree_level | index_size | root_block_no | internal_pages |
leaf_pages | empty_pages | deleted_pages | avg_leaf_density |
leaf_fragmentation


---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+-------------------
-
       2 |          1 |     647168 |             3 |              0
|         78 |           0 |             0 |            89.67
|                  0
(1 row)


--

Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
http://www.vmsinfo.com
The Leader in Integrated Media Intelligence Solutions




Re: Index Bloat - how to tell?

От
Robert Haas
Дата:
On Thu, Dec 16, 2010 at 2:27 PM, Mladen Gogala
<mladen.gogala@vmsinfo.com> wrote:
> What is "leaf_fragmentation"? How is it defined? I wasn't able to find out
> any definition of that number. How is it calculated. I verified that running
> reindex makes it 0:

Well, according to the code:

                        /*
                         * If the next leaf is on an earlier block, it means a
                         * fragmentation.
                         */
                        if (opaque->btpo_next != P_NONE &&
opaque->btpo_next < blkno)
                                indexStat.fragments++;

And then the final value is calculated thus:

                snprintf(values[j++], 32, "%.2f", (double)
indexStat.fragments / (double) indexStat.leaf_pages * 100.0);

This doesn't really match my definition of the word "fragmentation", though...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Index Bloat - how to tell?

От
Mladen Gogala
Дата:
Robert Haas wrote:
>
> This doesn't really match my definition of the word "fragmentation", though...
>
>
Same here. However, I did run "reindex" on one table and this indicator
did drop to 0. I will shoot an email to the author, he's probably
smarter than me and will be able to provide a reasonable explanation.

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com