Thread: Almost happy ending (from "Data files became huge with no apparent reason" thread)

Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: dfumagalli@tin.it
Date:
Hi all,

At last I received a new disk and was able to run VACUUM ANALYZE.

After the repair, the database shrank from the initial 500 MB to 185 MB.
I'm very happy that the combination of a full disk and an unexpected server
shutdown didn't damage the data beyond recovery.

The next thing I did was to further reduce this rescued database. In fact,
the original data files took only 26 MB, not 185.

I checked the objects' file sizes (by looking at pg_class -> it's a bit difficult
with those "numeric table names") and quickly found that I had some 150 MB
of index files!
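
Something along these lines turns the pg_class page counts into rough byte
figures (just a sketch: it assumes the default 8 kB block size, and relpages
is only as fresh as the last VACUUM):

select relname, relkind, relfilenode, relpages,
       relpages * 8192 as approx_bytes   -- 8192 = default block size, so an estimate
from pg_class
where relkind in ('r', 'i', 't')         -- tables, indexes and TOAST tables
order by relpages desc;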

My first try was to go standalone and reindex the entire database with FORCE,
but the rebuilt indexes still took the same space (I thought the reindex
would rebuild them from scratch, freeing unused index pages, but perhaps
I'm wrong, at least for PostgreSQL 7.1.1).

So I dropped all non-primary-key indexes and recreated them. I did not drop
the primary key indexes (the auto-created *_pkey ones) because I feared it could
break something about constraints. Now the database is 47 MB and I'm pretty
happy with it. If only I could safely drop and recreate the primary keys,
I'm sure the database would return to its original 26 MB.
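
For the non-primary-key indexes, by the way, the recipe was simply drop and
recreate, roughly like this (a sketch from memory; the column list is
illustrative, not my real definition):

DROP INDEX idx_products_csc;
-- recreate it so it is rebuilt from scratch, i.e. packed
CREATE INDEX idx_products_csc ON products (cat_id, subcat_id);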

Now, three simple questions to the pros on this mailing list:
- Is there a cleaner way of recreating packed indexes?
- May I drop and recreate primary indexes without breaking rules, triggers,
constraints and sequences?
- We are evaluating PostgreSQL for a mission-critical application, currently
managed by an Informix IDS 2000 server on a Sun Ultra 420 running Solaris 8,
with RAID disks and a SAN for storage. Currently there are about 1000 concurrent
users (each with one to three ODBC sessions), with a CPU load of about 15%.
There are 30 databases: some have only a few thousand records, while others
have a thousand tables, some of them totalling 10M+ records, with a need
for sub-selects, joins and transactions (no constraints for now). Is PostgreSQL
able to manage this amount of data *RELIABLY*, 24/7/365? Does the newest
version come with some handy tool like Informix's onstat (it reports user
sessions and executed queries given a connection id)? Does it use all the
available processors or only one? How can we work around the periodic need
for a VACUUM on such big tables (they are heavily modified every day)
without disrupting server availability?

Best regards,
Dario Fumagalli



Triggers and Rules

From: "Jerome Chochon"
Date:
Hi all.

I am studying the possibilities of triggers and rules.

I know that we can create triggers and rules on tables, but I want to know
whether we can create these (rules and triggers) on a view or an index (the
other objects called relations).

Thanks for your response.

Jérome Chochon



Re: Almost happy ending (from "Data files became huge with

From: Bruce Momjian
Date:
dfumagalli@tin.it wrote:
> Hi all,
>
> At last I received a new disk and was able to run VACUUM ANALYZE.
>
> The repaired database from the initial 500MB became 185 MB.
> I'm very happy that the combined disk full + unexpected server shutdown
> didn't suffice to crash the data beyond recovery.
>
> The next thing I did was to further reduce this rescued database. In fact,
> the original data files took only 26 MB, not 185.
>
> I checked the objects' file sizes (by looking at pg_class -> it's a bit difficult
> with those "numeric table names") and quickly found that I had some 150 MB
> of index files!
>
> My first try was to go stand alone and to reindex force the entire database,
> but the rebuilt indexes still took the same space (I thought the reindex
> would rebuild them from scratch, freeing unused index pages, but perhaps
> I'm wrong, at least for PostgreSQL 7.1.1).

That is strange.  How did you reindex, and did you reindex the toast
tables too?  7.3, due out in a few months, has a reindex script that
reindexes everything.

> So I dropped all non-primary key indexes (they were auto-created as *_pkey)
> and recreated them. I did not drop primary keys because I fear it could
> break something about constraints. Now the database is 47MB and I'm pretty
> happy with it. If only I could safely drop and recreate the primary keys
> I'm sure the database would return to its original 26 MB.
>
> Now, three simple questions to the pros on this mailing list:
> - Is there a cleaner way of recreating packed indexes?

Yes, 7.3 will have a reindexdb script.

> - May I drop and recreate primary indexes without breaking rules, triggers,
> constraints and sequences?

Not easily, no.

> - We are evaluating PostgreSQL for a mission critical application, currently
> managed by an Informix IDS 2000 server on a Sun Ultra 420 with Solaris 8
> with raid disks and SAN for storage. Currently there are about 1000 concurrent
> users (each with one to three ODBC sessions), with a CPU load of about 15%.
> There are 30 databases: some have only some thousand records, while others
> have a thousand tables with some of them totalling 10M+ records, with need
> of sub-selects, joins and transactions (no constraints for now). Is PostgreSQL
> able to manage *RELIABLY* and 24/7/365 this amount of data? Does the newest

You need 7.2.2, which was designed for 24 hour operation.

> version come with some handy tool like the Informix Onstat (it reports users
> sessions and executed queries given a connection id)? Does it use all the

There is pgmonitor on gborg.postgresql.org that displays active sessions
and queries.

> available processors or only one? How may we work around the periodic need
> of a VACUUM on such big tables (and they are vastly modified every day),
> without disrupting the server availability?

In 7.2.2, VACUUM doesn't lock tables so you can vacuum more easily.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Dario Fumagalli
Date:
Bruce Momjian wrote:

> dfumagalli@tin.it wrote:
>
>>Hi all,
>>
>>At last I received a new disk and was able to run VACUUM ANALYZE.
>>



[Snip]


>>My first try was to go stand alone and to reindex force the entire database,
>>but the rebuilt indexes still took the same space (I thought the reindex
>>would rebuild them from scratch, freeing unused index pages, but perhaps
>>I'm wrong, at least for PostgreSQL 7.1.1).
>>
>
> That is strange.  How did you reindex, and did you reindex the toast
> tables too?  7.3, due out in a few months, has a reindex script that
> reindexes everything.


Step by step procedure:
1) I stopped the postmaster and made sure no process was still running (ps aux)
2) As the postgres user I started postgres -O -P -D path etc.
3) A prompt appeared. Here I typed reindex proj_store force (I'm trying
to remember it, since I'm not at the company where the server is).
4) Messages appeared stating indexes were being rebuilt (no errors shown).
5) I hit CTRL-D. Postgres exited.
6) I restarted the postmaster.
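
From memory, the whole session looked more or less like this (the data
directory path is just a placeholder):

$ pg_ctl -D /usr/local/pgsql/data stop
$ postgres -O -P -D /usr/local/pgsql/data proj_store
backend> REINDEX DATABASE proj_store FORCE
(control-D to exit the standalone backend)
$ pg_ctl -D /usr/local/pgsql/data start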


With regard to the TOAST tables, I don't know how to reindex or manage
them. But I don't think they are the culprit.

Just to be sure I append the output of:

select relname, relfilenode as node, reltoastrelid as relid,
reltoastidxid as idxid, relkind as k, relpages as pag from pg_class
order by relname


              relname             |  node   | relid  | idxid  | k | pag
---------------------------------+---------+--------+--------+---+------
  audio_connectors                |  258312 |      0 |      0 | r |    0
  audio_connectors_pkey           |  258325 |      0 |      0 | i |    1
  audio_connectors_seq            |  258293 |      0 |      0 | S |   10
  cats_and_subcats                |  258166 |      0 |      0 | r |    1
  cats_and_subcats_pkey           |  258182 |      0 |      0 | i |    2
  cats_and_subcats_seq            |  258147 |      0 |      0 | S |   10
  companies                       |  242645 |      0 |      0 | r |   31
  companies_millelumen            |  242837 |      0 |      0 | r |    0
  companies_millelumen_pkey       |  242904 |      0 |      0 | i |    1
  companies_millelumen_seq        |  242818 |      0 |      0 | S |   10
  companies_pkey                  |  242772 |      0 |      0 | i |    7
  companies_seq                   |  242626 |      0 |      0 | S |   10
  computer_vid_conns              |  258347 |      0 |      0 | r |    0
  computer_vid_conns_pkey         |  258360 |      0 |      0 | i |    1
  computer_vid_conns_seq          |  258328 |      0 |      0 | S |   10
  dep_names                       |  258382 |      0 |      0 | r |    1
  dep_names_pkey                  |  258395 |      0 |      0 | i |   12
  dep_names_seq                   |  258363 |      0 |      0 | S |   10
  det_prod_aud_conns              |  258667 |      0 |      0 | r |    0
  det_prod_aud_conns_pkey         |  258681 |      0 |      0 | i |    1
  det_prod_aud_conns_seq          |  258648 |      0 |      0 | S |   10
  det_prod_comp_vid_conns         |  258703 |      0 |      0 | r |    0
  det_prod_comp_vid_conns_pkey    |  258717 |      0 |      0 | i |    1
  det_prod_comp_vid_conns_seq     |  258684 |      0 |      0 | S |   10
  det_prod_dep_consumpt           |  258739 |      0 |      0 | r |   21
  det_prod_dep_consumpt_pkey      |  258764 |      0 |      0 | i | 3286
  det_prod_dep_consumpt_seq       |  258720 |      0 |      0 | S |   10
  det_prod_vid_conns              |  258631 |      0 |      0 | r |    0
  det_prod_vid_conns_pkey         |  258645 |      0 |      0 | i |    1
  det_prod_vid_conns_seq          |  258612 |      0 |      0 | S |   10
  det_prod_vid_res                |  258562 |      0 |      0 | r |    0
  det_prod_vid_res_pkey           |  258575 |      0 |      0 | i |    1
  det_prod_vid_res_seq            |  258543 |      0 |      0 | S |   10
  det_prod_vid_stds               |  258597 |      0 |      0 | r |    0
  det_prod_vid_stds_pkey          |  258609 |      0 |      0 | i |    1
  det_prod_vid_stds_seq           |  258578 |      0 |      0 | S |   10
  det_turnover                    |  242794 |      0 |      0 | r |    6
  det_turnover_pkey               |  242815 |      0 |      0 | i |    2
  det_turnover_seq                |  242775 |      0 |      0 | S |   10
  idx_cats_and_subcats_cs         | 1094029 |      0 |      0 | i |    2
  idx_cats_and_subcats_v          | 1094032 |      0 |      0 | i |    2
  idx_companies_c0                | 1094071 |      0 |      0 | i |    4
  idx_companies_c1                | 1094080 |      0 |      0 | i |    5
  idx_companies_c2                | 1094086 |      0 |      0 | i |    4
  idx_companies_millelumenc       | 1094101 |      0 |      0 | i |    1
  idx_companies_n                 | 1094077 |      0 |      0 | i |    9
  idx_companies_p                 | 1094083 |      0 |      0 | i |    4
  idx_companies_u                 | 1094074 |      0 |      0 | i |    5
  idx_companies_v0                | 1094089 |      0 |      0 | i |    5
  idx_companies_v1                | 1094092 |      0 |      0 | i |    4
  idx_dep_named                   | 1094035 |      0 |      0 | i |    2
  idx_det_prod_aud_connsp         | 1094062 |      0 |      0 | i |    1
  idx_det_prod_comp_vid_connsp    | 1094065 |      0 |      0 | i |    1
  idx_det_prod_dep_consumptdp     | 1094068 |      0 |      0 | i |   10
  idx_det_prod_vid_connsp         | 1094059 |      0 |      0 | i |    1
  idx_det_prod_vid_resp           | 1094053 |      0 |      0 | i |    1
  idx_det_prod_vid_stdsp          | 1094056 |      0 |      0 | i |    1
  idx_det_turnover_c              | 1094095 |      0 |      0 | i |    4
  idx_det_turnover_t              | 1094098 |      0 |      0 | i |    4
  idx_products_b                  | 1094044 |      0 |      0 | i |   19
  idx_products_b2                 | 1094047 |      0 |      0 | i |    9
  idx_products_csc                | 1094041 |      0 |      0 | i |   15
  idx_products_v                  | 1094050 |      0 |      0 | i |    9
  idx_video_resolutions_hv        | 1094038 |      0 |      0 | i |    2
  pg_aggregate                    |   16960 |      0 |      0 | r |    1
  pg_aggregate_name_type_index    | 1092622 |      0 |      0 | i |    2
  pg_am                           |   16653 |      0 |      0 | r |    1
  pg_am_name_index                | 1092616 |      0 |      0 | i |    2
  pg_amop                         |   16685 |      0 |      0 | r |    2
  pg_amop_opid_index              | 1092617 |      0 |      0 | i |    2
  pg_amop_strategy_index          | 1092618 |      0 |      0 | i |    2
  pg_amproc                       |   16867 |      0 |      0 | r |    1
  pg_attrdef                      |    1215 |  17201 |  17213 | r |    3
  pg_attrdef_adrelid_index        | 1092605 |      0 |      0 | i |    2
  pg_attribute                    |    1249 |      0 |      0 | r |   17
  pg_attribute_relid_attnam_index |   17112 |      0 |      0 | i |   11
  pg_attribute_relid_attnum_index |   17115 |      0 |      0 | i |    6
  pg_class                        |    1259 |      0 |      0 | r |    3
  pg_class_oid_index              |   17118 |      0 |      0 | i |    2
  pg_class_relname_index          |   17121 |      0 |      0 | i |    4
  pg_database                     |    1262 |      0 |      0 | r |    1
  pg_description                  |   17086 |  17216 |  17228 | r |   10
  pg_description_objoid_index     | 1092623 |      0 |      0 | i |    5
  pg_group                        |    1261 |      0 |      0 | r |    0
  pg_group_name_index             |   17127 |      0 |      0 | i |    1
  pg_group_sysid_index            |   17130 |      0 |      0 | i |    1
  pg_index                        |   16579 |      0 |      0 | r |    2
  pg_index_indexrelid_index       | 1092611 |      0 |      0 | i |    2
  pg_index_indrelid_index         | 1092610 |      0 |      0 | i |    2
  pg_indexes                      |   17350 |      0 |      0 | v |   10
  pg_inheritproc                  |   17045 |      0 |      0 | r |    0
  pg_inherits                     |   16567 |      0 |      0 | r |    0
  pg_inherits_relid_seqno_index   | 1092609 |      0 |      0 | i |    1
  pg_ipl                          |   17033 |      0 |      0 | r |    0
  pg_language                     |   16934 |      0 |      0 | r |    1
  pg_language_name_index          | 1092619 |      0 |      0 | i |    2
  pg_language_oid_index           | 1092620 |      0 |      0 | i |    2
  pg_largeobject                  |   16948 |      0 |      0 | r |    0
  pg_largeobject_loid_pn_index    | 1092621 |      0 |      0 | i |    1
  pg_listener                     |   17074 |      0 |      0 | r |    0
  pg_listener_pid_relname_index   | 1092624 |      0 |      0 | i |    1
  pg_log                          |    1269 |      0 |      0 | s |    0
  pg_opclass                      |   16642 |      0 |      0 | r |    1
  pg_opclass_deftype_index        | 1092614 |      0 |      0 | i |    2
  pg_opclass_name_index           | 1092615 |      0 |      0 | i |    2
  pg_operator                     |   16617 |      0 |      0 | r |    9
  pg_operator_oid_index           | 1092612 |      0 |      0 | i |    4
  pg_operator_oprname_l_r_k_index | 1092613 |      0 |      0 | i |    7
  pg_proc                         |    1255 |  17231 |  17243 | r |   28
  pg_proc_oid_index               |   17166 |      0 |      0 | i |    5
  pg_proc_proname_narg_type_index |   17169 |      0 |      0 | i |   20


As you may see, *_pkey primary keys are BIG. They are the only ones I
didn't dare to drop and re-create.

Ex.

  products_pkey                   |  258540 |      0 |      0 | i | 1653

where the entire table takes 66 pages, or, worse

  det_prod_dep_consumpt_pkey      |  258764 |      0 |      0 | i | 3286

where the entire table takes 21 pages.
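
To put those page counts into bytes (each page is 8 kB in a default build),
the mismatch is striking:

  products                     66 pages * 8 kB =  ~528 kB  (table)
  products_pkey              1653 pages * 8 kB = ~12.9 MB  (index)
  det_prod_dep_consumpt        21 pages * 8 kB =  ~168 kB  (table)
  det_prod_dep_consumpt_pkey 3286 pages * 8 kB = ~25.7 MB  (index)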


>
>>So I dropped all non-primary key indexes (they were auto-created as *_pkey)
>>and recreated them. I did not drop primary keys because I fear it could
>>break something about constraints. Now the database is 47MB and I'm pretty
>>happy with it. If only I could safely drop and recreate the primary keys
>>I'm sure the database would return to its original 26 MB.
>>
>>Now, three simple questions to the pros on this mailing list:
>>- Is there a cleaner way of recreating packed indexes?
>>
>
> Yes, 7.3 will have a reindexdb script.
>
>
>>- May I drop and recreate primary indexes without breaking rules, triggers,
>>constraints and sequences?
>>
>
> Not easily, no.
>


And "not easily"? I may make backups, go standalone, and do whatever
evil you may think ;)
BTW I have the scripts to re-create indexes or constraints and know how
to hack them, since I'm the programmer in charge for all (sql, programs,
db administration) for that company.


>
>>- We are evaluating PostgreSQL for a mission critical application, currently
>>managed by an Informix IDS 2000 server on a Sun Ultra 420 with Solaris 8
>>with raid disks and SAN for storage. Currently there are about 1000 concurrent
>>users (each with one to three ODBC sessions), with a CPU load of about 15%.
>>There are 30 databases: some have only some thousand records, while others
>>have a thousand tables with some of them totalling 10M+ records, with need
>>of sub-selects, joins and transactions (no constraints for now). Is PostgreSQL
>>able to manage *RELIABLY* and 24/7/365 this amount of data? Does the newest
>>
>
> You need 7.2.2, which was designed for 24 hour operation.
>
>
>>version come with some handy tool like the Informix Onstat (it reports users
>>sessions and executed queries given a connection id)? Does it use all the
>>
>
> There is pgmonitor on gborg.postgresql.org that displays active sessions
> and queries.
>
>
>>available processors or only one? How may we work around the periodic need
>>of a VACUUM on such big tables (and they are vastly modified every day),
>>without disrupting the server availability?
>>
>
> In 7.2.2, VACUUM doesn't lock tables so you can vacuum more easily.
>
>

Now that Informix (the current, very stable and working solution that
easily meets the above specifications) is somehow "gone", we are
undecided between IBM's DB2 (but we don't need something so
sophisticated for our huge but simple data) and PostgreSQL.

I spent some time reading the list archives, and a commonly reported
issue is PostgreSQL being unable to reclaim unused space. This is of the most
concern to us, since we simply cannot afford the database becoming
unavailable (for example because it grew until it filled the disks).
Furthermore, in this other company there are some tech guys, but they
are only experienced with AS/400 and general Solaris and cannot cope with
"postgres-typical" emergencies, like the need to perform full VACUUMs
or, worse, a dump / reload (and due to the sheer amount of data coming
from everywhere in the world there are many rows containing random
binary garbage, which I fear makes reloads problematic at best).
There IS a DB administrator who could take care of some of the problems, but she
is in love with Informix, hates everything else (especially open
source projects!) and won't help.

It would be acceptable, instead, to have a database that is never
100% optimized (i.e. it has "gaps" because non-full vacuums
add roughly another, almost FIXED, 30%), but that NEVER, NEVER grows out
of control. We could prepare a DB maintenance plan that includes a full
vacuum every 6 months.


Thanks to you (and the other list lurkers) for the prompt responses; I have
never had feedback this good from (highly) paid support.



My best regards,
Dario Fumagalli.


Re: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Martijn van Oosterhout
Date:
On Tue, Sep 03, 2002 at 12:39:01PM +0200, Dario Fumagalli wrote:
> Step by step procedure:
> 1) I stopped the postmaster and made sure no process was still on (ps aux)
> 2) As postgres user I started postgres -O -P -D path etc.
> 3) A prompt appeared. Here I typed reindex proj_store force (I'm trying
> to remember it since I'm not in the company the server is in).
> 4) Messages appeared stating indexes were being rebuilt (no errors shown).
> 5) I hit CTRL-D. Postgres exited.
> 6) I restarted the postmaster.

Should work but seems excessive.

>   pg_operator_oprname_l_r_k_index | 1092613 |      0 |      0 | i |    7
>   pg_proc                         |    1255 |  17231 |  17243 | r |   28
>   pg_proc_oid_index               |   17166 |      0 |      0 | i |    5
>   pg_proc_proname_narg_type_index |   17169 |      0 |      0 | i |   20

Your list seems truncated?

> As you may see, *_pkey primary keys are BIG. They are the only ones I
> didn't dare to drop and re-create.
>
> Ex.
>
>   products_pkey                   |  258540 |      0 |      0 | i | 1653
>
> where the entire table takes 66 pages, or, worse
>
>   det_prod_dep_consumpt_pkey      |  258764 |      0 |      0 | i | 3286
>
> where the entire table takes 21 pages.

What happens if you say "REINDEX INDEX products_pkey" or "REINDEX INDEX
det_prod_dep_consumpt_pkey". Do those numbers change?

What is the output of "VACUUM VERBOSE ANALYSE products".

> And "not easily"? I may make backups, go standalone, and do whatever
> evil you may think ;)
> BTW I have the scripts to re-create indexes or constraints and know how
> to hack them, since I'm the programmer in charge for all (sql, programs,
> db administration) for that company.

REINDEX should do it. Please provide the output of the vacuum command.

> It would be acceptable, instead, to have a database that is never
> optimized at 100% (i.e. it has "gaps" because of non full vacuums that
> add for about another almost FIXED 30%), but that NEVER, NEVER grows out
> of control. We may prepare a db maintenance plan that includes a full
> vacuum every 6 months.

Firstly, as of 7.2 vacuum doesn't lock any tables, so you can run it as often
as you like. And why only every six months? Just do it daily. If your tables
are so small, it should take seconds to vacuum the whole database. The
default Debian setup runs vacuum daily, as is recommended in the docs.
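
If you're not on Debian, a cron entry for the postgres user does the same
job; something like this (a sketch; adjust the schedule and options to taste):

# /etc/cron.d/postgresql-vacuum: nightly VACUUM ANALYZE of every database
0 4 * * *   postgres   vacuumdb --all --analyze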

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Dario Fumagalli
Date:
Martijn van Oosterhout wrote:

> On Tue, Sep 03, 2002 at 12:39:01PM +0200, Dario Fumagalli wrote:
>
>>Step by step procedure:
>>1) I stopped the postmaster and made sure no process was still on (ps aux)
>>2) As postgres user I started postgres -O -P -D path etc.
>>3) A prompt appeared. Here I typed reindex proj_store force (I'm trying
>>to remember it since I'm not in the company the server is in).
>>4) Messages appeared stating indexes were being rebuilt (no errors shown).
>>5) I hit CTRL-D. Postgres exited.
>>6) I restarted the postmaster.
>>
>
> Should work but seems excessive.
>
>
>>  pg_operator_oprname_l_r_k_index | 1092613 |      0 |      0 | i |    7
>>  pg_proc                         |    1255 |  17231 |  17243 | r |   28
>>  pg_proc_oid_index               |   17166 |      0 |      0 | i |    5
>>  pg_proc_proname_narg_type_index |   17169 |      0 |      0 | i |   20
>>
>
> Your list seems truncated?
>


True (I think due to some clipboard trouble with telnet). Here is
the remaining part:

  pg_relcheck                     |    1216 |  17246 |  17258 | r |    0
  pg_relcheck_rcrelid_index       | 1092625 |      0 |      0 | i |    1
  pg_rewrite                      |   17058 |  17261 |  17273 | r |    1
  pg_rewrite_oid_index            | 1092626 |      0 |      0 | i |    2
  pg_rewrite_rulename_index       | 1092627 |      0 |      0 | i |    2
  pg_rules                        |   17309 |      0 |      0 | v |   10
  pg_shadow                       |    1260 |      0 |      0 | r |    1
  pg_statistic                    |   16600 |  17276 |  17288 | r |    4
  pg_statistic_relid_att_index    | 1092628 |      0 |      0 | i |    4
  pg_tables                       |   17335 |      0 |      0 | v |   10
  pg_toast_1215                   |   17201 |      0 |      0 | t |    0
  pg_toast_1215_idx               |   17213 |      0 |      0 | i |    1
  pg_toast_1216                   |   17246 |      0 |      0 | t |    0
  pg_toast_1216_idx               |   17258 |      0 |      0 | i |    1
  pg_toast_1255                   |   17231 |      0 |      0 | t |    0
  pg_toast_1255_idx               |   17243 |      0 |      0 | i |    1
  pg_toast_16600                  |   17276 |      0 |      0 | t |    1
  pg_toast_16600_idx              |   17288 |      0 |      0 | i |    2
  pg_toast_17058                  |   17261 |      0 |      0 | t |    0
  pg_toast_17058_idx              |   17273 |      0 |      0 | i |    1
  pg_toast_17086                  |   17216 |      0 |      0 | t |    0
  pg_toast_17086_idx              |   17228 |      0 |      0 | i |    1
  pg_toast_258417                 |  258525 |      0 |      0 | t |    0
  pg_toast_258417_idx             |  258537 |      0 |      0 | i |    1
  pg_trigger                      |    1219 |      0 |      0 | r |    1
  pg_trigger_tgconstrname_index   | 1092606 |      0 |      0 | i |    2
  pg_trigger_tgconstrrelid_index  | 1092607 |      0 |      0 | i |    2
  pg_trigger_tgrelid_index        | 1092608 |      0 |      0 | i |    2
  pg_type                         |    1247 |      0 |      0 | r |    3
  pg_type_oid_index               |   17193 |      0 |      0 | i |    2
  pg_type_typname_index           |   17196 |      0 |      0 | i |    2
  pg_user                         |   17291 |      0 |      0 | v |   10
  pg_variable                     |    1264 |      0 |      0 | s |    0
  pg_views                        |   17322 |      0 |      0 | v |   10
  pg_xactlock                     |       0 |      0 |      0 | s |    0
  products                        |  258417 | 258525 | 258537 | r |   66
  products_pkey                   |  258540 |      0 |      0 | i | 1653
  products_seq                    |  258398 |      0 |      0 | S |   10
  video_connectors                |  258277 |      0 |      0 | r |    0
  video_connectors_pkey           |  258290 |      0 |      0 | i |    1
  video_connectors_seq            |  258258 |      0 |      0 | S |   10
  video_resolutions               |  258204 |      0 |      0 | r |    1
  video_resolutions_pkey          |  258220 |      0 |      0 | i |    2
  video_resolutions_seq           |  258185 |      0 |      0 | S |   10
  video_standards                 |  258242 |      0 |      0 | r |    0
  video_standards_pkey            |  258255 |      0 |      0 | i |    1
  video_standards_seq             |  258223 |      0 |      0 | S |   10


>
>>As you may see, *_pkey primary keys are BIG. They are the only ones I
>>didn't dare to drop and re-create.
>>
>>Ex.
>>
>>  products_pkey                   |  258540 |      0 |      0 | i | 1653
>>
>>where the entire table takes 66 pages, or, worse
>>
>>  det_prod_dep_consumpt_pkey      |  258764 |      0 |      0 | i | 3286
>>
>>where the entire table takes 21 pages.
>>
>
> What happens if you say "REINDEX INDEX products_pkey" or "REINDEX INDEX
> det_prod_dep_consumpt_pkey". Do those numbers change?
>


Yes, it did it!

proj_store=# REINDEX INDEX products_pkey;
REINDEX
proj_store=# REINDEX INDEX det_prod_dep_consumpt_pkey
REINDEX
proj_store=# select relname, relfilenode as node, reltoastrelid as relid,
proj_store=# reltoastidxid as idxid, relkind as k, relpages as pag from
pg_class order by relname;

[Snip]
             relname             |  node   | relid  | idxid  | k | pag
--------------------------------+---------+--------+--------+---+------
det_prod_dep_consumpt_pkey      | 1224634 |      0 |      0 | i |    6
[Snip]
products_pkey                   | 1224633 |      0 |      0 | i |    5

BUT... there is a but...

The du -h command says that, despite the reported index reduction, the
overall database size has increased to 105 MB (?!).

And this raises a question: why did a reindex proj_store force in single
user mode not pack the indexes, while from psql / the multiuser
postmaster it did? Bah!


> What is the output of "VACUUM VERBOSE ANALYSE products".
>


proj_store=# VACUUM VERBOSE ANALYSE products;
NOTICE:  --Relation products--
NOTICE:  Pages 6142: Changed 66, reaped 6076, Empty 0, New 0; Tup 976: Vac 90768, Keep/VTL 0/0, Crash 0, UnUsed 3, MinLen 510, MaxLen 622; Re-using: Free/Avail. Space 49289988/49289988; EndEmpty/Avail. Pages 0/6076. CPU 0.25s/0.05u sec.
NOTICE:  Index products_pkey: Pages 5; Tuples 976: Deleted 0. CPU 0.01s/0.00u sec.
NOTICE:  Index idx_products_csc: Pages 832; Tuples 976: Deleted 90768. CPU 0.59s/1.12u sec.
NOTICE:  Index idx_products_b: Pages 729; Tuples 976: Deleted 90768. CPU 0.38s/0.93u sec.
NOTICE:  Index idx_products_b2: Pages 275; Tuples 976: Deleted 90768. CPU 0.29s/0.94u sec.
NOTICE:  Index idx_products_v: Pages 265; Tuples 976: Deleted 90768. CPU 0.20s/0.95u sec.
NOTICE:  Rel products: Pages: 6142 --> 66; Tuple(s) moved: 976. CPU 5.05s/1.67u sec.
NOTICE:  Index products_pkey: Pages 7; Tuples 976: Deleted 976. CPU 0.00s/0.02u sec.
NOTICE:  Index idx_products_csc: Pages 832; Tuples 976: Deleted 976. CPU 0.55s/0.23u sec.
NOTICE:  Index idx_products_b: Pages 730; Tuples 976: Deleted 976. CPU 0.16s/0.02u sec.
NOTICE:  Index idx_products_b2: Pages 278; Tuples 976: Deleted 976. CPU 0.06s/0.01u sec.
NOTICE:  Index idx_products_v: Pages 267; Tuples 976: Deleted 976. CPU 0.05s/0.00u sec.
NOTICE:  --Relation pg_toast_258417--
NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE:  Index pg_toast_258417_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
NOTICE:  Analyzing...
VACUUM
proj_store=#


>
>>And "not easily"? I may make backups, go standalone, and do whatever
>>evil you may think ;)
>>BTW I have the scripts to re-create indexes or constraints and know how
>>to hack them, since I'm the programmer in charge for all (sql, programs,
>>db administration) for that company.
>>
>
> REINDEX should do it. Please provide the output of the vacuum command.


I finally made a VACUUM VERBOSE ANALYZE.

It did things (lots of deletions and some moves) on the affected tables.
But now, if I reissue the pg_class query, I get figures like the following:
              relname             |  node   | relid  | idxid  | k | pag
---------------------------------+---------+--------+--------+---+------
  idx_det_prod_comp_vid_connsp    | 1094065 |      0 |      0 | i |   1
  idx_det_prod_dep_consumptdp     | 1094068 |      0 |      0 | i | 479
  idx_det_prod_vid_connsp         | 1094059 |      0 |      0 | i |   1
  idx_det_prod_vid_resp           | 1094053 |      0 |      0 | i |   1
  idx_det_prod_vid_stdsp          | 1094056 |      0 |      0 | i |   1
  idx_det_turnover_c              | 1094095 |      0 |      0 | i |   4
  idx_det_turnover_t              | 1094098 |      0 |      0 | i |   4
  idx_products_b                  | 1094044 |      0 |      0 | i | 733
  idx_products_b2                 | 1094047 |      0 |      0 | i | 281
  idx_products_csc                | 1094041 |      0 |      0 | i | 832
  idx_products_v                  | 1094050 |      0 |      0 | i | 270

So, new big index files have been born!
After this vacuum, du -h reports:

[postgres@web base]$ du -h
1.6M    ./1
1.5M    ./18719
24M     ./242014
11M     ./46821
1.7M    ./197097
2.3M    ./279236
43M

A lower disk usage than after the reindex above (still more than the
expected 26MB). The development machine (with 3 weeks old outdated data,
unfortunately, but still with the same record figures (+-5%)):

[postgres@web base]$ du -h
1.6M    ./1
1.5M    ./18719
6.3M    ./250600
11M     ./46821
1.7M    ./197097
2.2M    ./259865
25M


>
>>It would be acceptable, instead, to have a database that is never
>>optimized at 100% (i.e. it has "gaps" because of non full vacuums that
>>add for about another almost FIXED 30%), but that NEVER, NEVER grows out
>>of control. We may prepare a db maintenance plan that includes a full
>>vacuum every 6 months.
>>
>
> Firstly, for 7.2 vacuum doesn't lock any tables, so you can run it as often
> as you like. And why only every six months? Just do it daily. If your tables
> are so small, it should take seconds to vacuum to whole database. The


This question is about a different database server for another (very
big) company. There the records are not in the hundreds; they start at 10 million.
And a vacuum I fear will last for more than a few seconds.


> default debian setup runs vacuum daily, as is recommended in the docs.


I hoped to be able to use Debian. I had to mediate between open source
(Debian, preferred by us developers) and a closed, commercial
solution (loved by management... sigh). The compromise was Caldera... a
Linux solution, but with some commercial bells and whistles on it to make
everyone happy. But I don't like it as much as Debian for development purposes.
It is too Red-Hattish :) and comes with ancient tools and libraries.


>
> Hope this helps,
>

IT HELPED A LOT!!!
In fact using your suggestion and some bricolage :) I solved all the
problems!

Steps to reproduce it:

- Given that REINDEX [database name] FORCE in single-user mode
did rebuild the indexes but did not optimize them;
- Given that a REINDEX INDEX [index name] did optimize and shrink a
single index but increased the overall data size;
- Given that a VACUUM VERBOSE ANALYZE somehow alters the overall
database file size (it seems to redistribute the file space evenly across
indexes);
- Given that I was at a loss and ready to do anything that made logical
sense (or not);

1) I reindexed all excessively big indexes one by one in psql:

REINDEX INDEX dep_names_pkey;
REINDEX INDEX det_prod_dep_consumpt_pkey;
REINDEX INDEX idx_det_prod_dep_consumptdp;
REINDEX INDEX idx_products_b;
REINDEX INDEX idx_products_b2;
REINDEX INDEX idx_products_csc;
REINDEX INDEX idx_products_v;

So there were no surplus pages anywhere.

2) I performed a VACUUM VERBOSE ANALYZE, which, having nothing to
redistribute, simply "ate" all the wasted space.

I know this is probably the weirdest and most uninformed / untechnical
"reasoning" you have seen on this list in years, but it has one small,
almost unnoticeable advantage: IT WORKED, AND NOW THE DATABASE IS ITS
ORIGINAL SIZE AGAIN (even less).

Output of the du -h command of the production machine:

[postgres@web base]$ du -h
1.6M    ./1
1.5M    ./18719
4.0M    ./242014
11M     ./46821
1.7M    ./197097
2.3M    ./279236
22M

The machine is open to users again and is as fast as it was, once again
making this LAPP (Linux + Apache + PostgreSQL + PHP 4) intranet / extranet
server the envy of the other bigger but bloated JSP / ASP / commercial
solutions that sit near it.

Thanks again all for the excellent responses. They really helped a lot.

My best regards,
Dario Fumagalli


Re: Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Martijn van Oosterhout
Date:
On Tue, Sep 03, 2002 at 04:08:18PM +0200, Dario Fumagalli wrote:
> Martijn van Oosterhout wrote:
> > What happens if you say "REINDEX INDEX products_pkey" or "REINDEX INDEX
> > det_prod_dep_consumpt_pkey". Do those numbers change?
> >
>
>
> Yes, it did it!
>
> proj_store=# REINDEX INDEX products_pkey;
> REINDEX
> proj_store=# REINDEX INDEX det_prod_dep_consumpt_pkey
> REINDEX
> proj_store=# select relname, relfilenode as node, reltoastrelid as relid,
> proj_store=# reltoastidxid as idxid, relkind as k, relpages as pag from
> pg_class order by relname;
>
> [Snip]
>              relname             |  node   | relid  | idxid  | k | pag
> --------------------------------+---------+--------+--------+---+------
> det_prod_dep_consumpt_pkey      | 1224634 |      0 |      0 | i |    6
> [Snip]
> products_pkey                   | 1224633 |      0 |      0 | i |    5
>
> BUT... there is a but...
>
> The du -h command says that, despite the reported index reduction, the
> overall database size has increased to 105 MB (?!).
>
> And this raises a question: why did a reindex proj_store force in single
> user mode not pack the indexes, while from psql / the multiuser
> postmaster it did? Bah!


It sounds like there is actually an awful lot of activity going on.

> NOTICE:  Index idx_products_csc: Pages 832; Tuples 976: Deleted 90768. CPU 0.59s/1.12u sec.

There you go. 90000 deleted tuples, 900 active. When was the last time you
did a vacuum? Basically, you need to do a vacuum whenever you want to reuse
space. Especially prior to 7.2.

> NOTICE:  --Relation pg_toast_258417--
> NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
> NOTICE:  Index pg_toast_258417_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.

Toast table/index is empty, so it's not that.

> I finally made a VACUUM VERBOSE ANALYZE.
>
> It did things (lots of deletions and some moves) on the affected tables.
> But now, if I reissue the pg_class query, I get figures like the following:
>               relname             |  node   | relid  | idxid  | k | pag
> ---------------------------------+---------+--------+--------+---+------
>   idx_det_prod_comp_vid_connsp    | 1094065 |      0 |      0 | i |   1
>   idx_det_prod_dep_consumptdp     | 1094068 |      0 |      0 | i | 479
>   idx_det_prod_vid_connsp         | 1094059 |      0 |      0 | i |   1
>   idx_det_prod_vid_resp           | 1094053 |      0 |      0 | i |   1
>   idx_det_prod_vid_stdsp          | 1094056 |      0 |      0 | i |   1
>   idx_det_turnover_c              | 1094095 |      0 |      0 | i |   4
>   idx_det_turnover_t              | 1094098 |      0 |      0 | i |   4
>   idx_products_b                  | 1094044 |      0 |      0 | i | 733
>   idx_products_b2                 | 1094047 |      0 |      0 | i | 281
>   idx_products_csc                | 1094041 |      0 |      0 | i | 832
>   idx_products_v                  | 1094050 |      0 |      0 | i | 270
>
> So, new big index files have been born!
> After this vacuum, du -h reports:
>
> [postgres@web base]$ du -h
> 1.6M    ./1
> 1.5M    ./18719
> 24M     ./242014
> 11M     ./46821
> 1.7M    ./197097
> 2.3M    ./279236
> 43M
>
> A lower disk usage than after the reindex above (still more than the
> expected 26MB). The development machine (with 3 weeks old outdated data,
> unfortunately, but still with the same record figures (+-5%)):

Your disk space seems to fluctuate an awful lot. Are you doing any queries
that UPDATE rows, even if the values are not changing?

> This question is about a different database server for another (very
> big) company. Here records are not hundreds, they are 10 millions up.
> And a vacuum I fear will last for more than a few seconds.

Quite true. Hence the non-blocking vacuum in 7.2. Similarly, schedule a
VACUUM at 4am and no-one will notice :)

> I hoped to be able to use a Debian. I had to mediate between Open Source
> (Debian, the preferred for us the developers) and a closed, commercial
> solution (loved by management... sigh). The solution was a Caldera... a
> Linux solution but with some commercial bells and whistles on it to make
> everyone happy. But I don't like it as Debian for programming purposes.
> It is too Red-Hattish :) and comes with ancient tools and libraries.

Hey, if it works, it works, right :)

> - Given that REINDEX [database name] FORCE in single-user mode
> did rebuild the indexes but did not optimize them;
> - Given that a REINDEX INDEX [index name] did optimize and shrink a
> single index but increased the overall data size;
> - Given that a VACUUM VERBOSE ANALYZE somehow alters the overall
> database files size (seems to re-distribute the file space evenly across
> indexes);
> - Given that I was at a loss and ready to do anything with a logical
> sense (or not);

Hmm. BTW, I'd never heard of REINDEX DATABASE before yesterday, so I'm not
quite clear on what it does. REINDEX INDEX has always worked for me.

> 1) I reindexed all excessively big indexes one by one in psql:
>
> 2) I performed a VACUUM VERBOSE ANALYZE, that having nothing to
> re-distribute, simply "ate" all the wasted space.
>
> I know this is probably the weirdest and uninformed / untechnical
> "reasoning" you saw in this list from years, but it has one small,
> almost unnoticeable advantage: IT WORKED AND NOW THE DATABASE IS ITS
> ORIGINAL SIZE AGAIN (even less).

Actually, that seems to me to be the way to squeeze the most space out of
the DB. It should work very reliably.

Anyway, you should be up and running now. Still odd though.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Re: Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Tom Lane
Date:
Dario Fumagalli <dfumagalli@tin.it> writes:
> - Given that REINDEX [database name] FORCE in single-user mode
> did rebuild the indexes but did not optimize them;
> - Given that a REINDEX INDEX [index name] did optimize and shrink a
> single index but increased the overall data size;
> - Given that a VACUUM VERBOSE ANALYZE somehow alters the overall
> database files size (seems to re-distribute the file space evenly across
> indexes);

There is no such thing as a "reindex that optimizes" versus "reindex
that doesn't optimize".  A rebuild is a rebuild.  I suspect that you
were fooled by out-of-date relpages statistics in pg_class, and/or
expecting REINDEX to rebuild indexes it won't.

VACUUM updates relpages to reflect current reality, but I'm not sure
whether REINDEX does.
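
A quick way to see which it is: check pg_class right after the REINDEX and
again right after a VACUUM of the table, for instance (using one of the
indexes mentioned earlier in this thread)

    SELECT relname, relpages FROM pg_class WHERE relname = 'products_pkey';

If the reported page count only changes after the VACUUM, it was the
statistic that was stale, not the index.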

Also, REINDEX at the database level only reindexes system-table indexes.

            regards, tom lane

Re: Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Dario Fumagalli
Date:

Martijn van Oosterhout wrote:

> On Tue, Sep 03, 2002 at 04:08:18PM +0200, Dario Fumagalli wrote:
>
>>Martijn van Oosterhout wrote:
>>
>>>What happens if you say "REINDEX INDEX products_pkey" or "REINDEX INDEX
>>>det_prod_dep_consumpt_pkey". Do those numbers change?
>>>
>>>
>>
>>Yes, it did it!
>>
>>proj_store=# REINDEX INDEX products_pkey;
>>REINDEX
>>proj_store=# REINDEX INDEX det_prod_dep_consumpt_pkey
>>REINDEX
>>proj_store=# select relname, relfilenode as node, reltoastrelid as relid,
>>proj_store=# reltoastidxid as idxid, relkind as k, relpages as pag from
>>pg_class order by relname;
>>
>>[Snip]
>>             relname             |  node   | relid  | idxid  | k | pag
>>--------------------------------+---------+--------+--------+---+------
>>det_prod_dep_consumpt_pkey      | 1224634 |      0 |      0 | i |    6
>>[Snip]
>>products_pkey                   | 1224633 |      0 |      0 | i |    5
>>
>>BUT... there is a but...
>>
>>The du -h command says that, despite the reported index reduction, the
>>overall database size has increased to 105 MB (?!).
>>
>>And this raises a question: why did a reindex proj_store force in single
>>user mode not pack the indexes, while from psql / the multiuser
>>postmaster it did? Bah!
>>
>
>
> It sounds like there is actually an awful lot of activity going on.
>
>
>>NOTICE:  Index idx_products_csc: Pages 832; Tuples 976: Deleted 90768. CPU 0.59s/1.12u sec.
>>
>
> There you go. 90000 deleted tuples, 900 active. When was the last time you
> did a vacuum? Basically, you need to do a vacuum whenever you want to reuse
> space. Especially prior to 7.2.
>


As I stated in an earlier post, the last vacuum was about 2 - 3 weeks earlier
(the length of my vacation).


>
>>NOTICE:  --Relation pg_toast_258417--
>>NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
>>NOTICE:  Index pg_toast_258417_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
>>
>
> Toast table/index is empty, so it's not that.
>
>
>>I finally made a VACUUM VERBOSE ANALYZE.
>>
>>It did things (lots of deletions and some moves) on the affected tables.
>>But now, if I reissue the pg_class query, I get figures like the following:
>>              relname             |  node   | relid  | idxid  | k | pag
>>---------------------------------+---------+--------+--------+---+------
>>  idx_det_prod_comp_vid_connsp    | 1094065 |      0 |      0 | i |   1
>>  idx_det_prod_dep_consumptdp     | 1094068 |      0 |      0 | i | 479
>>  idx_det_prod_vid_connsp         | 1094059 |      0 |      0 | i |   1
>>  idx_det_prod_vid_resp           | 1094053 |      0 |      0 | i |   1
>>  idx_det_prod_vid_stdsp          | 1094056 |      0 |      0 | i |   1
>>  idx_det_turnover_c              | 1094095 |      0 |      0 | i |   4
>>  idx_det_turnover_t              | 1094098 |      0 |      0 | i |   4
>>  idx_products_b                  | 1094044 |      0 |      0 | i | 733
>>  idx_products_b2                 | 1094047 |      0 |      0 | i | 281
>>  idx_products_csc                | 1094041 |      0 |      0 | i | 832
>>  idx_products_v                  | 1094050 |      0 |      0 | i | 270
>>
>>So, new big index files have been born!
>>After this vacuum, du -h reports:
>>
>>[postgres@web base]$ du -h
>>1.6M    ./1
>>1.5M    ./18719
>>24M     ./242014
>>11M     ./46821
>>1.7M    ./197097
>>2.3M    ./279236
>>43M
>>
>>A lower disk usage than after the reindex above (still more than the
>>expected 26MB). The development machine (with 3 weeks old outdated data,
>>unfortunately, but still with the same record figures (+-5%)):
>>
>
> Your disk space seems to fluctuate an awful lot. Are you doing any queries
> that UPDATE rows, even if the values are not changing?
>


Yes, in this database the insert / update ratio is 1 : 180, and DB updates
are performed 4 times a day by an external PHP console script.


>
>>This question is about a different database server for another (very
>>big) company. Here records are not hundreds, they are 10 millions up.
>>And a vacuum I fear will last for more than a few seconds.
>>
>
> Quite true. Hence the non-blocking vacuum in 7.2. Similarly, scheduling a
> VACUUM at 4am and no-one will notice :)


It depends... This company intranet is accessible from working units in:
USA
Germany
Italy
UK

So 4am is somewhat relative here...


>
>
>>I hoped to be able to use a Debian. I had to mediate between Open Source
>>(Debian, the preferred for us the developers) and a closed, commercial
>>solution (loved by management... sigh). The solution was a Caldera... a
>>Linux solution but with some commercial bells and whistles on it to make
>>everyone happy. But I don't like it as Debian for programming purposes.
>>It is too Red-Hattish :) and comes with ancient tools and libraries.
>>
>
> Hey, if it works, it works, right :)
>
>
>>- Given that REINDEX [database name] FORCE in single-user mode
>>did rebuild the indexes but did not optimize them;
>>- Given that a REINDEX INDEX [index name] did optimize and shrink a
>>single index but increased the overall data size;
>>- Given that a VACUUM VERBOSE ANALYZE somehow alters the overall
>>database files size (seems to re-distribute the file space evenly across
>>indexes);
>>- Given that I was at a loss and ready to do anything with a logical
>>sense (or not);
>>
>
> Hmm. BTW, I'd never heard of REINDEX DATABASE before yesterday, so I'm not
> quite clear on what it does. REINDEX INDEX has always worked for me.
>


I found it in the official HTML docs. It is a PostgreSQL-specific SQL command.
Quoting from the original page at

http://www.postgresql.org/idocs/index.php?sql-reindex.html

Description
REINDEX is used to rebuild corrupted indexes. Although in theory this
should never be necessary, in practice indexes may become corrupted due
to software bugs or hardware failures. REINDEX provides a recovery method.

If you suspect corruption of an index on a user table, you can simply
rebuild that index, or all indexes on the table, using REINDEX INDEX or
REINDEX TABLE.

Note: Another approach to dealing with a corrupted user-table index is
just to drop and recreate it. This may in fact be preferable if you
would like to maintain some semblance of normal operation on the table
meanwhile. REINDEX acquires exclusive lock on the table, while CREATE
INDEX only locks out writes not reads of the table.

Things are more difficult if you need to recover from corruption of an
index on a system table. In this case it's important for the backend
doing the recovery to not have used any of the suspect indexes itself.
(Indeed, in this sort of scenario you may find that backends are
crashing immediately at startup, due to reliance on the corrupted
indexes.) To recover safely, the postmaster must be shut down and a
stand-alone PostgreSQL backend must be started instead, giving it the
command-line options -O and -P (these options allow system table
modifications and prevent use of system indexes, respectively). Then
issue REINDEX INDEX, REINDEX TABLE, or REINDEX DATABASE depending on how
much you want to reconstruct. If in doubt, use REINDEX DATABASE FORCE to
force reconstruction of all system indexes in the database. Then quit
the standalone backend and restart the postmaster.

Since this is likely the only situation when most people will ever use a
standalone backend, some usage notes might be in order:



Start the backend with a command like

postgres -D $PGDATA -O -P my_database
Provide the correct path to the database area with -D, or make sure that
the environment variable PGDATA is set. Also specify the name of the
particular database you want to work in.

You can issue any SQL command, not only REINDEX.

Be aware that the standalone backend treats newline as the command entry
terminator; there is no intelligence about semicolons, as there is in
psql. To continue a command across multiple lines, you must type
backslash just before each newline except the last one. Also, you won't
have any of the conveniences of readline processing (no command history,
for example).

To quit the backend, type EOF (control-D, usually).

See the postgres reference page for more information.

Usage
Recreate the indexes on the table mytable:

      REINDEX TABLE mytable;


Rebuild a single index:

     REINDEX INDEX my_index;


Rebuild all system indexes (this will only work in a standalone backend):

     REINDEX DATABASE my_database FORCE;

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

And yes, only now do I see that it says the command will "Rebuild all
system indexes" and not ALL indexes... GRR..


>
>>1) I reindexed all excessively big indexes one by one in psql:
>>
>>2) I performed a VACUUM VERBOSE ANALYZE, that having nothing to
>>re-distribute, simply "ate" all the wasted space.
>>
>>I know this is probably the weirdest and uninformed / untechnical
>>"reasoning" you saw in this list from years, but it has one small,
>>almost unnoticeable advantage: IT WORKED AND NOW THE DATABASE IS ITS
>>ORIGINAL SIZE AGAIN (even less).
>>
>
> Actually, that seems to me to be the way to squeeze the most space out of
> the DB. It should work very reliably.
>
> Anyway, you should be up and running now. Still odd though.
>

Best regards,
Dario Fumagalli


Re: Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Martijn van Oosterhout
Date:
On Tue, Sep 03, 2002 at 05:40:25PM +0200, Dario Fumagalli wrote:
> As I stated in an earlier post, the last vacuum was about 2 - 3 weeks earlier
> (the length of my vacation).

Seriously, set up a vacuum to run daily. The time wasted by scanning overly
large tables and using inexact statistics far exceeds a once-per-day delay
caused by a vacuum.

> Yes, in this database the insert / update ratio is 1 : 180, and DB updates
> are performed 4 times a day by an external PHP console script.

Well, do a vacuum after each update for maximum performance.
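
Since the updates already come from a PHP console script, the easiest thing
is probably to tack the vacuum onto the end of that job, e.g. (database name
as in your earlier posts):

psql -d proj_store -c 'VACUUM ANALYZE;'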

I actually had a thought last night. REINDEX recreates the index, probably
with *all* rows in the table, whether or not they are still active. Thus, to
get maximum effect from REINDEX, you need to VACUUM first.

Maybe someone with more knowledge than me can confirm this.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Re: Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Alvaro Herrera
Date:
On Wed, 4 Sep 2002 11:55:06 +1000,
Martijn van Oosterhout <kleptog@svana.org> wrote:

> On Tue, Sep 03, 2002 at 05:40:25PM +0200, Dario Fumagalli wrote:
> > As I stated in an earlier post, the last vacuum was about 2 - 3 weeks earlier
> > (the length of my vacation).
>
> Seriously, setup a vacuum to run daily. The time wasted by scanning overly
> large tables and using inexact statistics far exceeds a once-per-day delay
> caused by a vacuum.

I say run a light VACUUM (the standard VACUUM in 7.2) several times a day;
it doesn't lock tables, so you don't have any downtime.  If you reach a
steady state (i.e. tables do not grow), you don't even need VACUUM FULL,
but it's probably sane to do it every so often.

Remember that you can run ANALYZE separately from VACUUM.  It's important
to ANALYZE when the statistics of your tables change, so the optimizer can
choose good plans.
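
In other words, a schedule along these lines (only an illustration; tune
the frequencies to your load):

VACUUM;        -- several times a day: reclaims space, non-blocking on 7.2
ANALYZE;       -- whenever the data distribution has changed noticeably
VACUUM FULL;   -- rarely, in a maintenance window, to compact the files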


> I actually had a thought last night. REINDEX recreates the index, probably
> with *all* rows in the table, whether or not they are still active. Thus, to
> get maximum effect from REINDEX, you need to VACUUM first.

No.  The reindexing is an exclusive operation -- that is, no other
transaction can be using the index at the same time.  For that reason,
there's only one version of each tuple that is valid, and only that
version is indexed.

--
Alvaro Herrera (<alvherre[a]atentus.com>)
"El destino baraja y nosotros jugamos" (A. Schopenhauer)

Re: Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> I actually had a thought last night. REINDEX recreates the index, probably
> with *all* rows in the table, whether or not they are still active. Thus, to
> get maximum effect from REINDEX, you need to VACUUM first.

No.  REINDEX boils down to IndexBuildHeapScan(), which uses
HeapTupleSatisfiesVacuum() to decide which tuples are worth indexing;
the tuples that get indexed are *exactly* the same ones that would
survive VACUUM.

In recent releases, anyway; before 7.2 things might have been flakier.

            regards, tom lane