Index Corruption

Поиск
Список
Период
Сортировка
От Dylan Adams
Тема Index Corruption
Дата
Msg-id CA+Rt69Yb6W2fXRYb2P57LWSVS7wf8gGxedunNLChqp2api9pow@mail.gmail.com
обсуждение исходный текст
Ответы Re: Index Corruption
Re: Index Corruption
Re: Index Corruption
Список pgsql-general
We moved to PostgreSQL about 2 years ago and have been very happy with it
overall. The only major issue that we've had is intermittent index corruption.
This manifests itself as either "duplicate key value violates unique constraint"
or "could not read block 37422 of relation 1663/18663/19063: read only 0 of 8192
bytes." REINDEXing the table solves the problem. We do sometimes have bugs that
cause unique index violations, so the first error is especially annoying.

We've encountered the problem with both JDBC and libpq clients. The problem has
persisted through upgrades to our database servers (from 32bit CentOS 5.3 with
PostgreSQL 8.3.9 to 64bit CentOS 5.6 with PostgreSQL 8.4.8, all with stock
kernels).

Our database servers are fully virtualized, running under VMware Server on Dell
PowerEdge Servers. We use battery backed raid controllers (PERC4/5/6),
configured for RAID 10. We also experienced the problem when we had physical
database servers. Servers all have 2 CPUs.

PostgreSQL is installed from the PGDG RPMs (yum.postgresql.org). Our only
non-default postgresql.conf options are the typical tunables: shared_buffers,
checkpoint_segments, effective_cache_size, default_statistics_target, etc. WAL
settings are defaults (save checkpoint_segments). No clustering, no addons.
We've tuned autovacuum to be more aggressive in an attempt to address some bloat
issues; this didn't seem to have any impact on the frequency of index
corruption.

The databases are unloaded nightly. We do a weekly pg_dumpall as an integrity
check.

We've only seen the problem in our many weekly batch processes. Typically, these
processes DELETE a large subset of data from a table and then repopulate with
the same values in indexed fields. Depending on the dataset, they run from 30
minutes to 8 hours. These processes are scheduled so that only one program
(which is single threaded) is updating any given table at a time.

We have 12 database servers and we usually have about 1 incident per week.
Sometimes we'll go for weeks without any occurrences, and then we'll have a
flurry of them.

My primary question: is this normal? There isn't an overwhelming amount of
messages in the archives about index corruption, which leads me to think that
there's something with our configuration or our processes that is making us more
susceptible.

Is there something we should be doing to make index corruption less likely? Is
there anyway to do an index integrity check so we can be more proactive with
REINDEXing?

Thanks,
dylan

В списке pgsql-general по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: Has Pg 9.1.0 been released today?
Следующее
От: Gavin Flower
Дата:
Сообщение: Re: Has Pg 9.1.0 been released today?