Massive table (500M rows) update nightmare

Поиск
Список
Период
Сортировка
От Carlo Stonebanks
Тема Massive table (500M rows) update nightmare
Дата
Msg-id hi41os$eiv$1@news.hub.org
обсуждение исходный текст
Ответы Re: Massive table (500M rows) update nightmare  (Scott Marlowe <scott.marlowe@gmail.com>)
Re: Massive table (500M rows) update nightmare  (Leo Mannhart <leo.mannhart@beecom.ch>)
Re: Massive table (500M rows) update nightmare  (marcin mank <marcin.mank@gmail.com>)
Re: Massive table (500M rows) update nightmare  (Eduardo Morras <emorras@s21sec.com>)
Список pgsql-performance
Our DB has an audit table which is 500M rows and growing. (FYI the objects
being audited are grouped semantically, not individual field values).

Recently we wanted to add a new feature and we altered the table to add a
new column. We are backfilling this varchar(255) column by writing a TCL
script to page through the rows (where every update is a UPDATE ... WHERE id
 >= x AND id < x+10 and a commit is performed after every 1000 updates
statement, i.e. every 10000 rows.)

We have 10 columns, six of which are indexed. Rough calculations suggest
that this will take two to three weeks to complete on an 8-core CPU with
more than enough memory.

As a ballpark estimate - is this sort of performance for an 500M updates
what one would expect of PG given the table structure (detailed below) or
should I dig deeper to look for performance issues?

As always, thanks!

Carlo

Table/index structure:

CREATE TABLE mdx_core.audit_impt
(
  audit_impt_id serial NOT NULL,
  impt_session integer,
  impt_version character varying(255),
  impt_name character varying(255),
  impt_id integer,
  target_table character varying(255),
  target_id integer,
  target_op character varying(10),
  note text,
  source_table character varying(255),
  CONSTRAINT audit_impt_pkey PRIMARY KEY (audit_impt_id)
)

CREATE INDEX audit_impt_impt_id_idx
  ON mdx_core.audit_impt
  USING btree
  (impt_id);
CREATE INDEX audit_impt_impt_name
  ON mdx_core.audit_impt
  USING btree
  (impt_name, impt_version);
CREATE INDEX audit_impt_session_idx
  ON mdx_core.audit_impt
  USING btree
  (impt_session);
CREATE INDEX audit_impt_source_table
  ON mdx_core.audit_impt
  USING btree
  (source_table);
CREATE INDEX audit_impt_target_id_idx
  ON mdx_core.audit_impt
  USING btree
  (target_id, audit_impt_id);
CREATE INDEX audit_impt_target_table_idx
  ON mdx_core.audit_impt
  USING btree
  (target_table, target_id, audit_impt_id);




В списке pgsql-performance по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: pg_connect takes 3.0 seconds
Следующее
От: Michael Ruf
Дата:
Сообщение: Re: Optimizer use of index slows down query by factor