Обсуждение: BUG #5395: UPDATE on shutdown overwrites table

Поиск
Список
Период
Сортировка

BUG #5395: UPDATE on shutdown overwrites table

От
"Lothar Bongartz"
Дата:
The following bug has been logged online:

Bug reference:      5395
Logged by:          Lothar Bongartz
Email address:      lotharbongartz@hotmail.com
PostgreSQL version: 8.4.3
Operating system:   Windows Server 2003 R2
Description:        UPDATE on shutdown overwrites table
Details:

I have ported a community with about one million members from MS SQL to
Postgres.
While the overall performance is comparable, I have noticed, that
conditional bulk writing can be extremely slow in Postgres:
SELECT ... INTO ... WHERE
UPDATE ... FROM ... WHERE
For this reason the database is stalling from time to time and I have to
restart the server. For the second time I have detected, that Postgres
overwrites a table when shutting down. The table <onlineinfo> is only
updated with NOW() and only for a single matching row:
UPDATE onlineinfo SET date_end=NOW() WHERE memb_id=v_id
When this command is executed while Postgres shuts down, all rows contain a
"random" date like '2007-06-25' in the <date_end> column afterwards.
To repair this, I have created a table from a backup:
CREATE TEMP TABLE temp_onlineinfo_bak
(
  memb_id integer NOT NULL,
  date_end timestamp NOT NULL
 )
After filling the temporary table from the backup, I use it to repair the
overwritten table:
UPDATE onlineinfo SET date_end=B.date_end
 FROM onlineinfo O, temp_onlineinfo_bak B
 WHERE O.date_end<'2010-03-21' AND B.memb_id=O.memb_id;
The WHERE condition is needed to prevent overwriting rows, which are updated
in the meantime.

Unfortunately this command needs several hours to complete, since this is
again a conditional bulk writing.

I tried all kind of variations like removing the index on
<onlineinfo><date_end> before the update or setting up an index on
<temp_onlineinfo_bak><memb_id>

Nothing helps. These are the data of the table:
CREATE TABLE onlineinfo
(
  memb_id integer NOT NULL,
  log_count integer NOT NULL,
  date_end timestamp NOT NULL,
  ip integer NOT NULL,
  CONSTRAINT onlineinfo_pkey PRIMARY KEY (memb_id),
  CONSTRAINT fk__onlineinf__memb___55009f39 FOREIGN KEY (memb_id)
      REFERENCES member (memb_id) MATCH FULL
      ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
  OIDS=FALSE
);
ALTER TABLE onlineinfo OWNER TO postgres;

-- Index: onlineinfo_date_end

-- DROP INDEX onlineinfo_date_end;

CREATE INDEX onlineinfo_date_end
  ON onlineinfo
  USING btree
  (date_end);

Re: BUG #5395: UPDATE on shutdown overwrites table

От
Tom Lane
Дата:
"Lothar Bongartz" <lotharbongartz@hotmail.com> writes:
> For this reason the database is stalling from time to time and I have to
> restart the server. For the second time I have detected, that Postgres
> overwrites a table when shutting down. The table <onlineinfo> is only
> updated with NOW() and only for a single matching row:
> UPDATE onlineinfo SET date_end=NOW() WHERE memb_id=v_id
> When this command is executed while Postgres shuts down, all rows contain a
> "random" date like '2007-06-25' in the <date_end> column afterwards.

This is quite hard to believe.  Can you provide a reproducible test
case?

I have seen cases where someone wrote what he thought was a single-row
update, but it turned out to be a whole-table update because the WHERE
clause actually reduced to constant TRUE.  Your reference to "v_id"
makes me think that you are issuing this query inside a plpgsql
function.  One of the common ways to shoot yourself in the foot like
that is to be careless about whether a name could match both a table
column and a plpgsql variable or parameter; could that have happened in
your situation?

BTW, there are easier ways to cancel a single query than restarting the
whole server.

            regards, tom lane

Re: BUG #5395: UPDATE on shutdown overwrites table

От
Craig Ringer
Дата:
On 30/03/2010 11:27 PM, Lothar Bongartz wrote:

> For this reason the database is stalling from time to time and I have to
> restart the server.

Are those stalls accompanied by any mention of checkpoints in the logs?

The default PostgreSQL configuration isn't tuned for larger workloads -
it's very minimalist. You probably need to increase your
checkpoint_segments . It may also be a good idea to look at shared_buffers .

Before just re-starting the server, you should look at `iostat',
`vmstat', `top', `iotop' or other suitable performance monitoring tools
and see if the server's actually DOING anything. You'll probably find
it's busy with disk I/O and not really stalled at all.

--
Craig Ringer