Re: Mass updates on a large table

Поиск
Список
Период
Сортировка
От Scott Marlowe
Тема Re: Mass updates on a large table
Дата
Msg-id dcc563d10708101057l7dd53571w39a333988a12c84@mail.gmail.com
обсуждение исходный текст
Ответ на Mass updates on a large table  ("Mark Steben" <msteben@autorevenue.com>)
Список pgsql-admin
On 8/10/07, Mark Steben <msteben@autorevenue.com> wrote:
>
> Good afternoon,
>
> I am attempting an update on two new date field columns on a 17 million row
> table.  Every row gets updated.
>
> The update statement is a simple one:
>
>   UPDATE EMAILRCPTS SET ID = ID
>
>  And the update of the new date fields themselves occurs as the result of a
> before trigger.
>
> The update took 3 days, 10 hours to complete on the testing box.

That's quite a while for only 17 million rows.  Are these rows
particularly wide?
Is it possible to do it by groups with a vacuum in between each group?
 That would keep the bloat down.

You don't mention your vacuuming strategy.  That might affect performance here.

Also, are there any FKs to / from this table?

> To minimize the impact of checkpoints.   The SHARED_BUFFERS parameter has
> been bumped up to 140000 on a 20meg RAM box.

I assume you meant 20Gig box.

Under 7.4 larger shared_buffers may not be a good thing.  that's a
very large shared buffer setting for 7.4 to handle.

> There are about 9 indexes on this table although none of them reference the
> date fields so since there are no inserts I don't think they would have an
> impact on the update  (I've been wrong before though)

Doesn't matter.  PostgreSQL's implementation of MVCC means that each
update results in a new row, and therefore each index has to be
updated for each row updated.

> Would an update statement referencing the date fields work faster than a
> trigger?

Possibly.

>  Do you have any other suggestions to speed this up?
> We are at Postgres 7.4.5.

Upgrade to a modern version?  7.4 is getting old fast, and 7.4.5 has a
LOT of bugs that have been fixed in later versions. It's up to like
7.4.17 so you're missing a LOT of updates just in the branch you're
in.  But upgrading to 8.2.4 would definitely be a help.

> We simply cannot afford this table to be down for 3+ days during a
> production update.  The production box is a 32meg RAM box.

I would question the business process that requires an entire 17
million row table be updated.

Also, posting your schema and your triggers might help a bit as well.

В списке pgsql-admin по дате отправления:

Предыдущее
От: "Chris Hoover"
Дата:
Сообщение: Re: Easy way to change table schema?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Mass updates on a large table