Обсуждение: ALTER TABLE speed (adding foreign keys)

Поиск
Список
Период
Сортировка

ALTER TABLE speed (adding foreign keys)

От
Shaw Terwilliger
Дата:
I have a table, contacts, with two fields, id (int), contactid (int).
I have another table, users, with a bunch of fields, but one important
one, id (SERIAL), which is the table key.

I needed to test my database setup (and software) with 1.5 million users,
each of whom will have (on average), 10 contacts.  So for each user in
the users table, I'll need ten records in the contacts table.

I wrote a little script to spew some sample data into a file for
COPY into the tables.  Since 1.5 million * 10 contacts is 15 million
rows, I created my tables without foreign keys (both id and contactid
will eventually be foreign keys into users, since I often select on both
of them) and constraints (CHECK id <> contactid).

The COPY goes much, much faster with these constraints absent from my tables.
So I go to ALTER TABLE contacts, to add these foreign keys.  One backend
fires up, allocates a few MB of RAM, and takes a long, long time to check
all the existing (15 million) rows.  At least, that's what I think it's doing.

postgres   394 98.1  1.6  7752 4300 pts/0    R<   Feb14 632:41 /usr/lib/postgresql/bin/postgres localhost sterwill im
ALTER 

The process has taken approximately 10 hours of CPU time so far.
I understand placing a foreign key constraint on an existing table can be
a very expensive operation, but I just wanted to make sure I won't be
"waiting forever" on this process.  I've got more tables to ALTER, and if
they all take this long, I'll just wipe the database, load the schema
again (_with_ the constraints), and just let the COPY take a day or two.

I'm using PostgreSQL 7.0.3 on an AMD Athlon 650 with 256 MB RAM, 500 MB
swap, Linux 2.2.18.

--
Shaw Terwilliger <sterwill@sourcegear.com>
SourceGear Corporation
217.356.0105 x 641

Вложения

Re: ALTER TABLE speed (adding foreign keys)

От
Stephan Szabo
Дата:
On Thu, 15 Feb 2001, Shaw Terwilliger wrote:

> I have a table, contacts, with two fields, id (int), contactid (int).
> I have another table, users, with a bunch of fields, but one important
> one, id (SERIAL), which is the table key.
>
> I needed to test my database setup (and software) with 1.5 million users,
> each of whom will have (on average), 10 contacts.  So for each user in
> the users table, I'll need ten records in the contacts table.
>
> I wrote a little script to spew some sample data into a file for
> COPY into the tables.  Since 1.5 million * 10 contacts is 15 million
> rows, I created my tables without foreign keys (both id and contactid
> will eventually be foreign keys into users, since I often select on both
> of them) and constraints (CHECK id <> contactid).
>
> The COPY goes much, much faster with these constraints absent from my tables.
> So I go to ALTER TABLE contacts, to add these foreign keys.  One backend
> fires up, allocates a few MB of RAM, and takes a long, long time to check
> all the existing (15 million) rows.  At least, that's what I think it's doing.
>
> postgres   394 98.1  1.6  7752 4300 pts/0    R<   Feb14 632:41 /usr/lib/postgresql/bin/postgres localhost sterwill im
ALTER 
>
> The process has taken approximately 10 hours of CPU time so far.
> I understand placing a foreign key constraint on an existing table can be
> a very expensive operation, but I just wanted to make sure I won't be
> "waiting forever" on this process.  I've got more tables to ALTER, and if
> they all take this long, I'll just wipe the database, load the schema
> again (_with_ the constraints), and just let the COPY take a day or two.

Currently the alter table implementation effectively means you're just
deferring all the trigger calls, so it's basically going to be doing the
15 million trigger calls.  It should finish, but I'm really uncertain of
how long it might take.  I've considered trying to do it as one query
which should be faster but requires a second copy of the logic to figure
out what the fk constraint means so I haven't done it since we're not
quite done with specifying the constraints fully yet.