Обсуждение: Adding a foreign key constraint is extremely slow

Поиск
Список
Период
Сортировка

Adding a foreign key constraint is extremely slow

От
bsamwel@xs4all.nl
Дата:
Hi guys,

I'm having another performance problem as well. I have two tables called
"wwwlog" (about 100 bytes per row, 1.2 million records) and table called
"triples" (about 20 bytes per row, 0.9 million records). Triples contains
an integer foreign key to wwwlog, but it was not marked as a foreign key
at the point of table creation. Now, when I do:

alter table triples add foreign key(id1) references wwwlog(id);

PostgreSQL starts doing heavy work for at least one and a half hour, and I
broke it off at that. It is not possible to "explain" a statement like
this! Probably what it does is that it will check the foreign key
constraint for every field in the table. This will make it completely
impossible to load my data, because:

(1) I cannot set the foreign key constraints BEFORE loading the 0.9
million records, because that would cause the checks to take place during
loading.
(2) I cannot set the foreign key constraints AFTER loading the 0.9 million
records because I've got no clue at all how long this operation is going
to take.
(3) Table "triples" contains two more foreign keys to the same wwwlog key.
This means I've got to do the same thing two more times after the first
one is finished.

I find this behaviour very annoying, because it is possible to optimize a
check like this very well, for instance by creating a temporary data set
containing the union of all foreign keys and all primary keys of the
original table, augmented with an extra field "pri" which is 1 if the
record comes from the primary keys and 0 otherwise. Say this data is
contained in a temporary table called "t" with columns "key" and "pri" for
the data. One would then be able to do the check like this:

NOT EXISTS(
  SELECT key,sum(pri)
  FROM t
  GROUP BY key
  HAVING sum(pri) = 0
);

This means that there must not exist a group of "key" values that does not
have a primary key somewhere in the set. This query is extremely easy to
execute and would be done in a few seconds.

Does anyone know of a way of adding foreign key constraints faster in
PostgreSQL? Or, if there is no solution, do you guys know of any reasons
why a solution like the one I described above would or would not work, and
could or could not be built into PostgreSQL at some point?

Regards,
Bart


Re: Adding a foreign key constraint is extremely slow

От
Stephan Szabo
Дата:
On Sun, 23 Mar 2003 bsamwel@xs4all.nl wrote:

> Hi guys,
>
> I'm having another performance problem as well. I have two tables called
> "wwwlog" (about 100 bytes per row, 1.2 million records) and table called
> "triples" (about 20 bytes per row, 0.9 million records). Triples contains
> an integer foreign key to wwwlog, but it was not marked as a foreign key
> at the point of table creation. Now, when I do:
>
> alter table triples add foreign key(id1) references wwwlog(id);
>
> PostgreSQL starts doing heavy work for at least one and a half hour, and I
> broke it off at that. It is not possible to "explain" a statement like
> this! Probably what it does is that it will check the foreign key
> constraint for every field in the table. This will make it completely

In fact it does exactly this.  It could be done using
select * from fk where not exists (select * from pk where ...)
or another optimized method, but noone's gotten to changing it. I didn't
do it in the start becase I didn't want to duplicate the check logic if it
could be helped.

As a temporary workaround until something is done(assuming you know the
data is valid), set the constraints before loading then turn off triggers
on the tables (see pg_dump's data only output for an example), load the
data and turn them back on.


Re: Adding a foreign key constraint is extremely slow

От
Greg Stark
Дата:
bsamwel@xs4all.nl writes:

> alter table triples add foreign key(id1) references wwwlog(id);
>
> PostgreSQL starts doing heavy work for at least one and a half hour, and I
> broke it off at that. It is not possible to "explain" a statement like
> this! Probably what it does is that it will check the foreign key
> constraint for every field in the table. This will make it completely
> impossible to load my data, because:
>
> (2) I cannot set the foreign key constraints AFTER loading the 0.9 million
> records because I've got no clue at all how long this operation is going
> to take.

Try adding an index on wwwlog(id) so that it can check the constraint without
doing a full table scan for each value being checked.

--
greg

Re: Adding a foreign key constraint is extremely slow

От
Bart Samwel
Дата:
Greg Stark wrote:
> bsamwel@xs4all.nl writes:
>
>
>>alter table triples add foreign key(id1) references wwwlog(id);
>>
>>PostgreSQL starts doing heavy work for at least one and a half hour, and I
>>broke it off at that. It is not possible to "explain" a statement like
>>this! Probably what it does is that it will check the foreign key
>>constraint for every field in the table. This will make it completely
>>impossible to load my data, because:
>>
>>(2) I cannot set the foreign key constraints AFTER loading the 0.9 million
>>records because I've got no clue at all how long this operation is going
>>to take.
>
>
> Try adding an index on wwwlog(id) so that it can check the constraint without
> doing a full table scan for each value being checked.

AFAIK, because wwwlog(id) is the primary key, this index already exists
implicitly.  Still, 0.9 million separate index lookups are too slow for
my purposes, if for example it takes something as low as 1 ms per lookup
it will still take 900 seconds (= 15 minutes) to complete. As the
complete adding of the foreign key constraint took about an hour, that
would suggest an average of 4 ms per lookup, which suggests that the
index is, in fact, present. :)

Anyway, I've actually waited for the operation to complete. The problem
is out of my way for now.

Bart


--

Leiden Institute of Advanced Computer Science (http://www.liacs.nl)
E-mail: bsamwel@liacs.nl    Telephone: +31-71-5277037
Homepage: http://www.liacs.nl/~bsamwel
Opinions stated in this e-mail are mine and not necessarily my employer's.