Обсуждение: Are PRIMARY KEYs just UNIQUE INDEXes?

Поиск
Список
Период
Сортировка

Are PRIMARY KEYs just UNIQUE INDEXes?

От
Jules Bean
Дата:
Hi all,

I have a few questions about how postgres uses primary keys.

I initially created a table with a SERIAL primary key. In fact (in a
piece of appaling deviation for relational theory :-( but the tables
are so huge) the table has a perfectly natural primary key (a
two-column one, as it happens).  I built a unique index on that
one. And I also built a non-unique index on a different field.

Then I had a huge amount of data to bulk import, so I followed
standard advice and dropped the indexes, including the 'implied' index
tablename_pkey.

Import went fine (and very fast!).

Then I rebuilt the indexes.  I didn't bother to rebuild the _pkey
index, since I'm not actually using it at the moment (its destiny is
to serve as the target for a foreign key, but that isn't operational).

Now after a few days, I'm noticing inserts going /much/ slower.
Whereas I was seeing insert speeds of about 40,000 per minute, I'm now
seeing over an hour to insert 10,000.  Unfortunately, I've been making
too many changes to be sure what caused this (and I'm going to run
vacuum analyze again in a bit), but I'm wondering if it's confused
about the primary key.  So, my question is:

Does postgres remember that something is a primary key? Or is PRIMARY
KEY really just a shorthand for creating a unique index?

Secondly, what is typically the slowest part of an insert? Presumably
it checks all the unique indices, but that should be quick, right?

Jules

Re: Are PRIMARY KEYs just UNIQUE INDEXes?

От
Tom Lane
Дата:
Jules Bean <jules@jellybean.co.uk> writes:
> Does postgres remember that something is a primary key? Or is PRIMARY
> KEY really just a shorthand for creating a unique index?

pgsql does remember that an index was created via a PRIMARY KEY
declaration, but AFAIR the only thing that flag does is indicate
the logical "primary key" in situations like interpreting a REFERENCES
constraint that doesn't mention the specific column to use.  It
certainly has no bearing on the speed of inserts.

> Secondly, what is typically the slowest part of an insert? Presumably
> it checks all the unique indices, but that should be quick, right?

If you don't have any constraints or other fancy features, I'd think
the insertion of the new index entries is probably the slowest thing.
Uniqueness checks are done as part of that; they don't add much overhead
*unless* you are constantly deleting and reinserting rows with the same
key value (note that UPDATE is equivalent to delete+reinsert, too).
In that case, there are still index entries for the deleted rows,
and the system has to actually visit all the deleted rows with the same
key as the to-be-inserted row to verify that they're all deleted.
VACUUM will clean this up by physically removing the deleted rows and
their index entries.

Dunno if that explains your problem or not...

            regards, tom lane