Обсуждение: Allow replacement of bloated primary key indexes without foreign key rebuilds

Поиск
Список
Период
Сортировка

Allow replacement of bloated primary key indexes without foreign key rebuilds

От
Gurjeet Singh
Дата:
Today I learnt [1,2,3] that the feature ALTER TABLE .. ADD CONSTRAINT ... USING INDEX we added back in the day is not so useful in the field. Having to drop foreign key constraints before this command, and recreate them afterwards makes this command useless to most database setups. I feel sorry that no one brought this up when we were implementing the feature; maybe we could've done something about it right then.

I wish to correct it now, and did some research. Long story short, I realized that the foreign key constraint depends on the index relation of the primary key, and does not depend on the constraint object as I had expected (Please see rows 5 and 11 of the result set shown below). This behaviour is also seen when the FKey references a unique constraint. As much as that perplexes me, I think it makes our job a bit easier.

All we need to do is allow swapping of pg_class.relfilenode of two indexes. This will let the dependency entries stand as they are and allow us to drop the bloated primary key index structure without having to rebuild the foreign key constraints.

As for the syntactical sugar, this can be added to either ALTER TABLE or to ALTER INDEX. Although under no normal circumstances one would need to use ALTER INDEX to swap two indexes' relfilenode (because one can easily create a duplicate index and drop/rename-in-place the old one), I think it would make more sense here since it is just an operation on two indexes and has nothing to do with the constraints, apart from the fact that we want to use this feature to meddle with the constraints.

Syntax options:

ALTER TABLE tbl REPLACE [CONSTRAINT constr] {PRIMARY KEY | UNIQUE} USING INDEX new_index;

ALTER INDEX ind REPLACE WITH new_index;

Note that in both the syntaxes, it is assumed that all remnants of new_index will be gone after the command completes successfully; that is, the commands will behave as if they deleted the index structure of the index being replaced and placed the new structure in its place, while dropping the index that was used for the replacement.

I don't think we need to ensure that the new_index is completely flushed to disk before the operation, but we do need to issue relevant cache invalidation messages after the operation is done.

For replacement to be successful, new_index should not be associated with any constraints, and, new_index should be identical to the index being replaced, except for the index names.

The ALTER TABLE syntax closely emulates the existing syntax of replacing a constraint using an existing index, but looking at the grammar construction I feel that it may be more complex to implement than the ALTER INDEX syntax.

ALTER INDEX feels easier to do, since we won't have to jump through hoops like in ALTER TABLE's multi-command support (ATExec*() functions), and dropping the new_index might be easier to do.

Thoughts?

postgres=# \d+ test
                         Table "public.test"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 a      | integer | not null  | plain   |              |
 b      | integer |           | plain   |              |
Indexes:
    "test_pkey" PRIMARY KEY, btree (a)
    "temp_idx" UNIQUE, btree (a)
Referenced by:
    TABLE "test2" CONSTRAINT "test2_b_fkey" FOREIGN KEY (b) REFERENCES test(a)
Has OIDs: no

postgres=# \d+ test2
                        Table "public.test2"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 a      | integer |           | plain   |              |
 b      | integer |           | plain   |              |
Foreign-key constraints:
    "test2_b_fkey" FOREIGN KEY (b) REFERENCES test(a)
Has OIDs: no

Relevant output of query [4] on pg_depend:

    classid    | objid |            objid             |  refclassid   | refobjid |   refobjid   | deptype
---------------+-------+------------------------------+---------------+----------+--------------+---------
 pg_class      | 16413 | test                         | pg_namespace  |     2200 | public       | n

 pg_type       | 16415 | test                         | pg_class      |    16413 | test         | i
 pg_type       | 16414 | test[]                       | pg_type       |    16415 | test         | i
 pg_constraint | 16417 | test_pkey                    | pg_class      |    16413 | test         | a
 pg_class      | 16416 | test_pkey                    | pg_constraint |    16417 | test_pkey    | i
 pg_class      | 16418 | test2                        | pg_namespace  |     2200 | public       | n
 pg_type       | 16420 | test2                        | pg_class      |    16418 | test2        | i
 pg_type       | 16419 | test2[]                      | pg_type       |    16420 | test2        | i
 pg_constraint | 16421 | test2_b_fkey                 | pg_class      |    16413 | test         | n
 pg_constraint | 16421 | test2_b_fkey                 | pg_class      |    16418 | test2        | a
 pg_constraint | 16421 | test2_b_fkey                 | pg_class      |    16416 | test_pkey    | n
 pg_trigger    | 16422 | RI_ConstraintTrigger_a_16422 | pg_constraint |    16421 | test2_b_fkey | i
 pg_trigger    | 16423 | RI_ConstraintTrigger_a_16423 | pg_constraint |    16421 | test2_b_fkey | i
 pg_trigger    | 16424 | RI_ConstraintTrigger_c_16424 | pg_constraint |    16421 | test2_b_fkey | i
 pg_trigger    | 16425 | RI_ConstraintTrigger_c_16425 | pg_constraint |    16421 | test2_b_fkey | i

[1] http://archives.postgresql.org/pgsql-general/2012-07/msg00104.php
[2] http://archives.postgresql.org/pgsql-general/2012-07/msg00105.php
[3] http://archives.postgresql.org/pgsql-general/2012-07/msg00110.php
[4] select classid::regclass, objid, case classid::regclass::text when 'pg_class' then objid::regclass::text when 'pg_type' then objid::regtype::text when 'pg_constraint' then (select conname from pg_constraint where oid = objid) when 'pg_namespace' then (select nspname from pg_namespace where oid = objid) when 'pg_trigger' then (select tgname from pg_trigger where oid = objid) else objid::text end, refclassid::regclass, refobjid, case refclassid::regclass::text when 'pg_class' then refobjid::regclass::text when 'pg_type' then refobjid::regtype::text when 'pg_constraint' then (select conname from pg_constraint where oid = refobjid) when 'pg_namespace' then (select nspname from pg_namespace where oid = refobjid) when 'pg_trigger' then (select tgname from pg_trigger where oid = refobjid) else refobjid::text end, deptype from pg_depend;

--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

От
Greg Stark
Дата:
On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
> All we need to do is allow swapping of pg_class.relfilenode of two indexes.
> This will let the dependency entries stand as they are and allow us to drop
> the bloated primary key index structure without having to rebuild the
> foreign key constraints.

Fwiw I don't like swapping relfilenodes on indexes the user created.
REINDEX currently does this but it's a bit of a hack and only works
because reindex carefully builds the new index with exactly the same
definition as the old one.

The problem you describe is one of constraints and dependencies and
not one of indexes. It seems what you really want is a way to alter
foreign key dependencies to depend on a new index. Either an explicit
command that lets you set the new dependency or what seems even better
would be to have DROP INDEX check any dependent objects to see if
there's another index that can satisfy them and change their
dependency.

These might suffer from deadlock problems but hopefully they could be
manageable since it's not a frequent operation and there aren't any
other operations that rejigger dependencies.

-- 
greg


Re: Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

От
Tom Lane
Дата:
Greg Stark <stark@mit.edu> writes:
> On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
>> All we need to do is allow swapping of pg_class.relfilenode of two indexes.

> Fwiw I don't like swapping relfilenodes on indexes the user created.
> REINDEX currently does this but it's a bit of a hack and only works
> because reindex carefully builds the new index with exactly the same
> definition as the old one.

Yes.  The swap-relfilenodes operation would have to carefully check that
the index definitions were exactly equivalent, and there would be a
constant risk for bugs of omission if that code weren't taught about
any new index properties we invent.

> The problem you describe is one of constraints and dependencies and
> not one of indexes. It seems what you really want is a way to alter
> foreign key dependencies to depend on a new index. Either an explicit
> command that lets you set the new dependency or what seems even better
> would be to have DROP INDEX check any dependent objects to see if
> there's another index that can satisfy them and change their
> dependency.

Either of these have exactly the same issue, namely their correctness
depends on determining if two indexes have identical properties.

All of these things seem like ugly, hard-to-use kluges anyway (the
make-sure-the-indexes-match business is just as much of a PITA for the
DBA as it is for the system).  What we really want is REINDEX
CONCURRENTLY.
        regards, tom lane


Re: Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

От
Greg Stark
Дата:
On Tue, Jul 10, 2012 at 3:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The problem you describe is one of constraints and dependencies and
>> not one of indexes. It seems what you really want is a way to alter
>> foreign key dependencies to depend on a new index. Either an explicit
>> command that lets you set the new dependency or what seems even better
>> would be to have DROP INDEX check any dependent objects to see if
>> there's another index that can satisfy them and change their
>> dependency.
>
> Either of these have exactly the same issue, namely their correctness
> depends on determining if two indexes have identical properties.

This doesn't sound right to me. In these cases all it would have to
know about is the same set of properties that CREATE CONSTRAINT looks
for to find a satisfactory index to depend on.

-- 
greg


Re: Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

От
Alvaro Herrera
Дата:
Excerpts from Tom Lane's message of mar jul 10 10:44:03 -0400 2012:

> All of these things seem like ugly, hard-to-use kluges anyway (the
> make-sure-the-indexes-match business is just as much of a PITA for the
> DBA as it is for the system).  What we really want is REINDEX
> CONCURRENTLY.

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.9961 ?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Tom Lane's message of mar jul 10 10:44:03 -0400 2012:
>> What we really want is REINDEX CONCURRENTLY.

> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.9961 ?

Hm ... that paper looks like something we might want to incorporate into
btree's VACUUM processing, but it's not very on-point if someone really
wants to rebuild the index totally.
        regards, tom lane


Re: Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

От
Gurjeet Singh
Дата:
On Tue, Jul 10, 2012 at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Greg Stark <stark@mit.edu> writes:
> On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
>> All we need to do is allow swapping of pg_class.relfilenode of two indexes.

> Fwiw I don't like swapping relfilenodes on indexes the user created.
> REINDEX currently does this but it's a bit of a hack and only works
> because reindex carefully builds the new index with exactly the same
> definition as the old one.

Yes.  The swap-relfilenodes operation would have to carefully check that
the index definitions were exactly equivalent, and there would be a
constant risk for bugs of omission if that code weren't taught about
any new index properties we invent.

IMHO there must be many other places in this code-base where we run that risk.

The way I am planning to do it was to compare all relevant fields of the FormData_pg_index. And I am assuming anybody changing the struct members will take care of relevant changes needed for this code too.

We can add a runtime/compile-time assert to make sure that Natts_pg_index==17. That way, if a new column gets added, we will get alerted promptly.
 
All of these things seem like ugly, hard-to-use kluges anyway (the
make-sure-the-indexes-match business is just as much of a PITA for the
DBA as it is for the system).  What we really want is REINDEX
CONCURRENTLY.

+1, but I can't take on that task.

--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

От
Gurjeet Singh
Дата:
On Tue, Jul 10, 2012 at 11:11 AM, Greg Stark <stark@mit.edu> wrote:
On Tue, Jul 10, 2012 at 3:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The problem you describe is one of constraints and dependencies and
>> not one of indexes. It seems what you really want is a way to alter
>> foreign key dependencies to depend on a new index. Either an explicit
>> command that lets you set the new dependency or what seems even better
>> would be to have DROP INDEX check any dependent objects to see if
>> there's another index that can satisfy them and change their
>> dependency.
>
> Either of these have exactly the same issue, namely their correctness
> depends on determining if two indexes have identical properties.

This doesn't sound right to me. In these cases all it would have to
know about is the same set of properties that CREATE CONSTRAINT looks
for to find a satisfactory index to depend on.

I like the DROP index idea, but the silent side-effect may not make people happy. Can you give me a pointer to relevant code.

--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company