Обсуждение: autovacuum and TOAST tables

Поиск
Список
Период
Сортировка

autovacuum and TOAST tables

От
Alvaro Herrera
Дата:
Hi,

Here's a patch to make autovacuum process TOAST tables separately from
main tables.

The most important change is that when called from autovac, vacuum does
not process the TOAST table at all.  It will only do so when the stats
for the TOAST table say that it needs vacuuming.  (A user-invoked vacuum
still processes TOAST tables normally.)

Per previous discussion, the autovac code is now doing two passes over
pg_class.

There's two things I'm not happy about in this patch:

1. it uses a List to keep the mapping of heap<->toast Oids.  This is
needed to be able to fetch the main rel's pg_autovacuum entry to process
the toast table.  This incurs in O(n^2) behavior.

2. the "expected relkind" business is gone; it's not easy to pass the
correct relkind down from autovac, and at the same time have a
reasonable thing to pass down from user-invoked vacuum.  Right now what
the patch does is check that the rel to vacuum is either
RELKIND_RELATION or _TOASTVALUE.

(I admit that my unhappiness about the second is mild, though.)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Вложения

Re: autovacuum and TOAST tables

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> There's two things I'm not happy about in this patch:

> 1. it uses a List to keep the mapping of heap<->toast Oids.  This is
> needed to be able to fetch the main rel's pg_autovacuum entry to process
> the toast table.  This incurs in O(n^2) behavior.

Use a dynahash table instead?
        regards, tom lane


Re: autovacuum and TOAST tables

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > There's two things I'm not happy about in this patch:
>
> > 1. it uses a List to keep the mapping of heap<->toast Oids.  This is
> > needed to be able to fetch the main rel's pg_autovacuum entry to process
> > the toast table.  This incurs in O(n^2) behavior.
>
> Use a dynahash table instead?

Right, the attached patch does that.

Note that this patch allows a toast table to be vacuumed by the user:

alvherre=# vacuum pg_toast.pg_toast_39970;
VACUUM

I don't have a problem with that, but if anyone thinks this is not a
good idea, please speak up.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Вложения

Re: autovacuum and TOAST tables

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Note that this patch allows a toast table to be vacuumed by the user:
> I don't have a problem with that, but if anyone thinks this is not a
> good idea, please speak up.

The permissions on pg_toast will prevent anyone but a superuser from
doing that anyway, so it's no big deal.

Possibly more interesting is what happens if someone drops the parent
table while VACUUM is working independently on the toast table.  Does
DROP take exclusive lock on a toast table?  Probably, but it needs
to be checked.  I think preventing that scenario was one reason why
the vacuuming was tied together way back when.

(The same goes for any other parent-table DDL action that would affect
the toast table; CLUSTER or TRUNCATE for instance.)
        regards, tom lane


Re: autovacuum and TOAST tables

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Note that this patch allows a toast table to be vacuumed by the user:
> > I don't have a problem with that, but if anyone thinks this is not a
> > good idea, please speak up.
> 
> The permissions on pg_toast will prevent anyone but a superuser from
> doing that anyway, so it's no big deal.
> 
> Possibly more interesting is what happens if someone drops the parent
> table while VACUUM is working independently on the toast table.  Does
> DROP take exclusive lock on a toast table?  Probably, but it needs
> to be checked.

Yes, it does.  So the autovacuum process working on the TOAST table
would get cancelled by the DROP TABLE, TRUNCATE, CLUSTER.  The one ALTER
TABLE variant that I think needs to handle the TOAST table is ALTER
TYPE, but I think it should work that it is being vacuumed concurrently.
REINDEX TABLE should perhaps also be concerned because it does reindex
the toast table, but it grabs the lock before actually doing the
reindexing so I don't think there's a problem here.


BTW only now I notice that CLUSTER leaves the toast table name in bad
shape: if you create a table with OID X its TOAST table is named
pg_toast_X.  If you then cluster this table, a new transient table gets
created with OID Y; the TOAST table for Y is named pg_toast_Y, and then
this new TOAST table is used as the new TOAST table for the original
table X.  So you end up with table OID X having TOAST table pg_toast_Y.

This is not a concern from the system standpoint because it doesn't use
this name for anything, but people looking at the catalogs manually may
be taken by surprise.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: autovacuum and TOAST tables

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> BTW only now I notice that CLUSTER leaves the toast table name in bad
> shape: if you create a table with OID X its TOAST table is named
> pg_toast_X.  If you then cluster this table, a new transient table gets
> created with OID Y; the TOAST table for Y is named pg_toast_Y, and then
> this new TOAST table is used as the new TOAST table for the original
> table X.  So you end up with table OID X having TOAST table pg_toast_Y.

Hmm, we could probably fix that if we made the cluster operation swap
the physical storage of the two toast tables, rather than swapping the
tables altogether.  I agree it's not critical but it could be confusing.
        regards, tom lane


Re: autovacuum and TOAST tables

От
Tom Lane
Дата:
I wrote:
> Hmm, we could probably fix that if we made the cluster operation swap
> the physical storage of the two toast tables, rather than swapping the
> tables altogether.  I agree it's not critical but it could be confusing.

On second thought, I think it *could* lead to a visible failure.
Suppose the OID counter wraps around and the OID that had been used for
the temporary CLUSTER table gets assigned to a new table.  If that table
needs a toast table, it'll try to create one using the name that is
already in use.  We have defenses against picking an OID that's in use,
but none for toast table names.  So I think it's indeed worth fixing.
        regards, tom lane


Re: autovacuum and TOAST tables

От
Alvaro Herrera
Дата:
Tom Lane wrote:

> On second thought, I think it *could* lead to a visible failure.
> Suppose the OID counter wraps around and the OID that had been used for
> the temporary CLUSTER table gets assigned to a new table.  If that table
> needs a toast table, it'll try to create one using the name that is
> already in use.  We have defenses against picking an OID that's in use,
> but none for toast table names.  So I think it's indeed worth fixing.

Okay, I'll see to it after committing this autovacuum stuff.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: autovacuum and TOAST tables

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> I wrote:
> > Hmm, we could probably fix that if we made the cluster operation swap
> > the physical storage of the two toast tables, rather than swapping the
> > tables altogether.  I agree it's not critical but it could be confusing.
> 
> On second thought, I think it *could* lead to a visible failure.
> Suppose the OID counter wraps around and the OID that had been used for
> the temporary CLUSTER table gets assigned to a new table.  If that table
> needs a toast table, it'll try to create one using the name that is
> already in use.  We have defenses against picking an OID that's in use,
> but none for toast table names.  So I think it's indeed worth fixing.

My first attempt at a fix, which was simply swapping relfilenode for the
TOAST tables (and its indexes) after the data has been copied, does not
work, apparently because the TOAST pointers have the toast table ID
embedded.  Since we're intending to keep the pg_class entry for the old
TOAST table, the OID in the toast links is no longer valid.

I'm not sure what can be done about this ...  Obviously we cannot just
swap the toast table before starting to move the data, because then we
cannot read the original data.

I wonder if we can get away with simply renaming the new toast table and
index after the data has been copied.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: autovacuum and TOAST tables

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> My first attempt at a fix, which was simply swapping relfilenode for the
> TOAST tables (and its indexes) after the data has been copied, does not
> work, apparently because the TOAST pointers have the toast table ID
> embedded.

Ouch.  Right.

> I wonder if we can get away with simply renaming the new toast table and
> index after the data has been copied.

Yeah, that seems like the best answer.
        regards, tom lane


Re: autovacuum and TOAST tables

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:

> > I wonder if we can get away with simply renaming the new toast table and
> > index after the data has been copied.
>
> Yeah, that seems like the best answer.

Seems like this patch fixes it.

How far back should be backpatched?  Given the lack of field complaints
I'd say "just to HEAD" but maybe others have different opinions.  FWIW
the patch applies cleanly (modulo header changes) back to 8.0.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Вложения

Re: autovacuum and TOAST tables

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Yeah, that seems like the best answer.

> Seems like this patch fixes it.

Um, not for tables that don't have toast tables ...

> How far back should be backpatched?

Not at all; it's not a bug fix.
        regards, tom lane


Re: autovacuum and TOAST tables

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> >> Yeah, that seems like the best answer.
>
> > Seems like this patch fixes it.
>
> Um, not for tables that don't have toast tables ...

Right, this seems better.

Note that it needs to open the toast table and grab AccessShare to get
the toast index OID.  I don't think it needs a stronger lock.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Вложения