Обсуждение: Do we need vacuuming when tables are regularly dropped?

Поиск
Список
Период
Сортировка

Do we need vacuuming when tables are regularly dropped?

От
"Peter Kovacs"
Дата:
Hi,

We have a number of automated performance tests (to test our own code)
involving PostgreSQL. Test cases are supposed to drop and recreate
tables each time they run.

The problem is that some of the tests show a linear performance
degradation overtime. (We have data for three months back in the
past.) We have established that some element(s) of our test
environment must be the culprit for the degradation. As rebooting the
test machine didn't revert speeds to baselines recorded three months
ago, we have turned our attention to the database as the only element
of the environment which is persistent across reboots. Recreating the
entire PGSQL cluster did cause speeds to revert to baselines.

I understand that vacuuming solves performance problems related to
"holes" in data files created as a result of tables being updated. Do
I understand correctly that if tables are dropped and recreated at the
beginning of each test case, holes in data files are reclaimed, so
there is no need for vacuuming from a performance perspective?

I will double check whether the problematic test cases do indeed
always drop their tables, but assuming they do, are there any factors
in the database (apart from table updates) that can cause a linear
slow-down with repetitive tasks?

Thanks
Peter

Re: Do we need vacuuming when tables are regularly dropped?

От
"Peter Kovacs"
Дата:
PS:
PGSQL version is: 8.2.7. (BTW, which catalog view contains the
back-end version number?)


On Mon, Sep 29, 2008 at 11:37 AM, Peter Kovacs
<maxottovonstirlitz@gmail.com> wrote:
> Hi,
>
> We have a number of automated performance tests (to test our own code)
> involving PostgreSQL. Test cases are supposed to drop and recreate
> tables each time they run.
>
> The problem is that some of the tests show a linear performance
> degradation overtime. (We have data for three months back in the
> past.) We have established that some element(s) of our test
> environment must be the culprit for the degradation. As rebooting the
> test machine didn't revert speeds to baselines recorded three months
> ago, we have turned our attention to the database as the only element
> of the environment which is persistent across reboots. Recreating the
> entire PGSQL cluster did cause speeds to revert to baselines.
>
> I understand that vacuuming solves performance problems related to
> "holes" in data files created as a result of tables being updated. Do
> I understand correctly that if tables are dropped and recreated at the
> beginning of each test case, holes in data files are reclaimed, so
> there is no need for vacuuming from a performance perspective?
>
> I will double check whether the problematic test cases do indeed
> always drop their tables, but assuming they do, are there any factors
> in the database (apart from table updates) that can cause a linear
> slow-down with repetitive tasks?
>
> Thanks
> Peter
>

Re: Do we need vacuuming when tables are regularly dropped?

От
Tom Lane
Дата:
"Peter Kovacs" <maxottovonstirlitz@gmail.com> writes:
> We have a number of automated performance tests (to test our own code)
> involving PostgreSQL. Test cases are supposed to drop and recreate
> tables each time they run.

> The problem is that some of the tests show a linear performance
> degradation overtime. (We have data for three months back in the
> past.) We have established that some element(s) of our test
> environment must be the culprit for the degradation. As rebooting the
> test machine didn't revert speeds to baselines recorded three months
> ago, we have turned our attention to the database as the only element
> of the environment which is persistent across reboots. Recreating the
> entire PGSQL cluster did cause speeds to revert to baselines.

What it sounds like to me is that you're not vacuuming the system
catalogs, which are getting bloated with dead rows about all those
dropped tables.

            regards, tom lane

Re: Do we need vacuuming when tables are regularly dropped?

От
"Peter Kovacs"
Дата:
On Mon, Sep 29, 2008 at 2:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Peter Kovacs" <maxottovonstirlitz@gmail.com> writes:
>> We have a number of automated performance tests (to test our own code)
>> involving PostgreSQL. Test cases are supposed to drop and recreate
>> tables each time they run.
>
>> The problem is that some of the tests show a linear performance
>> degradation overtime. (We have data for three months back in the
>> past.) We have established that some element(s) of our test
>> environment must be the culprit for the degradation. As rebooting the
>> test machine didn't revert speeds to baselines recorded three months
>> ago, we have turned our attention to the database as the only element
>> of the environment which is persistent across reboots. Recreating the
>> entire PGSQL cluster did cause speeds to revert to baselines.
>
> What it sounds like to me is that you're not vacuuming the system
> catalogs, which are getting bloated with dead rows about all those
> dropped tables.

Wow, great!

It is not immediately clear from the documentation, but the VACUUM
command also deals with the system catalogs as well, correct?

Thanks a lot!
Peter

>
>                        regards, tom lane
>

Re: Do we need vacuuming when tables are regularly dropped?

От
Tom Lane
Дата:
"Peter Kovacs" <maxottovonstirlitz@gmail.com> writes:
> It is not immediately clear from the documentation, but the VACUUM
> command also deals with the system catalogs as well, correct?

If it's run without any argument by a superuser, then yes.

(I think in recent versions we also allow a non-superuser database owner
to do this.)

            regards, tom lane

Re: Do we need vacuuming when tables are regularly dropped?

От
"Peter Kovacs"
Дата:
Thank you!
Peter

On Mon, Sep 29, 2008 at 2:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Peter Kovacs" <maxottovonstirlitz@gmail.com> writes:
>> It is not immediately clear from the documentation, but the VACUUM
>> command also deals with the system catalogs as well, correct?
>
> If it's run without any argument by a superuser, then yes.
>
> (I think in recent versions we also allow a non-superuser database owner
> to do this.)
>
>                        regards, tom lane
>

Re: Do we need vacuuming when tables are regularly dropped?

От
Steve Crawford
Дата:
>> What it sounds like to me is that you're not vacuuming the system
>> catalogs, which are getting bloated with dead rows about all those
>> dropped tables.
>>
>
> Wow, great!
>
> It is not immediately clear from the documentation, but the VACUUM
> command also deals with the system catalogs as well, correct?
>
>

To expand on Tom's answer, rows in system tables are created not only
for tables but for each column in the table, rules, indexes, etc. You
can end up with a lot more row creation than you suspect. And temporary
tables bloat the system tables just like regular tables. We discovered
that cron scripts using temporary tables can cause very rapid
system-table blotage.

Cheers,
Steve


Re: Do we need vacuuming when tables are regularly dropped?

От
"Scott Marlowe"
Дата:
On Mon, Sep 29, 2008 at 11:12 AM, Steve Crawford
<scrawford@pinpointresearch.com> wrote:
>
>>> What it sounds like to me is that you're not vacuuming the system
>>> catalogs, which are getting bloated with dead rows about all those
>>> dropped tables.
>>>
>>
>> Wow, great!
>>
>> It is not immediately clear from the documentation, but the VACUUM
>> command also deals with the system catalogs as well, correct?
>>
>>
>
> To expand on Tom's answer, rows in system tables are created not only for
> tables but for each column in the table, rules, indexes, etc. You  can end
> up with a lot more row creation than you suspect. And temporary tables bloat
> the system tables just like regular tables. We discovered that cron scripts
> using temporary tables can cause very rapid system-table blotage.

Also, there was a time when you couldn't do vacuum full on system
tables do to locking issues, and had to take the db down to single
user mode to do so.

Tom, is that still the case?

Re: Do we need vacuuming when tables are regularly dropped?

От
Tom Lane
Дата:
"Scott Marlowe" <scott.marlowe@gmail.com> writes:
> Also, there was a time when you couldn't do vacuum full on system
> tables do to locking issues, and had to take the db down to single
> user mode to do so.

There was a short period when *concurrent* vacuum fulls on just the
wrong combinations of system catalogs could deadlock (because they both
needed to look up stuff in the other one).  AFAIK we fixed that.  It's
never been the case that it didn't work at all.

            regards, tom lane

Re: Do we need vacuuming when tables are regularly dropped?

От
Steve Crawford
Дата:
Tom Lane wrote:
> "Scott Marlowe" <scott.marlowe@gmail.com> writes:
>
>> Also, there was a time when you couldn't do vacuum full on system
>> tables do to locking issues, and had to take the db down to single
>> user mode to do so.
>>
>
> There was a short period when *concurrent* vacuum fulls on just the
> wrong combinations of system catalogs could deadlock (because they both
> needed to look up stuff in the other one).  AFAIK we fixed that.  It's
> never been the case that it didn't work at all.
>
>             regards, tom lane
>
Never personally had trouble with vacuum full or reindex on system
tables. CLUSTER, however, is another story. While I've never run across
anything explicitly documenting that clustering system tables is
forbidden, I've also never used a version of PostgreSQL that allows it
(though I've never tried in single-user mode):

postgres@[local]=> CLUSTER pg_class USING pg_class_oid_index ;
ERROR:  "pg_class" is a system catalog

Should the docs
(http://www.postgresql.org/docs/8.3/interactive/sql-cluster.html) be
updated to note this restriction?

Cheers,
Steve


Re: Do we need vacuuming when tables are regularly dropped?

От
Tom Lane
Дата:
Steve Crawford <scrawford@pinpointresearch.com> writes:
> postgres@[local]=> CLUSTER pg_class USING pg_class_oid_index ;
> ERROR:  "pg_class" is a system catalog

I think the DB is probably protecting you from yourself here ;-).
If memory serves there are some system indexes whose relfilenode
numbers can't change, and pg_class_oid_index is one of them.  If
the CLUSTER had gone through you'd have hosed that database
irretrievably.

The protection check that is firing here is not so fine-grained as to
know the difference between pg_class and catalogs that this might be
safe for; but it does point up the moral that you need to know exactly
what you're doing if you are going to do DDL stuff on the system
catalogs.

            regards, tom lane

Re: Do we need vacuuming when tables are regularly dropped?

От
Steve Crawford
Дата:
Tom Lane wrote:
> Steve Crawford <scrawford@pinpointresearch.com> writes:
>
>> postgres@[local]=> CLUSTER pg_class USING pg_class_oid_index ;
>> ERROR:  "pg_class" is a system catalog
>>
>
> I think the DB is probably protecting you from yourself here ;-).

And elsewhere. :)

I wasn't advocating for a change of behavior, just the addition of
"Clustering is not permitted on system tables." to the documentation of
the CLUSTER command.

Cheers,
Steve