Re: [PATCHES] Cleaning up unreferenced table files

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: [PATCHES] Cleaning up unreferenced table files
Дата
Msg-id Pine.OSF.4.61.0506272030530.449706@kosh.hut.fi
обсуждение исходный текст
Ответ на Re: [PATCHES] Cleaning up unreferenced table files  (Bruce Momjian <pgman@candle.pha.pa.us>)
Список pgsql-hackers
No, not for now. Maybe for 8.2. And maybe as a contrib tool at first after 
all.

- Heikki

On Mon, 27 Jun 2005, Bruce Momjian wrote:

>
> Heikki, do you have any interest in completing your file checking patch
> for inclusion in 8.1 by adding tablespace information and other fixes as
> requested by Tom below?  The current patch version is at:
>
>        ftp://candle.pha.pa.us/pub/postgresql/mypatches
>
> called checkfiles*.
>
> Anyone else want to complete it?
>
> ---------------------------------------------------------------------------
>
> Tom Lane wrote:
>> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>>> Applied.
>>
>> Now that I've had a chance to look at it, this patch is thoroughly
>> broken.  Problems observed in a quick review:
>>
>> 1. It doesn't work at all for non-default tablespaces: it will
>> claim that every file in such a tablespace is stale.  The fact
>> that it does that rather than failing entirely is accidental.
>> It tries to read the database's pg_class in the target tablespace
>> whether it's there or not.  Because the system is still in recovery
>> mode, the low-level routines allow the access to the nonexistent
>> pg_class table to pass --- in fact they think they should create
>> the file, so after it runs there's a bogus empty "1259" file in each
>> such tablespace (which of course it complains about, too).  The code
>> then proceeds to think that pg_class is empty so of course everything
>> draws a warning.
>>
>> 2. It's not robust against stale subdirectories of a tablespace
>> (ie, subdirs corresponding to a nonexistent database) --- again,
>> it'll try to read a nonexistent pg_class.  Then it'll produce a
>> bunch of off-target complaint messages.
>>
>> 3. It's assuming that relfilenode is unique database-wide, when no
>> such assumption is safe.  We only have a guarantee that it's unique
>> tablespace-wide.
>>
>> 4. It fails to examine table segment files (such as "nnn.1").  These
>> should be complained of when the "nnn" doesn't match any hash entry.
>>
>> 5. It will load every relfilenode value in pg_class into the hashtable
>> whether it's meaningful or not.  There should be a check on relkind.
>>
>> 6. I don't think relying on strtol to decide if a filename is entirely
>> numeric is very safe.  Note all the extra defenses in pg_atoi against
>> various platform-specific misbehaviors of strtol.  Personally I'd use a
>> strspn test instead.
>>
>> 7. There are no checks for readdir failure (compare any other readdir
>> loop in the backend).
>>
>> See also Simon Riggs' complaints that the circumstances under which it's
>> done are pretty randomly selected.  (One particular thing that I think
>> is a bad idea is to do this in a standalone backend.  Any sort of
>> corruption in any db's pg_class would render it impossible to start up.)
>>
>> To fix the first three problems, and also avoid the performance problem
>> of multiply rescanning a database's pg_class for each of its
>> tablespaces, I would suggest that the hashtable entries be widened to
>> RelFileNode structs (ie, db oid, tablespace oid, relfilenode oid).  Then
>> there should be one iteration over pg_database to learn the OIDs and
>> default tablespaces of each database; with that you can read pg_class
>> from its correct location for each database and load all the entries
>> into the hashtable.  Then you iterate through the tablespaces looking
>> for stuff not present in the hashtable.  You might also want to build a
>> list or hashtable of known database OIDs, so that you can recognize a
>> stale subdirectory immediately and issue a direct complaint about it
>> without even recursing into it.
>>
>>             regards, tom lane
>>
>
> -- 
>  Bruce Momjian                        |  http://candle.pha.pa.us
>  pgman@candle.pha.pa.us               |  (610) 359-1001
>  +  If your life is a hard drive,     |  13 Roberts Road
>  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>               http://www.postgresql.org/docs/faq
>

- Heikki


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Josh Berkus
Дата:
Сообщение: Re: Sigh, another contrib/cube and contrib/seg problem
Следующее
От: Oleg Bartunov
Дата:
Сообщение: Re: tsearch2 vs core?