Re: Big 7.1 open items

From: Tom Lane
Subject: Re: Big 7.1 open items
Msg-id: 7458.961170401@sss.pgh.pa.us
In reply to: Re: Big 7.1 open items  (JanWieck@t-online.de (Jan Wieck))
Replies: Re: Big 7.1 open items  (Don Baccus <dhogaza@pacifier.com>)
         Re: Big 7.1 open items  (JanWieck@t-online.de (Jan Wieck))
List: pgsql-hackers

JanWieck@t-online.de (Jan Wieck) writes:
> Tom Lane wrote:
>> It gets a little trickier if you want to be able to split
>> multi-gig tables across several tablespaces, though, since
>> you couldn't just append ".N" to the base table path in that
>> scenario.
>> 
>> I'd be interested to know what sort of facilities Oracle
>> provides for managing huge tables...

>     Oracle  tablespaces  are  a  collection of 1...n preallocated
>     files.   Each  table  then  is  bound  to  a  tablespace  and
>     allocates extents (chunks) from those files.

OK, to get back to the point here: so in Oracle, tables can't cross
tablespace boundaries, but a tablespace itself could span multiple
disks?

Not sure if I like that better or worse than equating a tablespace
with a directory (so, presumably, all the files within it live on
one filesystem) and then trying to make tables able to span
tablespaces.  We will need to do one or the other though, if we want
to have any significant improvement over the current state of affairs
for large tables.

One way is to play the flip-the-path-ordering game some more,
and access multiple-segment tables with pathnames like this:

    .../TABLESPACE/RELATION        -- first or only segment
    .../TABLESPACE/N/RELATION      -- N'th extension segment

This isn't any harder for md.c to deal with than what we do now,
but by making the /N subdirectories be symlinks, the dbadmin could
easily arrange for extension segments to go on different filesystems.
Also, since /N subdirectory symlinks can be added as needed,
expanding available space by attaching more disks isn't hard.
(If the admin hasn't pre-made a /N symlink when it's needed,
I'd envision the backend just automatically creating a plain
subdirectory so that it can extend the table.)
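
To make the layout concrete, here is a minimal sketch of the path rule and the "create a plain subdirectory if no symlink exists" fallback. It is written in Python for brevity, not in the backend's C, and it is not md.c code: the function names `segment_path` and `ensure_segment_dir` are hypothetical, and using the bare relation name as the file name is an assumption (the thread leaves OID-versus-name open).

```python
import os


def segment_path(tablespace_dir, relation, segno):
    """Path of segment `segno` of `relation` under the proposed layout.

    Segment 0 lives directly in the tablespace directory; extension
    segment N lives in the N subdirectory (hypothetical helper).
    """
    if segno < 0:
        raise ValueError("segment number must be non-negative")
    if segno == 0:
        return os.path.join(tablespace_dir, relation)
    return os.path.join(tablespace_dir, str(segno), relation)


def ensure_segment_dir(tablespace_dir, segno):
    """Return the directory for segment `segno`, creating it if needed.

    os.path.isdir follows symlinks, so a symlink the admin has already
    pointed at another filesystem is used as-is. Only when nothing
    usable is there do we fall back to a plain subdirectory. A dangling
    symlink makes os.mkdir raise FileExistsError, which is left to the
    caller to report.
    """
    if segno == 0:
        return tablespace_dir
    seg_dir = os.path.join(tablespace_dir, str(segno))
    if not os.path.isdir(seg_dir):
        os.mkdir(seg_dir)
    return seg_dir
```

Calling `ensure_segment_dir` before opening `segment_path(...)` for writing gives the behavior described above: pre-made symlinks win, and otherwise the table can still be extended in place.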

A limitation is that the N'th extension segments of all the relations
in a given tablespace have to be in the same place, but I don't see
that as a major objection.  Worst case is you make a separate tablespace
for each of your multi-gig relations ... you're probably not going to
have a very large number of such relations, so this doesn't seem like
unmanageable admin complexity.

We'd still want to create some tools to help the dbadmin with slinging
all these symlinks around, of course.  But I think it's critical to keep
the low-level file access protocol simple and reliable, which really
means minimizing the amount of information the backend needs to know to
figure out which file to write a page in.  With something like the above
you only need to know the tablespace name (or more likely OID), the
relation OID (+name or not, depending on outcome of other argument),
and the offset in the table.  No worse than now from the software's
point of view.
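
As an illustration of how little information that is, the sketch below maps (tablespace, relation, offset) to a file and a position within it. It is Python for brevity and only a sketch: the name `locate` is hypothetical, the 1 GB segment size is an assumed value chosen for the example, and the relation is identified by a plain string rather than whatever OID/name scheme is eventually chosen.

```python
import os

# Assumed segment size for this example only; the real value is a
# build-time choice of the backend.
SEGMENT_SIZE = 1 << 30


def locate(tablespace_dir, relation, offset, segment_size=SEGMENT_SIZE):
    """Map a byte offset within a table to (file path, offset in that file).

    Hypothetical helper: the segment number is offset // segment_size,
    segment 0 is TABLESPACE/RELATION and segment N is TABLESPACE/N/RELATION.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    segno, seg_offset = divmod(offset, segment_size)
    if segno == 0:
        path = os.path.join(tablespace_dir, relation)
    else:
        path = os.path.join(tablespace_dir, str(segno), relation)
    return path, seg_offset
```

Nothing here needs a catalog lookup beyond resolving the tablespace to its directory, which is the property the paragraph above argues for.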

Comments?
        regards, tom lane

