Re: Data archiving/warehousing idea

Поиск
Список
Период
Сортировка
От Gavin Sherry
Тема Re: Data archiving/warehousing idea
Дата
Msg-id Pine.LNX.4.58.0702011307380.12097@linuxworld.com.au
обсуждение исходный текст
Ответ на Data archiving/warehousing idea  (Chris Dunlop <chris@onthe.net.au>)
Ответы Re: Data archiving/warehousing idea
A more general approach (Re: Data archiving/warehousing idea)
Список pgsql-hackers
On Thu, 1 Feb 2007, Chris Dunlop wrote:

> G'day hackers,

G'Day Chris,

> already - I couldn't find anything in the mail archives, but
> that doesn't mean it's not there...)

There has been a lot of discussion about this kind of thing over the
years.

> The main idea is that, there might be space utilisation and
> performance advantages if postgres had "hard" read-only tables,
> i.e. tables which were guaranteed (by postgres) to never have
> their data changed (insert/update/delete).
>
> This could potentially save disk space by allowing "book
> keeping" elements in the page/tuple headers to be removed, e.g.
> visibility information etc.  Also, some indexes could
> potentially be packed tighter if we know the data will never
> change (of course this is already available using the fillfactor
> control).

Well, there is also CPU overhead doing MVCC but there are a few
fundamental problems that must be overcome. The most significant is that
no useful table is always read only, otherwise you could never load it.
What do we do in the presence of a failure during the load or a user
issued ABORT? I guess we'd truncate the table... What about replay after a
crash?

Another way of looking at it is, we use the 'bookkeeping' information in
the tuple header for concurrency and for handling the abortion of the
transaction.

> The idea would be to introduce a statement something like:
>
>   ALTER TABLE foo SET ARCHIVE;

I'd not thought of that approach. There are two problems: some archive
tables are so large that loading them and then reprocessing them isn't
appealing. Secondly, we'd be rewriting the binary structure of the table
and this does not suit the non-overwriting nature of Postgres's storage
system.

A different approach discussed earlier involves greatly restricting the
way in which the table is used. This table could only be written to if an
exclusive lock is held; on error or ABORT, the table is truncated.

The problem is that a lot of this looks like a hack and I haven't seen a
very clean approach which has gone beyond basic brain dump.

Thanks,

Gavin


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Chris Dunlop
Дата:
Сообщение: Data archiving/warehousing idea
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: DROP FUNCTION failure: cache lookup failed for relation X