pg_upgrade: How to deal with toast

From: Zdenek Kotala
Subject: pg_upgrade: How to deal with toast
Message-ID: 492488F1.4000009@sun.com
Responses: Re: pg_upgrade: How to deal with toast (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List: pgsql-hackers

We are now discussing how to upgrade toast tables. I am trying to collect ideas
and figure out how it should work and where the problems are.

Overview:
---------

A few weeks ago we decided to use convert-on-read. We have also decided how to
handle data that overflows a page after conversion. Now we need to decide how
to deal with toast tables.

Toasted data are split into chunks, and these chunks are stored in the toast
table as records with the following structure:

(valueid oid, residx int32, chunk varlena)

Chunk size is defined by TOAST_MAX_CHUNK_SIZE.
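
As a rough illustration of this layout, here is a minimal C sketch of a chunk
record and of the chunk arithmetic. The names SketchToastChunk,
sketch_chunk_count, sketch_chunk_len and the value 2000 are assumptions made
for the example; the real TOAST_MAX_CHUNK_SIZE is derived from the page size
and the real record is an ordinary heap tuple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative value only; not the real TOAST_MAX_CHUNK_SIZE. */
#define SKETCH_MAX_CHUNK_SIZE 2000

/* Simplified model of one toast-table record: (valueid, residx, chunk). */
typedef struct SketchToastChunk
{
    uint32_t             valueid; /* identifies the toasted value (an oid) */
    int32_t              residx;  /* sequence number of the chunk in the value */
    size_t               len;     /* number of payload bytes in this chunk */
    const unsigned char *data;    /* chunk payload */
} SketchToastChunk;

/* Number of chunks needed to store total_len bytes. */
static size_t
sketch_chunk_count(size_t total_len, size_t chunk_size)
{
    assert(chunk_size > 0);
    return (total_len + chunk_size - 1) / chunk_size;
}

/* Payload length of chunk number residx; only the last chunk may be short. */
static size_t
sketch_chunk_len(size_t total_len, size_t chunk_size, size_t residx)
{
    size_t nchunks = sketch_chunk_count(total_len, chunk_size);

    assert(residx < nchunks);
    if (residx + 1 < nchunks)
        return chunk_size;
    return total_len - residx * chunk_size;
}
```

With these assumed numbers, a 5000-byte value occupies three chunks of 2000,
2000 and 1000 bytes.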

A toast table can hold values of different data types, and a single page can
contain chunks of different types.

toast_fetch_datum and toast_fetch_datum_slice are the low-level functions that
do the main work.

How to upgrade it:
------------------

Toasted values are processed on demand, when some part of PostgreSQL needs the
detoasted value (see for example pg_detoast_datum()). The toast_fetch_datum
function starts an index scan with valueid as the search key. This scan
triggers the upgrade of toast index and toast table pages through the hook in
ReadBuffer, so toast_fetch_datum gets already-converted tuples, but the chunked
data inside them stay untouched.

A toasted datum can be converted only once it has been completely reassembled.
This of course triggers a lot of page conversions.

The idea is to read the toasted datum, convert it, store it back, and mark the
old chunks as deleted. The same method will be used for slice access, because
we cannot access a selected slice until the toasted datum has been converted.

The implementation will add a hook to the toast_fetch_datum and
toast_fetch_datum_slice functions to handle the conversion (similar to the hook
in ReadBuffer).
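
The flow could look roughly like the following C sketch. It is only a model
under stated assumptions: SketchStoredValue is a hypothetical in-memory
stand-in for a stored toasted value, sketch_convert_hook is a hypothetical hook
variable, and sketch_toy_convert is a toy conversion. None of these exist in
PostgreSQL, and no real storage or index access is performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one stored toasted value. */
typedef struct SketchStoredValue
{
    unsigned char *data;          /* reassembled bytes of the value */
    size_t         len;           /* length of data */
    bool           old_format;    /* true until the value is converted */
    int            dead_versions; /* old versions marked as deleted */
} SketchStoredValue;

/* Hook type: returns a malloc'ed new-format copy, or NULL on failure. */
typedef unsigned char *(*sketch_convert_hook_type) (const unsigned char *old,
                                                    size_t old_len,
                                                    size_t *new_len);

/* Hypothetical hook variable; NULL means "no conversion installed". */
static sketch_convert_hook_type sketch_convert_hook = NULL;

/* Fetch the whole value, converting and storing it back first if needed. */
static const unsigned char *
sketch_fetch_datum(SketchStoredValue *v, size_t *len)
{
    assert(v != NULL && len != NULL);
    if (v->old_format && sketch_convert_hook != NULL)
    {
        size_t         new_len = 0;
        unsigned char *converted = sketch_convert_hook(v->data, v->len, &new_len);

        if (converted == NULL)
            return NULL;
        free(v->data);          /* models marking the old chunks as deleted */
        v->data = converted;    /* models storing the converted value back */
        v->len = new_len;
        v->old_format = false;
        v->dead_versions++;
    }
    *len = v->len;
    return v->data;
}

/* Slice access goes through the full fetch, so conversion happens first. */
static size_t
sketch_fetch_datum_slice(SketchStoredValue *v, size_t offset, size_t length,
                         unsigned char *dst)
{
    size_t               total = 0;
    const unsigned char *full = sketch_fetch_datum(v, &total);

    if (full == NULL || offset >= total)
        return 0;
    if (length > total - offset)
        length = total - offset;
    memcpy(dst, full + offset, length);
    return length;
}

/* Toy conversion for demonstration: prepend a one-byte version tag. */
static unsigned char *
sketch_toy_convert(const unsigned char *old, size_t old_len, size_t *new_len)
{
    unsigned char *out = malloc(old_len + 1);

    if (out == NULL)
        return NULL;
    out[0] = 2;
    memcpy(out + 1, old, old_len);
    *new_len = old_len + 1;
    return out;
}
```

In this model the first access pays for the conversion and later accesses see
the new format directly. The slice offset is interpreted against the converted
value.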

Issues:
-------

1) Different chunk size in old and new format.

This is not an issue, because the old format is read and reassembled in the
conversion function, and this function can accept a different chunk size.
(Originally I intended to replace residx with an offset, but I think that is
not necessary for the upgrade now.)
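
To illustrate why the chunk size does not matter, here is a minimal C sketch
that concatenates chunks of any length in residx order and then splits the
result again with a different chunk size. SketchChunk, sketch_reassemble and
sketch_rechunk are hypothetical names, and the chunks are assumed to arrive
already sorted by residx, as an index scan on (valueid, residx) would return
them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified chunk: sequence number plus payload. */
typedef struct SketchChunk
{
    int32_t              residx;
    size_t               len;
    const unsigned char *data;
} SketchChunk;

/*
 * Concatenate chunks into dst. Each chunk may have any length, so the old
 * chunk size never needs to be known. Returns the total length, or
 * (size_t) -1 if a chunk is missing or out of order, or dst is too small.
 */
static size_t
sketch_reassemble(const SketchChunk *chunks, size_t nchunks,
                  unsigned char *dst, size_t dst_cap)
{
    size_t total = 0;

    for (size_t i = 0; i < nchunks; i++)
    {
        if (chunks[i].residx != (int32_t) i)
            return (size_t) -1;
        if (chunks[i].len > dst_cap - total)
            return (size_t) -1;
        memcpy(dst + total, chunks[i].data, chunks[i].len);
        total += chunks[i].len;
    }
    return total;
}

/*
 * Split src into chunks of at most new_chunk_size bytes. The chunks point
 * into src. Returns the number of chunks, or (size_t) -1 if out is too small.
 */
static size_t
sketch_rechunk(const unsigned char *src, size_t len, size_t new_chunk_size,
               SketchChunk *out, size_t out_cap)
{
    size_t n = 0;

    assert(new_chunk_size > 0);
    for (size_t off = 0; off < len; off += new_chunk_size)
    {
        if (n == out_cap)
            return (size_t) -1;
        out[n].residx = (int32_t) n;
        out[n].len = (len - off < new_chunk_size) ? len - off : new_chunk_size;
        out[n].data = src + off;
        n++;
    }
    return n;
}
```

Because residx is only used as an ordering key here, replacing it with a byte
offset is not needed for this step.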

2) Data type is unknown

Unfortunately, the low-level functions have no clue what data type is really
stored in the chunks. One idea for solving this is to add attno to the chunk
record structure. In my opinion it is the best solution, but it rules out
upgrading from 8.3->8.4!
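
A minimal C sketch of the attno idea follows, assuming a hypothetical chunk
record SketchChunkWithAttno and a hypothetical per-table descriptor
SketchTableDesc that maps column numbers to type identifiers. It only shows
how the extra field would let low-level code find the data type; it does not
reflect any existing on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Chunk record extended with the proposed attno field (hypothetical). */
typedef struct SketchChunkWithAttno
{
    uint32_t valueid;
    int32_t  residx;
    int16_t  attno;     /* 1-based column number in the owning table */
} SketchChunkWithAttno;

/* Hypothetical descriptor: type identifier per column, index attno - 1. */
typedef struct SketchTableDesc
{
    size_t          natts;
    const uint32_t *atttypid;
} SketchTableDesc;

/* Return the type identifier for a chunk, or 0 if attno is out of range. */
static uint32_t
sketch_chunk_type(const SketchTableDesc *desc, const SketchChunkWithAttno *chunk)
{
    assert(desc != NULL && chunk != NULL);
    if (chunk->attno < 1 || (size_t) chunk->attno > desc->natts)
        return 0;
    return desc->atttypid[chunk->attno - 1];
}
```

Chunks written by 8.3 have no such field, which is exactly why this approach
cannot serve an upgrade from 8.3.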

3) How to detect which toasted datum needs conversion

One idea is to do some magic with XMIN: everything older than the "upgrade"
transaction needs conversion. But vacuum can probably freeze some old or new
tuples, and after that we lose the information. Another possible solution is
to mark tuples on a converted page in the infomask.

    Ideas, comments?
            Zdenek            