Re: [WIP] In-place upgrade

Поиск
Список
Период
Сортировка
От Decibel!
Тема Re: [WIP] In-place upgrade
Дата
Msg-id 75081FC9-2909-496E-BAFB-2541636F66DA@decibel.org
обсуждение исходный текст
Ответ на Re: [WIP] In-place upgrade  (Bruce Momjian <bruce@momjian.us>)
Ответы Re: [WIP] In-place upgrade  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [WIP] In-place upgrade  (Zdenek Kotala <Zdenek.Kotala@Sun.COM>)
Список pgsql-hackers
On Nov 6, 2008, at 1:31 PM, Bruce Momjian wrote:
>> 3. What about multi-release upgrades?  Say someone wants to upgrade
>> from 8.3 to 8.6.  8.6 only knows how to read pages that are
>> 8.5-and-a-half or better, 8.5 only knows how to read pages that are
>> 8.4-and-a-half or better, and 8.4 only knows how to read pages that
>> are 8.3-and-a-half or better.  So the user will have to upgrade to
>> 8.3.MAX, then 8.4.MAX, then 8.5.MAX, and then 8.6.
>
> Yes.


I think that's pretty seriously un-desirable. It's not at all  
uncommon for databases to stick around for a very long time and then  
jump ahead many versions. I don't think we want to tell people they  
can't do that.

More importantly, I think we're barking up the wrong tree by putting  
migration knowledge into old versions. All that the old versions need  
to do is guarantee a specific amount of free space per page. We  
should provide a mechanism to tell a cluster what that free space  
requirement is, and not hard-code it into the backend.

Unless I'm mistaken, there are only two cases we care about for  
additional space: per-page and per-tuple. Those requirements could  
also vary for different types of pg_class objects. What we need is an  
API that allows an administrator to tell the database to start  
setting this space aside. One possibility:

pg_min_free_space( version, relkind, bytes_per_page, bytes_per_tuple );
pg_min_free_space_index( version, indexkind, bytes_per_page,  
bytes_per_tuple );

version: This would be provided as a safety mechanism. You would have  
to provide the major version that matches what the backend is  
running. See below for an example.

relkind: Essentially, heap vs toast, though I suppose it's possible  
we might need this for sequences.

indexkind: Because we support different types of indexes, I think we  
need to handle them differently than heap/toast. If we wanted, we  
could have a single function that demands that indexkind is NULL if  
relkind != 'index'.

bytes_per_(page|tuple): obvious. :)


Once we have an API, we need to get users to make use of it. I'm  
thinking add something like the following to the release notes:

"To upgrade from a prior version to 8.4, you will need to run some of  
the following commands, depending on what version you are currently  
using:

For version 8.3:
SELECT pg_min_free_space( '8.3', 'heap', 4, 12 );
SELECT pg_min_free_space( '8.3', 'toast', 4, 12 );

For version 8.2:
SELECT pg_min_free_space( '8.2', 'heap', 14, 12 );
SELECT pg_min_free_space( '8.2', 'toast', 14, 12 );
SELECT pg_min_free_space_index( '8.2', 'b-tree', 4, 4);"

(Note I'm just pulling numbers out of thin air in this example.)

As you can see, we pass in the version number to ensure that if  
someone accidentally cut and pastes the wrong stuff they know what  
they did wrong immediately.

One downside to this scheme is that it doesn't provide a mechanism to  
ensure that all required minimum free space requirements were passed  
in. Perhaps we want a function that takes an array of complex types  
and forces you to supply information for all known storage  
mechanisms. Another possibility would be to pass in some kind of  
binary format that contains a checksum.

Even if we do come up with a pretty fool-proof way to tell the old  
version what free space it needs to set aside, I think we should  
still have a mechanism for the new version to know exactly what the  
old version has set aside, and if it's actually been accomplished or  
not. One option that comes to mind is to add min_free_space_per_page  
and min_free_space_per_tuple to pg_class. Normally these fields would  
be NULL; the old version would only set them once it had verified  
that all pages in a given relation met those requirements (presumably  
via vacuum). The new version would check all these values on startup  
to ensure they made sense.

OTOH, we might not want to go mucking around with changing the  
catalog for older versions (I'm not even sure if we can). So perhaps  
it would be better to store this information in a separate table, or  
maybe a separate file. That might be best anyway; we generally  
wouldn't need this information, so it would be nice if it wasn't  
bloating pg_class all the time.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: Reducing some DDL Locks to ShareLock
Следующее
От: Decibel!
Дата:
Сообщение: Re: ALTER DATABASE SET TABLESPACE vs crash safety