Re: making relfilenodes 56 bits

From: Dilip Kumar
Subject: Re: making relfilenodes 56 bits
Date:
Msg-id: CAFiTN-vE=1H8c64oW-0Vc6fskTOpgShBNb9MbOE7sZXQpo=FoA@mail.gmail.com
In reply to: Re: making relfilenodes 56 bits  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Responses: Re: making relfilenodes 56 bits  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List: pgsql-hackers
On Tue, Jul 26, 2022 at 6:06 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:

Hi,
Note: please avoid top posting.

>         /*
>          * If relfilenumber is unspecified by the caller then create storage
> -        * with oid same as relid.
> +        * with relfilenumber same as relid if it is a system table otherwise
> +        * allocate a new relfilenumber.  For more details read comments atop
> +        * FirstNormalRelFileNumber declaration.
>          */
>         if (!RelFileNumberIsValid(relfilenumber))
> -           relfilenumber = relid;
> +       {
> +           relfilenumber = relid < FirstNormalObjectId ?
> +               relid : GetNewRelFileNumber();
>
> Above code says that in the case of a system table we want the relfilenode to be the same as the object id. This
> technically means that the relfilenode or oid for the system tables would not exceed 16383. However, in the below
> lines of code added in the patch, it says there is some chance of the storage path of user tables from the old
> cluster conflicting with the storage path of system tables in the new cluster. Assuming that the OIDs for user
> tables on the old cluster would start with 16384 (the first object ID), I see no reason why there would be a
> conflict.


Basically, the above comment says that the initial system table
storage will be created with the same relfilenumber as the Oid, so you
are right that it will not exceed 16383.  And the code below explains
the reason: we do it this way in order to avoid conflicts with user
tables from an older cluster.  Otherwise, in the new design, we have
no intention of keeping the relfilenode the same as the Oid.  But an
older cluster, which does not follow this new design, might have
user-table relfilenodes that conflict with system tables in the new
cluster.  So even with the new design we have to ensure that, when
creating the initial cluster, we keep the system table relfilenodes in
a low range, and directly using the Oid is the best way to achieve
that, rather than defining a completely new range and maintaining a
separate counter for it.

> +/* ----------
> + * RelFileNumber zero is InvalidRelFileNumber.
> + *
> + * For the system tables (OID < FirstNormalObjectId) the initial storage
> + * will be created with the relfilenumber same as their oid.  And, later for
> + * any storage the relfilenumber allocated by GetNewRelFileNumber() will start
> + * at 100000.  Thus, when upgrading from an older cluster, the relation storage
> + * path for the user table from the old cluster will not conflict with the
> + * relation storage path for the system table from the new cluster.  Anyway,
> + * the new cluster must not have any user tables while upgrading, so we needn't
> + * worry about them.
> + * ----------
> + */
> +#define FirstNormalRelFileNumber   ((RelFileNumber) 100000)
>
> ==
>
> When WAL logging the next object id, we have chosen the xlog threshold value as 8192, whereas for relfilenode it is
> 512. Any reason for choosing this low arbitrary value in the case of relfilenumber?

For Oid, when we cross the max value we wrap around, whereas for
relfilenumber we do not expect a wraparound within the lifetime of a
cluster.  So it is better not to log ahead as large a number of
relfilenumbers as we do for Oids.  OTOH, if we make it really low,
say 64, then we can see RelFileNumberGenLock showing up as a wait
event under very high concurrency, e.g. when 32 backends are
continuously creating/dropping tables.  So we chose 512: not so low
that it creates lock contention, and not so high that we need to worry
about wasting that many relfilenumbers on a crash.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


