Re: making relfilenodes 56 bits

From: Matthias van de Meent
Subject: Re: making relfilenodes 56 bits
Date:
Message-ID: CAEze2WgKaNhzckxGdjQ_V13E2nW6CiNyegHXWU+pAyGCnO6QXg@mail.gmail.com
In reply to: Re: making relfilenodes 56 bits  (Simon Riggs <simon.riggs@enterprisedb.com>)
List: pgsql-hackers
On Wed, 29 Jun 2022 at 14:41, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Tue, 28 Jun 2022 at 19:18, Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
>
> > I will be the first to admit that it is quite unlikely to be common
> > practise, but this workload increases the number of dbOid+spcOid
> > combinations to 100s (even while using only a single tablespace),
>
> Which should still fit nicely in 32bits then. Why does that present a
> problem to this idea?

It doesn't, or at least not the bit space part. I think it is indeed
quite unlikely that anyone will create as many tablespaces as the
Billion Tables Project, which used 1000 tablespaces to get around
file system limitations [0].

The potential problem is where to store such a mapping efficiently,
especially considering that this mapping might (and likely will)
change across restarts and whenever database churn (create + drop
database) happens, e.g. in testing workloads.

> The reason to mention this now is that it would give more space than
> 56bit limit being suggested here. I am not opposed to the current
> patch, just finding ways to remove some objections mentioned by
> others, if those became blockers.
>
> > which in my opinion requires some more thought than just handwaving it
> > into an smgr array and/or checkpoint records.
>
> The idea is that we would store the mapping as an array, with the
> value in the RelFileNode as the offset in the array. The array would
> be mostly static, so would cache nicely.

That part is not quite clear to me. Any cluster may have anywhere
between 3 and hundreds or thousands of entries in that mapping. Are
you suggesting that we dynamically grow that array (presumably shared,
considering the addressing is shared), or that we have a runtime
parameter limiting the number of those entries (similar to
max_connections)?
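
To make the question concrete, here is a minimal sketch of the
fixed-capacity variant, where the array offset stands in for the
(spcOid, dbOid) pair. Every name in it (LocMap, LocEntry,
LOC_MAP_CAPACITY, the loc_map_* functions) is hypothetical and made up
for illustration; none of it comes from the patch or from the
PostgreSQL source, and it ignores shared memory and locking entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch; not PostgreSQL code. */
typedef uint32_t Oid;

#define LOC_MAP_CAPACITY 1024        /* assumed fixed limit, set at startup */
#define LOC_MAP_INVALID  UINT32_MAX  /* returned when the array is full */

typedef struct LocEntry
{
    Oid spcOid;                      /* tablespace OID */
    Oid dbOid;                       /* database OID */
} LocEntry;

typedef struct LocMap
{
    uint32_t nused;                  /* number of assigned offsets */
    LocEntry entries[LOC_MAP_CAPACITY];
} LocMap;

void
loc_map_init(LocMap *map)
{
    assert(map != NULL);
    map->nused = 0;
}

/*
 * Return the array offset for (spcOid, dbOid), assigning the next free
 * offset if the pair has not been seen before.  Returns LOC_MAP_INVALID
 * once the fixed-size array is exhausted.
 */
uint32_t
loc_map_get_or_assign(LocMap *map, Oid spcOid, Oid dbOid)
{
    assert(map != NULL);

    /* Linear scan: fine for a sketch, too slow for thousands of entries. */
    for (uint32_t i = 0; i < map->nused; i++)
    {
        if (map->entries[i].spcOid == spcOid &&
            map->entries[i].dbOid == dbOid)
            return i;
    }

    if (map->nused >= LOC_MAP_CAPACITY)
        return LOC_MAP_INVALID;

    map->entries[map->nused].spcOid = spcOid;
    map->entries[map->nused].dbOid = dbOid;
    return map->nused++;
}

/* Translate an offset back into its OID pair, or NULL if unassigned. */
const LocEntry *
loc_map_resolve(const LocMap *map, uint32_t offset)
{
    assert(map != NULL);
    return (offset < map->nused) ? &map->entries[offset] : NULL;
}
```

In this sketch offsets are handed out in first-use order, so the same
(spcOid, dbOid) pair can end up at a different offset after a restart,
and a dropped database leaves a slot behind that is never reclaimed.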

> For convenience, I imagine that the mapping could be included in WAL
> in or near the checkpoint record, to ensure that the mapping was
> available in all backups.

Why would we need this mapping in backups, considering that it seems
to be transient state that is lost on restart? Won't we still use full
dbOid and spcOid in anything we communicate or store on disk (file
names, WAL, pg_class rows, etc.), or did I misunderstand your
proposal?

Kind regards,

Matthias van de Meent


[0] https://www.pgcon.org/2013/schedule/attachments/283_Billion_Tables_Project-PgCon2013.pdf


