Re: Reducing the size of BufferTag & remodeling forks

From: Andres Freund
Subject: Re: Reducing the size of BufferTag & remodeling forks
Msg-id 20150702140740.GD16267@alap3.anarazel.de
In reply to: Re: Reducing the size of BufferTag & remodeling forks  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On 2015-07-02 09:51:59 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > 1) Introduce a shared pg_relfilenode table. Every table, even
> >    shared/nailed ones, gets an entry therein. It's there to make it
> >    possible to uniquely allocate relfilenodes across databases &
> >    tablespaces.
> > 2) Replace relation forks, with the exception of the init fork which is
> >    special anyway, with separate relfilenodes, stored in separate
> >    columns in pg_class.
> 
> > Thoughts?
> 
> I'm concerned about the traffic and contention involved with #1.

I don't think that'll be that significant in comparison to all the other
work done when creating a relation. Unless we do something wrong,
row-level contention is highly unlikely, as the oids of the individual
relations will come from the oid counter or something similar.

> I'm also concerned about the assumption that relfilenode should,
> or even can be, unique across an entire installation.  (I suppose
> widening it to 8 bytes would fix some of the hazards there, but
> that bloats your buffer tag again.)

Why? Because it limits the number of relations & forks we can have to
2**32? That seems like an extraordinarily large limit. The catalog sizes
(pg_attribute most prominently) become a problem at a much lower number
of relations than that, as does rel/catcache management.

> But here's the big problem: you're talking about a huge amount of
> work for what seems likely to be a microscopic improvement in some
> operations.

I don't think it's microscopic at all. Just hacking away database &
tablespace from hashing & comparisons in the buffer tag (obviously not a
correct thing, but it works well enough for pgbench) results in quite
measurable performance benefits. But the main point isn't the
performance improvements themselves, but that it opens the door to
smarter buffer mapping algorithms, which e.g. will allow ordered access.
It would also be good to no longer have the current problem with
increasing the number of forks.
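
To make the size argument concrete, here is a minimal sketch, not the
actual patch. WideTag approximates the current tag layout (tablespace,
database, relation, fork, block); SlimTag and slim_tag_key() are
hypothetical names invented for illustration, assuming relfilenodes are
unique across the installation and forks have become separate
relfilenodes.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t BlockNumber;

/* Approximation of the current tag: five 4-byte fields. */
typedef struct WideTag
{
    Oid         spcNode;    /* tablespace */
    Oid         dbNode;     /* database */
    Oid         relNode;    /* relation */
    int32_t     forkNum;    /* fork number */
    BlockNumber blockNum;   /* block within the fork */
} WideTag;

/* Hypothetical reduced tag: a globally unique relfilenode plus a block. */
typedef struct SlimTag
{
    Oid         relfilenode;
    BlockNumber blockNum;
} SlimTag;

static_assert(sizeof(WideTag) == 20, "wide tag is 20 bytes");
static_assert(sizeof(SlimTag) == 8, "slim tag is 8 bytes");

/*
 * Pack the slim tag into one 64-bit key. Keys compare in
 * (relfilenode, blockNum) order, so an ordered mapping structure could
 * visit the blocks of one relation in sequence.
 */
static uint64_t
slim_tag_key(const SlimTag *tag)
{
    return ((uint64_t) tag->relfilenode << 32) | (uint64_t) tag->blockNum;
}
```

With an 8-byte tag, hashing and equality checks reduce to a single
64-bit operation, and the same integer could serve as the sort key of an
ordered mapping structure.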

> Worse, we'll be taking penalties for other operations.
> How will you do DropDatabaseBuffers() for instance?

> CREATE DATABASE is going to be a problem, too.

More prominently than that, without access to the database/tablespace we
couldn't even write out dirty buffers in a reasonable manner.  That's
why I think we're going to have to continue storing those two in the
buffer descriptors, just not include them in the buffer mapping.
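
A rough sketch of that split, assuming a descriptor layout invented
purely for illustration (SketchMapKey, SketchBufferDesc and
sketch_drop_database_buffers() are hypothetical names, not PostgreSQL
code): the mapping key carries only relfilenode and block, while the
database and tablespace sit next to it in the descriptor. An operation
like DropDatabaseBuffers() could then keep scanning the descriptors
linearly and filter on the database field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t BlockNumber;

/* Hypothetical mapping key: the only part that is hashed and compared. */
typedef struct SketchMapKey
{
    Oid         relfilenode;
    BlockNumber blockNum;
} SketchMapKey;

/* Hypothetical descriptor: db/tablespace are stored, but not in the key. */
typedef struct SketchBufferDesc
{
    SketchMapKey key;
    Oid          dbNode;    /* needed for write-out and per-database scans */
    Oid          spcNode;   /* needed to locate the file on disk */
    int          valid;     /* nonzero while the buffer holds a page */
} SketchBufferDesc;

/*
 * Invalidate every buffer that belongs to database dbid by walking all
 * descriptors; the mapping table is never consulted. Returns the number
 * of buffers invalidated.
 */
static size_t
sketch_drop_database_buffers(SketchBufferDesc *bufs, size_t nbufs, Oid dbid)
{
    size_t dropped = 0;

    assert(bufs != NULL || nbufs == 0);

    for (size_t i = 0; i < nbufs; i++)
    {
        if (bufs[i].valid && bufs[i].dbNode == dbid)
        {
            bufs[i].valid = 0;
            dropped++;
        }
    }
    return dropped;
}
```

The sketch leaves out locking, pinning and removal of the mapping
entries; it only shows that a per-database scan does not need the
database to be part of the lookup key.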

Greetings,

Andres Freund


