Обсуждение: Allow table AM's to cache stuff in relcache

Поиск
Список
Период
Сортировка

Allow table AM's to cache stuff in relcache

От
Heikki Linnakangas
Дата:
Index AM's can cache stuff in RelationData->rd_amcache. In the zedstore 
table AM we've been hacking on, I'd like to also use rd_amcache to cache 
some information, but that's not currently possible, because rd_amcache 
can only be used by index AMs, not table AMs.

Attached patch allows rd_amcache to also be used by table AMs.

While working on this, I noticed that the memory management of relcache 
entries is quite complicated. Most stuff that's part of a relcache entry 
is allocated in CacheMemoryContext. But some fields have a dedicated 
memory context to hold them, like rd_rulescxt for rules and rd_pdcxt for 
partition information. And indexes have rd_indexcxt to hold all kinds of 
index support info.

In the patch, I documented that rd_amcache must be allocated in 
CacheMemoryContext, or in rd_indexcxt if it's an index. It works, but 
it's a bit weird. It would nice to have one memory context in every 
relcache entry, to hold all the stuff related to it, including 
rd_amcache. In other words, it would be nice if we had "rd_indexcxt" for 
tables, too, not just indexes. That would allow tracking memory usage 
more accurately, if you're debugging an out of memory situation for example.

However, the special contexts like rd_rulescxt and rd_pdcxt would still 
be needed, because of the way RelationClearRelation preserves them, when 
rebuilding the relcache entry for an open relation. So I'm not sure how 
much it would really simplify things. Also, there's some overhead for 
having extra memory contexts, and some people already complain that the 
relcache uses too much memory.

Alternatively, we could document that rd_amcache should always be 
allocated in CacheMemoryContext, even for indexes. That would make the 
rule for pg_amcache straightforward. There's no particular reason why 
rd_amcache has to be allocated in rd_indexcxt, except for how it's 
accounted for in memory context dumps.

- Heikki

Вложения

Re: Allow table AM's to cache stuff in relcache

От
Tom Lane
Дата:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> Index AM's can cache stuff in RelationData->rd_amcache. In the zedstore 
> table AM we've been hacking on, I'd like to also use rd_amcache to cache 
> some information, but that's not currently possible, because rd_amcache 
> can only be used by index AMs, not table AMs.
> Attached patch allows rd_amcache to also be used by table AMs.

Seems reasonable.

> In the patch, I documented that rd_amcache must be allocated in 
> CacheMemoryContext, or in rd_indexcxt if it's an index. It works, but 
> it's a bit weird.

Given the way the patch is implemented, it doesn't really matter which
context it's in, does it?  The retail pfree is inessential but also
harmless, if rd_amcache is in rd_indexcxt.  So we could take out the
"must".  I think it's slightly preferable to use rd_indexcxt if available,
to reduce the amount of loose junk in CacheMemoryContext.

> It would nice to have one memory context in every 
> relcache entry, to hold all the stuff related to it, including 
> rd_amcache. In other words, it would be nice if we had "rd_indexcxt" for 
> tables, too, not just indexes. That would allow tracking memory usage 
> more accurately, if you're debugging an out of memory situation for example.

We had some discussion related to that in the "hyrax
vs. RelationBuildPartitionDesc" thread.  I'm not quite sure where
we'll settle on that, but some redesign seems inevitable.

            regards, tom lane



Re: Allow table AM's to cache stuff in relcache

От
Julien Rouhaud
Дата:
On Fri, Jun 14, 2019 at 5:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
> > Index AM's can cache stuff in RelationData->rd_amcache. In the zedstore
> > table AM we've been hacking on, I'd like to also use rd_amcache to cache
> > some information, but that's not currently possible, because rd_amcache
> > can only be used by index AMs, not table AMs.
> > Attached patch allows rd_amcache to also be used by table AMs.
>
> Seems reasonable.

+1.

> > In the patch, I documented that rd_amcache must be allocated in
> > CacheMemoryContext, or in rd_indexcxt if it's an index. It works, but
> > it's a bit weird.
>
> Given the way the patch is implemented, it doesn't really matter which
> context it's in, does it?  The retail pfree is inessential but also
> harmless, if rd_amcache is in rd_indexcxt.  So we could take out the
> "must".  I think it's slightly preferable to use rd_indexcxt if available,
> to reduce the amount of loose junk in CacheMemoryContext.

I agree that for indexes the context used won't make much difference.
But IMHO avoiding some bloat in CacheMemoryContext is a good enough
reason to document using rd_indexcxt when available.

> > It would nice to have one memory context in every
> > relcache entry, to hold all the stuff related to it, including
> > rd_amcache. In other words, it would be nice if we had "rd_indexcxt" for
> > tables, too, not just indexes. That would allow tracking memory usage
> > more accurately, if you're debugging an out of memory situation for example.
>
> We had some discussion related to that in the "hyrax
> vs. RelationBuildPartitionDesc" thread.  I'm not quite sure where
> we'll settle on that, but some redesign seems inevitable.

There wasn't any progress on this since last month, and this patch
won't make the situation any worse.  I'll mark this patch as ready for
committer, as it may save some time for people working on custom table
AM.



Re: Allow table AM's to cache stuff in relcache

От
Heikki Linnakangas
Дата:
On 12/07/2019 16:07, Julien Rouhaud wrote:
> On Fri, Jun 14, 2019 at 5:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>>> In the patch, I documented that rd_amcache must be allocated in
>>> CacheMemoryContext, or in rd_indexcxt if it's an index. It works, but
>>> it's a bit weird.
>>
>> Given the way the patch is implemented, it doesn't really matter which
>> context it's in, does it?  The retail pfree is inessential but also
>> harmless, if rd_amcache is in rd_indexcxt.  So we could take out the
>> "must".  I think it's slightly preferable to use rd_indexcxt if available,
>> to reduce the amount of loose junk in CacheMemoryContext.
> 
> I agree that for indexes the context used won't make much difference.
> But IMHO avoiding some bloat in CacheMemoryContext is a good enough
> reason to document using rd_indexcxt when available.

Right, it doesn't really matter whether an index AM uses 
CacheMemoryContext or rd_indexctx, the code works either way. I think 
it's better to give clear advice though, one way or another. Otherwise, 
different index AM's can end up doing it differently for no particular 
reason, which seems confusing.

>>> It would nice to have one memory context in every
>>> relcache entry, to hold all the stuff related to it, including
>>> rd_amcache. In other words, it would be nice if we had "rd_indexcxt" for
>>> tables, too, not just indexes. That would allow tracking memory usage
>>> more accurately, if you're debugging an out of memory situation for example.
>>
>> We had some discussion related to that in the "hyrax
>> vs. RelationBuildPartitionDesc" thread.  I'm not quite sure where
>> we'll settle on that, but some redesign seems inevitable.
> 
> There wasn't any progress on this since last month, and this patch
> won't make the situation any worse.  I'll mark this patch as ready for
> committer, as it may save some time for people working on custom table
> AM.

Pushed, thanks for the review! As Tom noted, some redesign here seems 
inevitable, but this patch shouldn't get in the way of that, so no need 
to hold this back for the redesign.

- Heikki