On Fri, Dec 18, 2020 at 11:54:39AM -0600, Justin Pryzby wrote:
> On Fri, Dec 18, 2020 at 07:39:14AM +0530, Bharath Rupireddy wrote:
> > On Fri, Dec 18, 2020 at 2:14 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > Are you thinking that TableInsertState would eventually have additional
> > > attributes which would apply to other tableams, but not to heap ? Is
> > > heap_insert_begin() really specific to heap ? It's allocating and populating a
> > > structure based on its arguments, but those same arguments would be passed to
> > > every other AM's insert_begin routine, too. Do you need a more flexible data
> > > structure, something that would also accomodate extensions? I'm thinking of
> > > reloptions as a loose analogy.
> >
> > I could not think of other tableam attributes now. But +1 to have that
> > kind of flexible structure for TableInsertState. So, it can have
> > tableam type and attributes within the union for each type.
>
> Right now you have heap_insert_begin(), and I asked if it was really
> heap-specific. Right now, it populates a struct based on a static list of
> arguments, which are what heap uses.
>
> If you were to implement a burp_insert_begin(), how would it differ from
> heap's? With the current API, they'd (have to) be the same, which means either
> that it should apply to all AMs (or have a "default" implementation that can be
> overridden by an AM), or that this API assumes that other AMs will want to do
> exactly what heap does, and fails to allow other AMs to implement optimizations
> for bulk inserts as claimed.
>
> I don't think using a "union" solves the problem, since it can only accommodate
> core AMs, and not extensions, so I suggested something like reloptions, which
> have a "namespace" prefix (and core has toast.*, like ALTER TABLE t SET
> toast.autovacuum_enabled).
I think you'd want to handle things like:
- a compressed AM wants to specify a threshold for a tuple's *compressed* size
(maybe in addition to the uncompressed size);
- a "columnar" AM wants to specify a threshold size for a column, rather
than for each tuple;
I'm not proposing to handle those specific parameters, but rather pointing out
that your implementation doesn't allow handling AM-specific considerations,
which I think was the goal.
The TableInsertState structure would need to store those, and then the AM's
multi_insert_v2 routine would need to make use of them.
It feels a bit like we'd introduce the idea of an "AM option", except that it
wouldn't be user-facing (or maybe some of them would be?). Maybe I've
misunderstood though, so other opinions are welcome.
--
Justin