Re: New Table Access Methods for Multi and Single Inserts
From: Luc Vlaming
Subject: Re: New Table Access Methods for Multi and Single Inserts
Date:
Msg-id: c2ba9c52-8c5a-13f0-7d50-b569f7c93840@swarm64.com
In reply to: Re: New Table Access Methods for Multi and Single Inserts (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List: pgsql-hackers
On 06-01-2021 14:06, Bharath Rupireddy wrote:
> On Wed, Jan 6, 2021 at 12:56 PM Luc Vlaming <luc@swarm64.com> wrote:
>> The main reason for me for wanting a single API is that I would like the
>> decision of using single or multi inserts to move to inside the tableam.
>> For e.g. a heap insert we might want to put the threshold at e.g. 100
>> rows so that the overhead of buffering the tuples is actually
>> compensated. For other tableams this logic might also be quite different,
>> and I think therefore that it shouldn't be e.g. COPY or CTAS deciding
>> whether or not multi inserts should be used. Because otherwise the thing
>> we'll get is that there will be tableams that will ignore this flag and
>> do their own thing anyway. I'd rather have an API that gives all
>> necessary information to the tableam and then make the tableam do "the
>> right thing".
>>
>> Another reason I'm suggesting this API is that I would expect that the
>> begin is called in a different place in the code for the (multiple)
>> inserts than the actual insert statement.
>> To me conceptually the begin and end are like e.g. the executor begin
>> and end: you prepare the inserts with the knowledge you have at that
>> point. I assumed (wrongly?) that during the start of the statement one
>> knows best how many rows are coming; and then the actual insertion of
>> the row doesn't have to deal anymore with multi/single inserts, choosing
>> when to buffer or not, because that information has already been given
>> during the initial phase. One of the reasons this is appealing to me is
>> that e.g. in [1] there was discussion on when to switch to a multi
>> insert state, and imo this should be up to the tableam.
> Agree that whether to go with the multi or single inserts should be
> completely left to the tableam implementation; we, as callers of those
> APIs, just need to inform whether we expect single or multiple rows, and
> it should be left to the tableam implementation whether to actually go
> with buffering or single inserts. ISTM that it's an elegant way of making
> the API generic and abstracting everything from the callers. What I
> wonder is how can we know in advance the expected row count that we
> need to pass in to heap_insert_begin()? IIUC, we can not estimate the
> upcoming rows in COPY, Insert Into Select, or Refresh Mat View or some
> other insert queries. Of course, we can look at the planner's
> estimated row count for the selects in COPY, Insert Into Select or
> Refresh Mat View after the planning, but to me that's not something we
> can depend on and pass in the row count to the insert APIs.
>
> When we don't know the expected row count, why can't we (as callers of
> the APIs) tell the APIs something like, "I'm intending to perform
> multi inserts, so if possible and if you have a mechanism to buffer
> the slots, do it, otherwise insert the tuples one by one, or else do
> whatever you want to do with the tuples I give to you". So, in case of
> COPY we can ask the API for multi inserts and call heap_insert_begin()
> and heap_insert_v2().

I thought that when it is available (because of planning) it would be
nice to pass it in. If you don't know you could pass in a 1 for doing
single inserts, and e.g. -1 or max-int for streaming. The reason I
proposed it is so that tableams have as much knowledge as possible to
do the right thing. is_multi does also work of course but is just
somewhat less informative. What to me seemed somewhat counterintuitive
is that with the proposed API it is possible to say is_multi=true and
then still call heap_insert_v2 to do a single insert.

> Given the above explanation, I still feel bool is_multi would suffice.
>
> Thoughts?
> On dynamically switching from single to multi inserts, this can be
> done by heap_insert_v2 itself. The way I think it's possible is that,
> say we have some threshold row count 1000 (can be a macro); after
> inserting those many tuples, heap_insert_v2 can switch to buffering
> mode.

For that I thought it'd be good to use the expected row count, but yeah,
dynamically switching also works and might work better if the expected
row counts are usually off.

> Thoughts?
>
>> Which would make the code something like:
>>
>> void
>> heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
>> {
>>     TupleTableSlot *batchslot;
>>     HeapMultiInsertState *mistate = (HeapMultiInsertState *) state->mistate;
>>     Size sz;
>>
>>     Assert(mistate && mistate->slots);
>>
>>     if (mistate->slots[mistate->cur_slots] == NULL)
>>         mistate->slots[mistate->cur_slots] =
>>             table_slot_create(state->rel, NULL);
>>
>>     batchslot = mistate->slots[mistate->cur_slots];
>>
>>     ExecClearTuple(batchslot);
>>     ExecCopySlot(batchslot, slot);
>>
>>     /*
>>      * Calculate the tuple size after the original slot is copied, because
>>      * the copied slot type and the tuple size may change.
>>      */
>>     sz = GetTupleSize(batchslot, mistate->max_size);
>>
>>     Assert(sz > 0);
>>
>>     mistate->cur_slots++;
>>     mistate->cur_size += sz;
>>
>>     if (mistate->cur_slots >= mistate->max_slots ||
>>         mistate->cur_size >= mistate->max_size)
>>         heap_multi_insert_flush(state);
>> }

> I think clearing tuples before copying the slot as you suggested may
> work without the need of the clear_slots flag.

ok, cool :)

>>> > Also, why do we want to do ExecClearTuple() anyway? Isn't
>>> > it good enough that the next call to ExecCopySlot will effectively
>>> > clear it out?
>>>
>>> For virtual, heap, minimal tuple slots, yes, ExecCopySlot clears the
>>> slot before copying. But, for buffer heap slots, the
>>> tts_buffer_heap_copyslot does not always clear the destination slot,
>>> see below.
>>> If we fall into the else condition, we might get some issues. And
>>> also note that, once the slot is cleared in ExecClearTuple, it will not
>>> be cleared again in ExecCopySlot because TTS_SHOULDFREE(slot) will be
>>> false. That is why, let's have ExecClearTuple as is.
>>>
>> I had no idea the buffer heap slot doesn't unconditionally clear out the
>> slot :( So yes, let's call it unconditionally ourselves. See also
>> suggestion above.

> Yeah, we will clear the tuple slot before copy to be on the safer side.

ok

>>> /*
>>>  * If the source slot is of a different kind, or is a buffer slot
>>>  * that has been materialized / is virtual, make a new copy of the
>>>  * tuple. Otherwise make a new reference to the in-buffer tuple.
>>>  */
>>> if (dstslot->tts_ops != srcslot->tts_ops ||
>>>     TTS_SHOULDFREE(srcslot) ||
>>>     !bsrcslot->base.tuple)
>>> {
>>>     MemoryContext oldContext;
>>>
>>>     ExecClearTuple(dstslot);
>>> }
>>> else
>>> {
>>>     Assert(BufferIsValid(bsrcslot->buffer));
>>>
>>>     tts_buffer_heap_store_tuple(dstslot, bsrcslot->base.tuple,
>>>                                 bsrcslot->buffer, false);

>>> > - flushed -> why is this a stored boolean? isn't this indirectly
>>> > encoded by cur_slots/cur_size == 0?
>>>
>>> Note that cur_slots is in HeapMultiInsertState, and outside of the new
>>> APIs, i.e. in TableInsertState, mistate is a void pointer, and we can't
>>> really access cur_slots. I mean, we can access it, but we need to be
>>> dereferencing using the tableam kind. Instead of doing all of that, to
>>> keep the API cleaner, I chose to have a boolean in the TableInsertState
>>> which we can see and use outside of the new APIs. Hope that's fine.
>>>
>> So you mean the flushed variable is actually there to tell the user of
>> the API that they are supposed to call flush before end? Why can't the
>> end call flush itself then? I guess I completely misunderstood the
>> purpose of table_multi_insert_flush being public.
>> I had assumed it is there to indicate from the usage site that now
>> would be a good time to flush, e.g. because of a statement ending or
>> something. I had not understood that it is always required to do
>> table_multi_insert_flush + table_insert_end.
>> IMHO I would hide this from the callee, given that you would only
>> really call flush yourself when you would immediately afterwards call
>> end, or are there other cases where one would be required to explicitly
>> call flush?

> We need to know outside the multi_insert API whether the buffered
> slots in case of multi inserts are flushed. The reason is that if we
> have indexes or after row triggers, currently we call
> ExecInsertIndexTuples or ExecARInsertTriggers on the buffered slots
> outside the API in a loop after the flush.
>
> If we agree on removing the heap_multi_insert_v2 API and embedding that
> logic inside heap_insert_v2, then we can do this - pass the required
> information and the functions ExecInsertIndexTuples and
> ExecARInsertTriggers as callbacks so that, whether or not
> heap_insert_v2 chooses single or multi inserts, it can call back these
> functions with the required information after the flush. We can
> add the callback and required information into TableInsertState. But,
> I'm not quite sure we would make ExecInsertIndexTuples and
> ExecARInsertTriggers. And in
>
> If we don't want to go the callback way, then at least we need to
> know whether or not heap_insert_v2 has chosen multi inserts; if yes,
> the buffered slots array, the number of currently buffered slots, and
> whether they are flushed or not in the TableInsertState. Then,
> eventually, we might need all the HeapMultiInsertState info in the
> TableInsertState.

To me the callback API seems cleaner: on heap_insert_begin we can pass
in a callback that is called on every flushed slot, or only on
multi-insert flushes. Is there a reason it would only be done for
multi-insert flushes or can it be generic?
> Thoughts?
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com

Hi,

Replied inline.

Kind regards,
Luc