Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

From: Jim Nasby
Subject: Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
Msg-id: c537b2e7-38cd-0507-2255-69541c9da7b9@BlueTreble.com
In reply to: Re: [HACKERS] Faster methods for getting SPI results (460% improvement)  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List: pgsql-hackers

On 1/23/17 9:23 PM, Jim Nasby wrote:
> I think the last step here is to figure out how to support switching
> between the current behavior and the "columnar" behavior of a dict of lists.

I've thought more about this... instead of trying to switch from the 
current situation of one choice of how results are returned to two 
choices, I think it'd be better to just expose the API that the new 
Destination type provides to SPI. Specifically, execute a Python 
function during Portal startup, and a different function for receiving 
tuples. There'd be an optional third function for Portal shutdown.

The startup function would be handed details of the result set it was 
about to receive, as a list that contained Python tuples with the 
results of SPI_fname, SPI_gettype and SPI_gettypeid for each column. 
This function would return a callback version number and a Python 
object that would be kept in the DestReceiver.
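
To make that concrete, here is a minimal sketch of what a startup function for the "columnar" (dict of lists) case might look like. Nothing here is an existing plpy API: the name columnar_startup, the constant CALLBACK_VERSION and the exact (name, type name, type OID) tuple layout are assumptions made for illustration only.

```python
# Hypothetical callback version; the proposal does not fix a numbering scheme.
CALLBACK_VERSION = 1


def columnar_startup(columns):
    """Build the state object for a dict-of-lists result.

    `columns` is assumed to be a list of (name, type_name, type_oid)
    tuples, mirroring SPI_fname, SPI_gettype and SPI_gettypeid.
    """
    # One empty list per column; the receiver would append to these.
    state = {name: [] for name, _type_name, _type_oid in columns}
    return CALLBACK_VERSION, state


# Example call with two columns (23 and 25 are the OIDs of int4 and text).
version, state = columnar_startup([("id", "int4", 23), ("name", "text", 25)])
```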

The receiver function would get the object created by the startup 
function, as well as a Python tuple built from the TupleTableSlot after 
it had gone through type conversion. It would need to add the row's 
values to the object from the startup function. It would return true or 
false, just like a Portal receiver function does.

The shutdown function would receive the object that's been passed 
around. It would be able to do any post-processing. Whatever it returned 
is what would be handed back to Python as the results of the query.
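
A rough sketch of how the three callbacks might fit together, written in plain Python so it can be run outside the backend. The run_portal function is only a stand-in for the C-side DestReceiver driver and is entirely hypothetical, as are the callback names. Two behaviors are assumptions rather than part of the proposal: the driver stops feeding rows once the receiver returns false, and the state object itself is returned when no shutdown function is given.

```python
def startup(columns):
    # columns: assumed list of (name, type_name, type_oid) tuples.
    names = [name for name, _type_name, _type_oid in columns]
    return 1, {"names": names, "data": {n: [] for n in names}}


def receive(state, row):
    # row: one already-converted tuple of Python values.
    for name, value in zip(state["names"], row):
        state["data"][name].append(value)
    return True  # keep receiving


def shutdown(state):
    # Post-processing hook; here it just exposes the dict of lists.
    return state["data"]


def run_portal(columns, rows, portal_functions):
    """Hypothetical stand-in for the C-side driver of the callbacks."""
    startup_fn, receive_fn = portal_functions[0], portal_functions[1]
    shutdown_fn = portal_functions[2] if len(portal_functions) > 2 else None
    _version, state = startup_fn(columns)
    for row in rows:
        if not receive_fn(state, row):
            break  # assumption: a false return stops the scan
    # Assumption: without a shutdown function, hand back the state as-is.
    return shutdown_fn(state) if shutdown_fn is not None else state


columns = [("id", "int4", 23), ("name", "text", 25)]
rows = [(1, "a"), (2, "b")]
result = run_portal(columns, rows, (startup, receive, shutdown))
```

With these inputs, result is the columnar form: one list of values per column name.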

The version number returned by the startup function allows for future 
improvements to this facility. One idea there is allowing the startup 
function to control how Datums get mapped into Python objects.

In order to support all of this without breaking backwards compatibility 
or forking a new API, plpy.execute would accept a kwdict, to avoid 
conflicting with the arbitrary number of arguments that can currently be 
accepted. We'd look in the kwdict for a key called "portal_functions" 
pointing at a 2- or 3-element tuple of the startup, receive and 
(optionally) shutdown functions. plpy would pre-define a tuple that 
provides the current behavior, and that's what would be used by default. 
In the future, we might add a way to control the default.
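
For illustration, the kwdict lookup could behave roughly like the sketch below. This is not how plpy.execute is implemented today; resolve_portal_functions, DEFAULT_PORTAL_FUNCTIONS and the default_* callbacks are hypothetical names, and the default shown only approximates the current row-oriented result with a plain list of dicts.

```python
def default_startup(columns):
    # columns: assumed list of (name, type_name, type_oid) tuples.
    names = [name for name, _type_name, _type_oid in columns]
    return 1, {"names": names, "rows": []}


def default_receive(state, row):
    state["rows"].append(dict(zip(state["names"], row)))
    return True


def default_shutdown(state):
    return state["rows"]


# Hypothetical pre-defined tuple standing in for the current behavior.
DEFAULT_PORTAL_FUNCTIONS = (default_startup, default_receive, default_shutdown)


def resolve_portal_functions(kwargs):
    """Pick the callbacks from a keyword dict, falling back to the default."""
    funcs = kwargs.get("portal_functions", DEFAULT_PORTAL_FUNCTIONS)
    if len(funcs) not in (2, 3):
        raise ValueError("portal_functions must hold 2 or 3 callables")
    if not all(callable(f) for f in funcs):
        raise TypeError("every entry in portal_functions must be callable")
    startup, receive = funcs[0], funcs[1]
    shutdown = funcs[2] if len(funcs) == 3 else None  # shutdown is optional
    return startup, receive, shutdown


# No key given: the default tuple is used.
chosen = resolve_portal_functions({})
```

A caller wanting different behavior would then pass something like portal_functions=(my_startup, my_receive) as a keyword argument, leaving the positional arguments untouched.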

Comments?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


