Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

From	Jim Nasby
Subject	Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
Date	January 25, 2017, 07:43:27
Msg-id	7f5fddde-1c13-f253-c161-9075ec096a28@BlueTreble.com
In reply to	Re: [HACKERS] Faster methods for getting SPI results (460% improvement) (Craig Ringer <craig@2ndquadrant.com>)
Replies	Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
List	pgsql-hackers

On 1/23/17 10:36 PM, Craig Ringer wrote:
> which is currently returned as
>
> [ {"a":1, "b":10}, {"a":2, "b":20} ]
>
> instead as
>
> { "a": [1, 2], "b": [10, 20] }

Correct.
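
To make the two shapes concrete, here is a minimal sketch in plain Python of pivoting the row-oriented result into the column-oriented one. The helper name rows_to_columns is hypothetical and is not part of plpython or the patch; it only illustrates the difference in layout.

```python
def rows_to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists.

    Hypothetical helper for illustration only; not a plpython API.
    Assumes every row carries the same keys.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            # Create the column list on first sight, then append in row order.
            columns.setdefault(name, []).append(value)
    return columns


# Row-oriented: one dict per row (the shape returned today).
rows = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]

# Column-oriented: one list per column (the proposed alternative).
columns = rows_to_columns(rows)
```

Doing this pivot in Python after the fact costs an extra pass over every value, which is the overhead that building the columns directly during result retrieval would avoid.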

> If so I see that as a lot more of a niche thing. I can see why it'd be
> useful and would help performance, but it seems much more disruptive.
> It requires users to discover it exists, actively adopt a different
> style of ingesting data, etc. For a 10%-ish gain in a PL.

In data science, what we're doing now is actually the niche. All real 
analytics happens with something like a Pandas DataFrame, which is 
organized as a dict of lists.

This isn't just idle nomenclature either: organizing results in what 
amounts to a column store provides a significant speed improvement for 
most analytics, because you're working on an array of contiguous memory 
(at least, when you're using more advanced types like DataFrames and 
Series).
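
As a rough illustration of the layout argument, the sketch below uses the standard-library array module as a stand-in for a DataFrame column. That substitution is an assumption made to keep the example dependency-free; Pandas and NumPy are what would actually be used in practice.

```python
from array import array

# Row-oriented: each value lives inside a separate dict object.
rows = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
total_from_rows = sum(row["b"] for row in rows)  # one dict lookup per row

# Column-oriented: each column is one typed, contiguous buffer.
# array("q") holds signed 64-bit integers back to back in memory.
columns = {
    "a": array("q", [1, 2]),
    "b": array("q", [10, 20]),
}
total_from_columns = sum(columns["b"])  # a single pass over one buffer
```

Both totals are the same; the difference is that the columnar version scans one buffer instead of chasing a pointer per row, which is what lets vectorized libraries operate on it efficiently.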

> I strongly suggest making this design effort a separate thread, and
> focusing on the SPI improvements that give "free" no-user-action
> performance boosts here.

Fair enough. I posted the SPI portion of that yesterday. That should be 
useful for pl/R and possibly pl/perl. pl/tcl could make use of it, but 
it would end up executing arbitrary tcl code in the middle of portal 
execution, which doesn't strike me as a great idea. Unfortunately, I 
don't think plpgsql could make much use of this for similar reasons.

I'll post a plpython patch that doesn't add the output format control.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
