Re: [GENERAL] Perfomance of IN-clause with many elements and possible solutions

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: [GENERAL] Perfomance of IN-clause with many elements and possible solutions
Дата
Msg-id CAMkU=1yD+5vLUyJuzML6DYvPKMrAxpD_eKsmL=2JqdC81uLejQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [GENERAL] Perfomance of IN-clause with many elements and possiblesolutions  (Dmitry Lazurkin <dilaz03@gmail.com>)
Список pgsql-general
On Tue, Aug 1, 2017 at 9:24 AM, Dmitry Lazurkin <dilaz03@gmail.com> wrote:
On 08/01/2017 07:13 PM, Jeff Janes wrote:
I think that HashSet is a Java-specific term.  It is just a hash table in which there is no data to store, just the key itself (and probably a cash of the hashcode of that key), correct? 

Yes. And in Java HashSet implemented on top of HashMap (:

I think a more general solution would be to get the planner and executor to run the in-list query using the Hash Join, the same way it runs the in-VALUES one.

Have additional plan nodes big overhead?

I don't think a new plan node will have meaningfully more overhead than new code of equal generality and robustness which has just been bolted on to the side of an existing node.  I don't know how to test that, though.  If the existing hash code is slow, it would probably be better to spend time improving it for everyone than coming up with yet another implementation.  I know the existing simplehash.h code as specialized for tuples in src/backend/executor/execGrouping.c is horribly inefficient with memory usage when the tuples are skinny, but that shouldn't be a problem for the number of values likely to be present in a hard-coded in-list.
 


I was impressed at how well the JSON and hstore worked, you might want to look at how they do it.  It is must be using an internal hash table of some sort.

JSONB and HSTORE keep sorted pairs and use binary search.

Ah.  that would be another way to approach it.  How many types have btree ordering functions without hashing functions, or the reverse?

Cheers,

Jeff

В списке pgsql-general по дате отправления:

Предыдущее
От: Merlin Moncure
Дата:
Сообщение: Re: [GENERAL] Shared Constants in PLPGSQL
Следующее
От: Alex Samad
Дата:
Сообщение: Re: [GENERAL] Question about loading up a table