Re: Do we want a hashset type?

From Tomas Vondra
Subject Re: Do we want a hashset type?
Date
Msg-id 79337bf2-a783-d424-da57-bece3d64528c@enterprisedb.com
In response to Re: Do we want a hashset type?  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: Do we want a hashset type?
Re: Do we want a hashset type?
List pgsql-hackers

On 6/18/23 18:45, Andrew Dunstan wrote:
> 
> On 2023-06-16 Fr 20:38, Joel Jacobson wrote:
>>
>> New patch is attached, which will henceforth always be a complete patch,
>> to avoid the hassle of having to assemble incremental patches.
> 
> 
> Cool, thanks.
> 

It might still be convenient to keep it split into smaller,
easier-to-review parts: one patch that introduces the basic functionality,
followed by patches adding the various "advanced" features.

> 
> A couple of random thoughts:
> 
> 
> . It might be worth sending a version number with the send function
> (c.f. jsonb_send / jsonb_recv). That way we would not be tied forever
> to some wire representation.
> 
> . I think there are some important set operations missing: most notably
> intersection, slightly less importantly asymmetric and symmetric
> difference. I have no idea how easy these would be to add, but even for
> your stated use I should have thought set intersection would be useful
> ("Who is a member of both this set of friends and that set of friends?").
> 
> . While supporting int4 only is OK for now, I think we would at least
> want to support int8, and probably UUID since a number of systems I know
> of use that as an object identifier.
> 
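
To make the version-number point concrete, here is a minimal sketch in
plain C. It does not use the backend's StringInfo/pq_* routines, and the
names (set_send, set_recv, WIRE_VERSION) and the byte layout are
hypothetical, not taken from the patch. The only idea it illustrates is
that the first byte identifies the format, so the receiver can reject (or
later, dispatch on) versions it does not know.

```c
#include <stddef.h>
#include <stdint.h>

#define WIRE_VERSION 1          /* hypothetical format version */

/* Store a 32-bit value in network (big-endian) byte order. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char) (v >> 24);
    p[1] = (unsigned char) (v >> 16);
    p[2] = (unsigned char) (v >> 8);
    p[3] = (unsigned char) v;
}

/* Read a 32-bit big-endian value. */
static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/*
 * Assumed layout: 1 version byte, 4-byte element count, then the elements.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t set_send(const int32_t *vals, uint32_t n,
                unsigned char *buf, size_t cap)
{
    size_t need = 1 + 4 + (size_t) n * 4;

    if (cap < need)
        return 0;
    buf[0] = WIRE_VERSION;
    put_u32(buf + 1, n);
    for (uint32_t i = 0; i < n; i++)
        put_u32(buf + 5 + (size_t) i * 4, (uint32_t) vals[i]);
    return need;
}

/*
 * Returns the number of elements read, or -1 if the version is unknown,
 * the message is truncated, or out is too small.
 */
long set_recv(const unsigned char *buf, size_t len,
              int32_t *out, uint32_t out_cap)
{
    uint32_t n;

    if (len < 5 || buf[0] != WIRE_VERSION)
        return -1;
    n = get_u32(buf + 1);
    if (n > out_cap || (len - 5) / 4 < n)
        return -1;
    for (uint32_t i = 0; i < n; i++)
        out[i] = (int32_t) get_u32(buf + 5 + (size_t) i * 4);
    return (long) n;
}
```

If the representation changes later, set_recv can branch on buf[0] and
keep accepting the old layout.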

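For reference, the semantics of the set operations mentioned above can be
sketched as follows. This is only an illustration: it works on sorted,
duplicate-free int32 arrays with a merge-style scan, not on the patch's
hash table, and the function names are made up. A hash-based version would
instead probe one set for each element of the other.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * All functions assume a and b are sorted ascending with no duplicates,
 * and that out has room for the result (at most na + nb elements).
 * Each returns the number of elements written to out.
 */

/* Elements present in both a and b. */
size_t set_intersect(const int32_t *a, size_t na,
                     const int32_t *b, size_t nb, int32_t *out)
{
    size_t i = 0, j = 0, k = 0;

    while (i < na && j < nb)
    {
        if (a[i] < b[j])
            i++;
        else if (a[i] > b[j])
            j++;
        else
        {
            out[k++] = a[i];
            i++;
            j++;
        }
    }
    return k;
}

/* Asymmetric difference: elements of a that are not in b. */
size_t set_difference(const int32_t *a, size_t na,
                      const int32_t *b, size_t nb, int32_t *out)
{
    size_t i = 0, j = 0, k = 0;

    while (i < na)
    {
        if (j >= nb || a[i] < b[j])
            out[k++] = a[i++];
        else if (a[i] > b[j])
            j++;
        else
        {
            i++;
            j++;
        }
    }
    return k;
}

/* Symmetric difference: elements in exactly one of a and b. */
size_t set_symdiff(const int32_t *a, size_t na,
                   const int32_t *b, size_t nb, int32_t *out)
{
    size_t i = 0, j = 0, k = 0;

    while (i < na || j < nb)
    {
        if (j >= nb || (i < na && a[i] < b[j]))
            out[k++] = a[i++];
        else if (i >= na || b[j] < a[i])
            out[k++] = b[j++];
        else
        {
            i++;
            j++;
        }
    }
    return k;
}
```

With the "friends" example, set_intersect on two users' friend lists
answers "who is a member of both".
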
I agree we should aim to support a wider range of data types. Could we
have a polymorphic type, similar to what we do for arrays and ranges? In
fact, CREATE TYPE allows specifying ELEMENT, so wouldn't it be possible
to implement this as a special variant of an array? Would be better than
having a set of functions for every supported data type.

(Note: It might still be possible to have a special implementation for
selected fixed-length data types, as it allows optimization at compile
time. But that could be done later.)


The other thing I've been thinking about is the SQL syntax and what
the SQL standard says about this.

AFAICS the standard only defines arrays and multisets. Arrays are pretty
much the thing we have, including the ARRAY[] constructor etc. Multisets
are similar to the hashset discussed here, except that they track the
number of occurrences of each value (which would be trivial to add to
hashset).
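
To show why the per-value count is a small step from a hash set, here is a
rough sketch of an open-addressing table where each slot carries a count.
Everything in it (MultiSet, mset_add, mset_count, the fixed capacity, the
hash function) is hypothetical and unrelated to the actual patch; it has
no resizing and no deletion.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSET_CAP 64             /* fixed power-of-two capacity, sketch only */

typedef struct
{
    int32_t     key;
    uint32_t    count;          /* 0 means the slot is empty */
} MSetSlot;

typedef struct
{
    MSetSlot    slots[MSET_CAP];
    size_t      used;           /* number of distinct values stored */
} MultiSet;

/* Simple multiplicative hash, reduced to a slot index. */
static size_t mset_hash(int32_t key)
{
    uint32_t h = (uint32_t) key * 2654435761u;

    return (size_t) (h >> 16) & (MSET_CAP - 1);
}

void mset_init(MultiSet *m)
{
    memset(m, 0, sizeof *m);
}

/* Add one occurrence of key; returns its new count, or 0 if the table is full. */
uint32_t mset_add(MultiSet *m, int32_t key)
{
    size_t i = mset_hash(key);

    for (size_t probes = 0; probes < MSET_CAP; probes++)
    {
        MSetSlot *s = &m->slots[i];

        if (s->count == 0)
        {
            /* empty slot: first occurrence of this value */
            s->key = key;
            s->count = 1;
            m->used++;
            return 1;
        }
        if (s->key == key)
            return ++s->count;  /* a plain set would do nothing here */
        i = (i + 1) & (MSET_CAP - 1);   /* linear probing */
    }
    return 0;
}

/* Number of occurrences of key; 0 if absent. */
uint32_t mset_count(const MultiSet *m, int32_t key)
{
    size_t i = mset_hash(key);

    for (size_t probes = 0; probes < MSET_CAP; probes++)
    {
        const MSetSlot *s = &m->slots[i];

        if (s->count == 0)
            return 0;
        if (s->key == key)
            return s->count;
        i = (i + 1) & (MSET_CAP - 1);
    }
    return 0;
}
```

Set membership is then just mset_count(...) > 0, so the same structure
could in principle serve both the set and the multiset case.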

So if we want to make this a built-in feature, maybe we should aim to do
the multiset thing, with the standard SQL syntax? Extending the grammar
should not be hard, I think. I'm not sure how much of the underlying code
(ArrayType, ARRAY_SUBLINK stuff, etc.) we could reuse, or whether we'd
need a lot of separate code for that.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


