On 09/05/14 15:34, Bruce Momjian wrote:
> On Thu, May 8, 2014 at 06:39:11PM -0400, Tom Lane wrote:
>> I wrote:
>>> I think the idea of hashing only keys/values that are "too long" is a
>>> reasonable compromise. I've not finished coding it (because I keep
>>> getting distracted by other problems in the code :-() but it does not
>>> look to be very difficult. I'm envisioning the cutoff as being something
>>> like 128 bytes; in practice that would mean that few if any keys get
>>> hashed, I think.
>> Attached is a draft patch for this. In addition to the hash logic per se,
>> I made these changes:
>>
>> * Replaced the K/V prefix bytes with a code that distinguishes the types
>> of JSON values. While this is not of any huge significance for the
>> current index search operators, it's basically free to store the info,
>> so I think we should do it for possible future use.
>>
>> * Fixed the problem with "exists" returning rows it shouldn't. I
>> concluded that the best fix is just to force recheck for exists, which
>> allows considerable simplification in the consistent functions.
>>
>> * Tried to improve the comments in jsonb_gin.c.
>>
>> Barring objections I'll commit this tomorrow, and also try to improve the
>> user-facing documentation about the jsonb opclasses.
> Looks good. I was thinking the jsonb_ops name could remain unchanged
> and the jsonb_hash_ops could be called jsonb_combo_ops as it combines
> the key and value into a single index entry.
>
If you have 'jsonb_combo_ops', then surely 'jsonb_ops' should be called
'jsonb_xxx_ops', where the 'xxx' distinguishes it from
'jsonb_combo_ops'? I guess that if every appropriate wording of 'xxx'
were too cumbersome, then renaming it would be worse.

Cheers,
Gavin