Heikki Linnakangas wrote:
> Teodor Sigaev wrote:
>>> Ok. Probably easiest to do that by changing the palloc to palloc0 in
>>> parse_tsquery.
>> and change sizeof to sizeof(QueryItem)
>
> Do you mean the sizeofs in the memcpys in parse_tsquery? You can't
Oops, I meant the pallocs in the push* functions. Using palloc0 in parse_tsquery
is another way.
>
> BTW, can you explain what the CRC-32 of a value is used for? It looks
> like it's used to speed up some operations, by comparing the CRCs before
> comparing the values, but I didn't quite figure out how it works. How
It's mostly used in GiST indexes: recalculating crc32 for each index tuple that
has to be checked would be rather expensive.
> much of a performance difference does it make? Would hash_any do a
> better/cheaper job?
crc32 was chosen after testing a lot of hash functions. Perl's hash was the
fastest, but crc32 produces far fewer collisions. Interestingly, for ASCII many
functions produce a rather small number of collisions, but for the upper part of
the table (0x7f-0xff) crc32 was the best. CRC32's collisions are evenly
distributed over characters; the others' are not.
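
To illustrate the point about not recalculating, here is a minimal sketch in
plain C. It is not the actual tsearch code: the QueryOperand struct, the
function names, and the idea that an index tuple holds a sorted array of CRC
values are all assumptions made for the example. The CRC is computed once when
the operand is built and then reused for every tuple that is checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a query operand; not the real QueryItem layout. */
typedef struct
{
    uint32_t    valcrc;     /* CRC-32 of val, computed once at parse time */
    size_t      len;
    const char *val;
} QueryOperand;

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t
crc32_buf(const char *buf, size_t len)
{
    uint32_t    crc = 0xFFFFFFFFu;
    size_t      i;
    int         k;

    for (i = 0; i < len; i++)
    {
        crc ^= (unsigned char) buf[i];
        for (k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Build the operand and pay for the CRC exactly once. */
static QueryOperand
make_operand(const char *val)
{
    QueryOperand op;

    assert(val != NULL);
    op.val = val;
    op.len = strlen(val);
    op.valcrc = crc32_buf(val, op.len);
    return op;
}

/*
 * Assumed index-tuple representation: a sorted array of CRC values.
 * Each check is a binary search on the stored CRC; no hashing is redone.
 */
static int
crc_array_contains(const uint32_t *arr, size_t n, uint32_t crc)
{
    size_t      lo = 0;
    size_t      hi = n;

    while (lo < hi)
    {
        size_t      mid = lo + (hi - lo) / 2;

        if (arr[mid] == crc)
            return 1;
        if (arr[mid] < crc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

A caller would build the operand once with make_operand() and pass
op.valcrc to crc_array_contains() for each tuple. A match on the CRC can still
be a collision, so it only says the value may be present.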
> In any case, I think we need to calculate the CRC/hash in tsqueryrecv,
> instead of trusting the client.
Agreed.
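
A rough sketch of what that could look like, in plain C rather than the real
tsqueryrecv code. The RecvOperand struct and recv_operand() are hypothetical
names, and passing the client's CRC in as a separate argument is an assumption
about how the value arrives. The only point is that the stored CRC is derived
from the received value on the server side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical operand as stored on the server after receiving it. */
typedef struct
{
    uint32_t    valcrc;
    size_t      len;
    const char *val;
} RecvOperand;

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t
crc32_buf(const char *buf, size_t len)
{
    uint32_t    crc = 0xFFFFFFFFu;
    size_t      i;
    int         k;

    for (i = 0; i < len; i++)
    {
        crc ^= (unsigned char) buf[i];
        for (k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Accept a value plus whatever CRC the client claims for it, and
 * ignore the claim: the CRC is always recomputed from the value.
 */
static RecvOperand
recv_operand(const char *val, uint32_t client_crc)
{
    RecvOperand op;

    assert(val != NULL);
    (void) client_crc;          /* deliberately not trusted */
    op.val = val;
    op.len = strlen(val);
    op.valcrc = crc32_buf(val, op.len);
    return op;
}
```

With this shape a client that sends a wrong CRC cannot make the stored value
disagree with its checksum.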
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/