On Mon, Jun 21, 2021 at 02:08:12PM +0100, Simon Riggs wrote:
> New chapter for Hash Indexes, designed to help users understand how
> they work and when to use them.
>
> Mostly newly written, but a few paras lifted from README when they were helpful.
+ <para>
+ PostgreSQL includes an implementation of persistent on-disk hash indexes,
+ which are now fully crash recoverable. Any data type can be indexed by a
I don't see any need to mention that they're "now" crash safe.
+ Each hash index tuple stores just the 4-byte hash value, not the actual
+ column value. As a result, hash indexes may be much smaller than B-trees
+ when indexing longer data items such as UUIDs, URLs etc.. The absence of
comma:
URLs, etc.
+ the column value also makes all hash index scans lossy. Hash indexes may
+ take part in bitmap index scans and backward scans.
Isn't it more correct to say that it must use a bitmap scan?
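For context on the lossiness being described, here is a toy Python sketch (my own illustration, nothing to do with the on-disk format): because only the 4-byte hash value is stored in the index, every candidate the index returns has to be rechecked against the actual heap value.

```python
import zlib

def h32(value: str) -> int:
    # 32-bit hash standing in for PostgreSQL's per-type hash functions
    return zlib.crc32(value.encode()) & 0xFFFFFFFF

# "heap": row id -> column value
heap = {1: "alice", 2: "bob", 3: "carol"}

# toy hash index: 4-byte hash -> row ids; column values are NOT stored
index = {}
for rid, val in heap.items():
    index.setdefault(h32(val), []).append(rid)

def lookup(value: str):
    candidates = index.get(h32(value), [])
    # the index is lossy: recheck each candidate against the heap value
    return [rid for rid in candidates if heap[rid] == value]
```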
+ through the tree until the leaf page is found. In tables with millions
+ of rows this descent can increase access time to data. The equivalent
rows comma
+ that hash value. When scanning a hash bucket during queries we need to
queries comma
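The descent-vs-direct-lookup contrast the paragraph draws can be sketched like this (again just an illustration, with Python's built-in hash() standing in for the real hash functions):

```python
import bisect

rows = ["alice", "bob", "carol", "dave", "erin"]

# B-tree style: keys kept sorted, binary-search down to the match, O(log n)
sorted_keys = sorted(rows)
pos = bisect.bisect_left(sorted_keys, "dave")
assert sorted_keys[pos] == "dave"

# hash style: the hash value alone names the bucket to scan, O(1) to locate
NBUCKETS = 8
buckets = [[] for _ in range(NBUCKETS)]
for r in rows:
    buckets[hash(r) % NBUCKETS].append(r)

# only the one bucket for this hash value is scanned during the query
candidates = buckets[hash("dave") % NBUCKETS]
assert "dave" in candidates
```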
+ <para>
+ As a result of the overflow cases, we can say that hash indexes are
+ most suitable for unique, nearly unique data or data with a low number
+ of rows per hash bucket will be suitable for hash indexes. One
The beginning and end of the sentence duplicate "suitable".
+ Each row in the table indexed is represented by a single index tuple in
+ the hash index. Hash index tuples are stored in the bucket pages, and if
+ they exist, the overflow pages.
"the overflow pages" didn't sound right, but I was confused by the comma.
I think it should say ".. in bucket pages and overflow pages, if any."
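With that wording, the bucket-page/overflow-page layout reads like this toy sketch (capacity number is made up; real pages hold far more tuples):

```python
PAGE_CAPACITY = 4  # index tuples per page (toy number for illustration)

# one bucket: a primary bucket page plus overflow pages, if any
bucket_pages = [[]]

def add_tuple(hash_value: int, heap_tid: int):
    if len(bucket_pages[-1]) >= PAGE_CAPACITY:
        bucket_pages.append([])  # chain a new overflow page onto the bucket
    bucket_pages[-1].append((hash_value, heap_tid))

for tid in range(10):
    add_tuple(0xDEADBEEF, tid)
# 10 tuples -> 1 bucket page + 2 overflow pages
```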
--
Justin