On 01/06/2019 14:52, Morris de Oryx wrote:
[...]
> For an example, imagine an address table with 100M US street addresses
> with two character state abbreviations. So, say there are around 60
> values in there (the USPS is the mail system for a variety of US
> territories, possessions and friends in the Pacific.) Okay, so what's
> the best index type for state abbreviation? For the sake of argument,
> assume a normal distribution so something like FM (Federated States of
> Micronesia) is on a tail end and CA or NY are a whole lot more common.
[...]
I'd expect the distribution of values to be closer to a power law than
to a normal distribution -- at the very least, a few states would
dominate the lookups. But this is my gut feel, not based on any
scientific analysis!
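
To put rough numbers on that gut feel, here is a small Python sketch of what
a power-law (Zipf) split of the 100M rows over about 60 codes would look
like. The exponent of 1.0 and all the names in it are my own assumptions for
illustration; nothing here is fitted to real address data.

```python
# Illustrative only: rank-frequency shares under an ASSUMED Zipf (power-law)
# distribution. The exponent S = 1.0 is a guess, not measured from real data.
N_VALUES = 60           # roughly the number of state/territory codes
N_ROWS = 100_000_000    # table size from the quoted example
S = 1.0                 # assumed Zipf exponent (hypothetical)


def zipf_shares(n_values, s):
    """Return the fraction of rows for each rank 1..n_values under Zipf's law."""
    weights = [1.0 / rank ** s for rank in range(1, n_values + 1)]
    total = sum(weights)
    return [w / total for w in weights]


shares = zipf_shares(N_VALUES, S)
top5 = sum(shares[:5])                    # share held by the 5 most common codes
rarest_rows = round(shares[-1] * N_ROWS)  # row count for the least common code

print(f"top 5 values cover {top5:.1%} of rows")
print(f"rarest value has about {rarest_rows:,} rows")
```

Under these assumptions a handful of codes hold a large part of the table
while the tail values are comparatively rare, which is the kind of skew that
matters when picking an index type. Real data could well look different.
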
Cheers,
Gavin