Neha Patil (Editor)

CityHash

CityHash is a family of non-cryptographic hash functions designed for fast hashing of strings. It has 32-, 64-, 128-, and 256-bit variants. CityHash has been referenced widely in academic papers.

Google developed the algorithm in-house starting in 2010. The C++ source code for the reference implementation of the algorithm was released in 2011 under an MIT license, with credit to Geoff Pike and Jyrki Alakuijala. The authors expect the algorithm to outperform previous work by a factor of 1.05 to 2.5, depending on the CPU and mix of string lengths being hashed. CityHash is influenced by and partly based on MurmurHash.

Some particularly fast CityHash functions depend on CRC32 instructions that are present in SSE4.2. However, most CityHash functions are designed to be portable, though they will run best on little-endian 32-bit or 64-bit CPUs.
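The split between hardware-accelerated and portable code paths can be illustrated with a small sketch. This is not CityHash code: `Crc32cStep` and `Crc32c` are hypothetical helper names, and the example computes only a plain CRC-32C checksum. It assumes a compiler that defines `__SSE4_2__` when SSE4.2 is enabled (as GCC and Clang do); otherwise it falls back to a bitwise loop that produces the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE4_2__)
#include <nmmintrin.h>  // _mm_crc32_u8, the SSE4.2 CRC32 intrinsic
#endif

// Hypothetical helper (not part of CityHash): one CRC-32C update step.
inline std::uint32_t Crc32cStep(std::uint32_t crc, std::uint8_t byte) {
#if defined(__SSE4_2__)
  // Hardware path: a single CRC32 instruction per byte.
  return _mm_crc32_u8(crc, byte);
#else
  // Portable path: bitwise update with the reflected Castagnoli polynomial.
  crc ^= byte;
  for (int i = 0; i < 8; ++i) {
    crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return crc;
#endif
}

// Hypothetical helper: CRC-32C of a buffer, identical on both paths.
inline std::uint32_t Crc32c(const char* data, std::size_t len) {
  assert(data != nullptr || len == 0);
  std::uint32_t crc = 0xFFFFFFFFu;  // conventional initial value
  for (std::size_t i = 0; i < len; ++i) {
    crc = Crc32cStep(crc, static_cast<std::uint8_t>(data[i]));
  }
  return crc ^ 0xFFFFFFFFu;  // conventional final inversion
}
```

Because both branches compute the same function, callers do not need to know which one was compiled in; only the speed differs.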

Google has announced FarmHash as the successor to CityHash.

Concerns

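CityHash releases do not maintain backward compatibility with previous versions, so the same input may hash to a different value after an upgrade. Users should therefore either avoid using CityHash values in persistent storage or stay on a single CityHash version and not upgrade.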
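One way to guard against this, if hashes must be persisted at all, is to store a version tag next to each value so that stale hashes can be detected and recomputed. The sketch below is an assumption-laden illustration rather than anything CityHash provides: `StoredHash`, `kHashVersion`, `MakeStoredHash`, and `IsUsable` are hypothetical names, and a 64-bit FNV-1a function stands in for the real hash so that the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical tag: bump it whenever the underlying hash function changes.
constexpr std::uint32_t kHashVersion = 1;

// Hypothetical on-disk record: the hash plus the version that produced it.
struct StoredHash {
  std::uint32_t algo_version;
  std::uint64_t value;
};

// Stand-in hash (64-bit FNV-1a), used here only so the sketch compiles
// without the CityHash sources. It is NOT CityHash.
inline std::uint64_t StandInHash64(const char* data, std::size_t len) {
  assert(data != nullptr || len == 0);
  std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
  for (std::size_t i = 0; i < len; ++i) {
    h ^= static_cast<unsigned char>(data[i]);
    h *= 1099511628211ULL;  // FNV prime
  }
  return h;
}

// Hash the input and tag the result with the current version.
inline StoredHash MakeStoredHash(const char* data, std::size_t len) {
  return StoredHash{kHashVersion, StandInHash64(data, len)};
}

// A persisted hash is only comparable if its version matches the reader's.
inline bool IsUsable(const StoredHash& stored, std::uint32_t reader_version) {
  return stored.algo_version == reader_version;
}
```

A reader that finds a mismatched version would rehash the original data instead of trusting the stored value.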

The README warns that CityHash has not been tested much on big-endian platforms.
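Byte order matters because a hash function that reads its input as multi-byte words sees different values on little-endian and big-endian hosts unless the loads are normalized. The following sketch shows a byte-order-independent load and a simple host check; `Load64LittleEndian` and `HostIsLittleEndian` are hypothetical helpers written for illustration and are not taken from the CityHash sources.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: build a 64-bit word from 8 bytes, treating p[0] as
// the least significant byte. The result is the same on any host.
inline std::uint64_t Load64LittleEndian(const unsigned char* p) {
  assert(p != nullptr);
  std::uint64_t v = 0;
  for (int i = 7; i >= 0; --i) {
    v = (v << 8) | p[i];
  }
  return v;
}

// Hypothetical helper: true if the host stores the low byte first.
inline bool HostIsLittleEndian() {
  const std::uint16_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}
```

On a little-endian host the portable load matches a direct `memcpy` into a `uint64_t`, which is why such platforms can skip the byte shuffling.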
