Puneet Varma (Editor)

CLMUL instruction set

Carry-less Multiplication (CLMUL) is an extension to the x86 instruction set used by microprocessors from Intel and AMD. Intel proposed it in March 2008 and made it available in the Westmere processors announced in early 2010.

One use of these instructions is to speed up applications that perform block cipher encryption in Galois/Counter Mode (GCM). GCM depends on multiplication in the finite field GF(2^k), which can be implemented more efficiently with the CLMUL instructions than with the traditional instruction set. Another application is the fast calculation of CRC values, including those used to implement the LZ77 sliding window DEFLATE algorithm in zlib and pngcrush.
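To illustrate why a carry-less multiply helps with finite field arithmetic, here is a minimal sketch in portable C for a small field. It assumes k = 8 and the reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B, the one used by AES) purely for illustration; GCM itself works in GF(2^128) with a different polynomial. The function names `clmul8`, `reduce_0x11b`, and `gf256_mul` are hypothetical. The multiply step is the part that a hardware carry-less multiply would replace.

```c
#include <stdint.h>

/* Carry-less multiply of two 8-bit polynomials over GF(2):
   partial products are combined with XOR instead of addition.
   The result has degree at most 14, so it fits in 16 bits. */
static uint16_t clmul8(uint8_t a, uint8_t b)
{
    uint16_t product = 0;
    for (int i = 0; i < 8; i++) {
        if ((b >> i) & 1u)
            product ^= (uint16_t)((uint16_t)a << i);
    }
    return product;
}

/* Reduce a polynomial of degree <= 14 modulo x^8 + x^4 + x^3 + x + 1
   (0x11B, an illustrative choice) by cancelling high bits top-down. */
static uint8_t reduce_0x11b(uint16_t p)
{
    for (int bit = 14; bit >= 8; bit--) {
        if ((p >> bit) & 1u)
            p ^= (uint16_t)(0x11Bu << (bit - 8));
    }
    return (uint8_t)p;
}

/* GF(2^8) multiplication = carry-less multiply followed by reduction. */
uint8_t gf256_mul(uint8_t a, uint8_t b)
{
    return reduce_0x11b(clmul8(a, b));
}
```

With these assumptions, `gf256_mul(0x57, 0x83)` yields `0xC1`, which matches the worked multiplication example in the AES specification (FIPS 197).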

New instructions

The instruction, PCLMULQDQ, computes the 128-bit carry-less product of two 64-bit values. The destination is a 128-bit XMM register. The source may be another XMM register or memory. An immediate operand specifies which halves of the 128-bit operands are multiplied. Mnemonics for specific values of the immediate operand are also defined: PCLMULLQLQDQ (immediate 0x00), PCLMULHQLQDQ (0x01), PCLMULLQHQDQ (0x10), and PCLMULHQHQDQ (0x11).
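The operation described above can be modelled in portable C. This is a bit-by-bit sketch of the arithmetic, not how the hardware computes it. The type `u128` and the functions `clmul64` and `pclmulqdq_model` are hypothetical names. The mapping of immediate bit 0 to the first operand and bit 4 to the second follows Intel's instruction reference and is not stated in the text above.

```c
#include <stdint.h>

/* A 128-bit value as two 64-bit halves, mirroring an XMM register. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128;

/* 128-bit carry-less product of two 64-bit values: shifted copies of
   `a` are combined with XOR, so no carries propagate between bits. */
u128 clmul64(uint64_t a, uint64_t b)
{
    u128 r = {0, 0};
    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1u) {
            r.lo ^= a << i;
            if (i > 0)                 /* avoid an undefined shift by 64 */
                r.hi ^= a >> (64 - i);
        }
    }
    return r;
}

/* Model of the operand selection (assumed from Intel's reference):
   imm8 bit 0 picks the half of the first operand,
   imm8 bit 4 picks the half of the second operand. */
u128 pclmulqdq_model(u128 x, u128 y, uint8_t imm8)
{
    uint64_t a = (imm8 & 0x01) ? x.hi : x.lo;
    uint64_t b = (imm8 & 0x10) ? y.hi : y.lo;
    return clmul64(a, b);
}
```

For example, `clmul64(3, 3)` gives 5 rather than 9: (x + 1)(x + 1) = x^2 + 1 over GF(2), because the two middle terms cancel under XOR.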

CPUs with CLMUL instruction set

  • Intel:
      • Westmere processor (March 2010)
      • Sandy Bridge processor
      • Ivy Bridge processor
      • Haswell processor
      • Broadwell processor (with increased throughput and lower latency)
      • Skylake processor
  • AMD:
      • Bulldozer processor (2011)
      • Piledriver based processors (including newer AMD A-series APUs)
      • Jaguar based processors

The presence of the CLMUL instruction set can be checked by testing one of the CPU feature bits.
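A feature-bit check of this kind might look like the following sketch. It assumes a GCC- or Clang-compatible compiler that provides `<cpuid.h>` and its `__get_cpuid` helper. The specific bit tested (CPUID leaf 1, ECX bit 1, labelled PCLMULQDQ in Intel's documentation) is not given in the text above. The function name `cpu_has_clmul` and the macro `HAVE_X86_CPUID` are hypothetical.

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/* Returns 1 if CPUID reports carry-less multiply support, 0 otherwise.
   On non-x86 builds, or if CPUID leaf 1 is unavailable, it returns 0. */
int cpu_has_clmul(void)
{
#if HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((ecx >> 1) & 1u);   /* CPUID.01H:ECX bit 1 = PCLMULQDQ */
#else
    return 0;
#endif
}
```

A program would typically call such a check once at startup and fall back to a software routine when it returns 0.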
