Trisha Shetty (Editor)

Pollard's rho algorithm

Updated on
Edit
Like
Comment
Share on FacebookTweet on TwitterShare on LinkedInShare on Reddit

Pollard's rho algorithm is a special-purpose integer factorization algorithm. It was invented by John Pollard in 1975. It is particularly effective for a composite number having a small prime factor.

Contents

Core ideas

The ρ algorithm is based on Floyd's cycle-finding algorithm and on the observation that (as in the birthday problem) t random numbers x1, x2, ... , xt in the range [1, n] will contain a repetition with probability P > 0.5 if t > 1.177n1/2. The constant 1.177 comes from the more general result that if P is the probability that t random numbers in the range [1, n] contain a repetition, then P > 1 - exp{ - t2/2n }. Thus P > 0.5 provided 1/2 > exp{ - t2/2n }, or t2 > 2n ln 2, or t > (2ln 2)1/2n1/2 = 1.177n1/2.

The ρ algorithm uses g(x), a polynomial modulo n, as a generator of a pseudo-random sequence. (The most commonly used function is g(x) = (x2 + 1) mod n.) Let's assume n = pq. The algorithm generates the sequence x1 = g(2), x2 = g(g(2)), x3 = g(g(g(2))), and so on. Two different sequences will in effect be running at the same time—the sequence {xk} and the sequence {xk mod p}. Since p < n1/2, the latter sequence is likely to repeat earlier than the former sequence. The repetition of the mod p sequence will be detected by the fact that p divides xk - xm, hence gcd(xk - xm, n) = p, where k < m. Once a repetition occurs, the sequence will cycle, because each term depends only on the previous one. The name ρ algorithm derives from the similarity in appearance between the Greek letter ρ and the directed graph formed by the values in the sequence and their successors. Once it is cycling, Floyd's cycle-finding algorithm will eventually detect a repetition. The algorithm succeeds whenever the sequence {xk mod p} repeats before the sequence {xk}. The randomizing function g(x) must be a polynomial modulo n, so that it will work both modulo p and modulo n. That is, so that g(x mod p) ≡ g(x) (mod p).

Algorithm

The algorithm takes as its inputs n, the integer to be factored; and g ( x ) , a polynomial in x computed modulo n. This will ensure that if p | n , and x y mod p , then g ( x ) g ( y ) mod p . In the original algorithm, g ( x ) = ( x 2 1 ) mod n , but nowadays it is more common to use g ( x ) = ( x 2 + 1 ) mod n . The output is either a non-trivial factor of n, or failure. It performs the following steps:

x ← 2; y ← 2; d ← 1 while d = 1: x ← g(x) y ← g(g(y)) d ← gcd(|x - y|, n) if d = n: return failure else: return d

Note that this algorithm may fail to find a nontrivial factor even when n is composite. In that case, the method can be tried again, using a starting value other than 2 or a different g ( x ) . The name ρ algorithm comes from the fact that the values of x ( mod d ) eventually repeat with period d, resulting in a ρ shape in a graph of the values.

Variants

In 1980, Richard Brent published a faster variant of the rho algorithm. He used the same core ideas as Pollard but a different method of cycle detection, replacing Floyd's cycle-finding algorithm with the related Brent's cycle finding method.

A further improvement was made by Pollard and Brent. They observed that if gcd ( a , n ) > 1 , then also gcd ( a b , n ) > 1 for any positive integer b. In particular, instead of computing gcd ( | x y | , n ) at every step, it suffices to define z as the product of 100 consecutive | x y | terms modulo n, and then compute a single gcd ( z , n ) . A major speed up results as 100 gcd steps are replaced with 99 multiplications modulo n and a single gcd. Occasionally it may cause the algorithm to fail by introducing a repeated factor, for instance when n is a square. But it then suffices to go back to the previous gcd term, where gcd ( z , n ) = 1 , and use the regular ρ algorithm from there.

Application

The algorithm is very fast for numbers with small factors, but slower in cases where all factors are large. The ρ algorithm's most remarkable success was the factorization of the ninth Fermat number, F8 = 1238926361552897 * 93461639715357977769163558199606896584051237541638188580280321. The ρ algorithm was a good choice for F8 because the prime factor p = 12389263661552897 is much smaller than the other factor. The factorization took 2 hours on a UNIVAC 1100/42.

Example factorization

Let n = 8051 and g(x) = (x2 + 1) mod 8051.

97 is a non-trivial factor of 8051. Starting values other than x = y = 2 may give the cofactor (83) instead of 97.

The example n = 10403 = 101 . 103

Here we introduce another variant, where only a single sequence is computed, and the gcd is computed inside the loop that detects the cycle.

C++ code sample

The following code sample finds the factor 101 of 10403 with a starting value of x = 2.

The results

In the following table the third and fourth columns contain secret information not known to the person trying to factor pq = 10403. They are included to show how the algorithm works. If we start with x = 2 and follow the algorithm, we get the following numbers:

The first repetition modulo 101 is 97 which occurs in step 17. The repetition is not detected until step 23, when x = xfixed (mod 101). This causes gcd(x - x_fixed, n) = gcd(2799 - 9970, n) to be p = 101, and a factor is found.

Complexity

If the pseudo random number x = g(x) occurring in the Pollard ρ algorithm were an actual random number, it would follow that success would be achieved half the time, by the Birthday paradox in O ( p ) O ( n 1 / 4 ) iterations. It is believed that the same analysis applies as well to the actual rho algorithm, but this is a heuristic claim, and rigorous analysis of the algorithm remains open.

Additional reading

  • Katz, Jonathan; Lindell, Yehuda (2007), "Chapter 8", Introduction to Modern Cryptography, CRC Press 
  • References

    Pollard's rho algorithm Wikipedia