HEJ, HU ISSN 1418-7108
Manuscript no.: ANM-000926-A
Articles Frontpage [*] [*]

Factorization methods

In this subsection we discuss, how we can find the prime factors of a large number $n$ (knowing that $n$ is not prime). If $n$ contains e.g. more than $30$ digits the use of the basic algorithm is usually hopeless, we do not like a lot of time to wait. In the last decades more algorithms were established to improve this situation. However, in comparison with the prime tests, the factorization algorithms are much more expensive. It is much more difficult to factorize a number, than to decide the prime property.

The Pollard-$\rho$ method One of the well-known methods was presented by J. M. POLLARD. It was called by him a Monte Carlo method, because of its (pseudo) random nature. In spite of this the algorithm terminates usually succesful, and it is easy to programize. Let $x_0$ be an integer, and let $f(x)$ be a polynomial with integer coefficients. Let us consider the sequence $x_{i+1} = f(x_i)$ mod $n$. Our purpose is, that this sequence must necessarily be random-like. This property depends on the choice of the polynomial $f(x)$. It is proved, that a linear polynomial $ax + b$ is not good, the next simplest case is $x^2 +1$. This choice behaves nice, though we can not prove this exactly.

  > x[0]:=2: n:=1387: s:=NULL:
  > for i from 1 to 8 do x[i]:=x[i-1]^2+1; s:=s,x[i] mod n
  > od:
  > print(s);

\begin{displaymath}
5, \,26, \,677, \,620, \,202, \,582, \,297, \,829
\end{displaymath}

Let us assume, that the number $n$ has a non-trivial divisor $d$. Considering the sequence $y_i = x_i$ mod $d$ we find, that this sequence will be eventually periodic (there are only finite different residues mod $d$). The path of the $y_i$-s draws a greek letter $\rho$, a tail with a cycle. That is why this algorithm is called as the Pollard-$\rho$ method. We get $y_j = y_k, y_{j+1} = y_{k+1}, \dots$, for some indices $j \ne k$. Using this $x_j \equiv x_k$ (mod $d$), i.e. $d \vert x_j - x_k$, thus gcd $(n, x_j - x_k)$ is a non-trivial divisor of $n$. We do not know the value $d$, however if we choose a lot of pairs $(j,k)$, and compute for the pairs gcd $(n, x_j - x_k)$, then usually sooner or later we will find a factor of $n$. The efficiency of this algorithm can be improved if we compute the gcd-s rarer, only for products e.g. 10 pairs of $x_j - x_k$ and $n$. To compute integer gcd-s, usually some version of the Euclidean algorithm (see e.g. in [5]) is used, which is relatively fast. We illustrate the theoretical description with the factorization of the number $1387$. After finding a divisor we produce the $y_j$ values and present the "$\rho$ property".

  > l:=[s]: prod:=1:
  > for i from 2 to 8 do prod:=prod*(l[i]-l[i-1]) od:
  > prod, igcd(prod,n);

\begin{displaymath}
-18766855483437600, \,19
\end{displaymath}


  > for i from 1 to 8 do l[i]:=l[i] mod 19 od: l;

\begin{displaymath}[5, \,7, \,12, \,12, \,12, \,12, \,12, \,12]
\end{displaymath}


  > ifactor(1387);

\begin{displaymath}
(\,{\rm }19)\,(\,{\rm }73)
\end{displaymath}

Getting a divisor $d$, we check the numbers $d$ and $n/d$. If we find that these numbers are not primes, we can try to factorize them further with the same method or with the basic algorithm. It is not sure, that the algorithm gives a correct answer. It is possible, that the gcd is $n$. In this case the choice for $f(x)$ seems to be wrong, let us choose another polynomial $f(x) = x^2 + c$, $c \ne 0,1,-2$. If there is no answer after a long time, then it is better to try another factorization algorithm. With Maple we can apply this method to factorize integers, using the function ifactor with the option 'pollard'.

Pollard's $p-1$ method Another method of POLLARD is based on the assumption, that the number $n$ has such a prime factor $p$, for which $p-1$ is a product of small primes. If this happens, then for an arbitrary number $a$ with gcd$(a,p)=1$, the equation $a^{p-1} \equiv 1 ($mod $p)$ is satisfied. Thus $p \vert $gcd $(a^{p-1} -1, n)$. However, to find this divisor $p$ we have to be lucky. Let us choose a number $i= 2^{c_2} \cdot 3^{c_3} \cdot 5^{c_5} \cdot \dots \cdot k^{c_k}$, where the bases are the first few primes, and the exponents are small positive integers. After it we compute gcd$(a^i -1, n)=d$. If $n > d > 1$, then we have found non-trivial factors of $n$, $d$ and $n/d$. If $d=1$, then we have to choose a larger $i$, and if $d=n$, then we need another base $a$. With this method (called Pollard $p-1$) we find sometimes very quickly a non-trivial divisor, but in many cases it does not work.

  > n:=973: i:=2^5*3^3*5*7:
  > igcd(2^i-1,n);

\begin{displaymath}
7
\end{displaymath}


  > ifactor(n);

\begin{displaymath}
(\,{\rm }7)\,(\,{\rm }139)
\end{displaymath}

By combining the basic algorithm with the Pollard methods in most of the cases we are able to factorize numbers up to roughly $30-40$ (sometimes $50$) decimal digits. There are some other algorithms, which are suitable for factorization of such numbers, e.g. a method developed by FERMAT (see [1] or [8]). The adventage of his idea is, that this algorithm uses only additions and substractions, there are no divisions. However, to a significant improvement we need other approaches.

Sieving algorithms With the improvement of FERMAT's original concept we get sieving algorithms ([8]). In the beginning these methods yield not a significant improvement over $50$ decimal digits. However, in the early $80$s the quadratic sieve was discovered, and after a few years with further developments (first of all by POMERANCE) it proved to be one of the best tools to decompose large numbers ([1]). In a decade it almost entirely displaced the formerly successful continued fraction algorithm.

An elliptic curve method Besides the quadratic sieve nowadays the most successful factorization algorithm is the really surprising elliptic curve method worked out by H. W. LENSTRA. In this paper we only outline this method, because the detailed discussion needs longer introduction to the elliptic curve theory. Its basic idea is similar to the Pollard $p-1$ method. So very briefly, we choose an elliptic curve in the normal form $y^2 = x^3 + bx + c$, and a number $p$ similarly as in the algorithm Pollard $p-1$. We work with a finite group of points (generated by a number $p$) on this curve, and compute a gcd. However, comparing with the Pollard method, if this gcd is 1, then we have the possibility to choose another ellipitic curve. Because of the significant difference of the groups (generated by the same number $p$) on different elliptic curves, sooner or later we have a very good chance to find a non-trivial divisor (details in [1] or [12]). With Maple we can apply this method using the function ifactor with the option 'lenstra'.

REMARK The factorization of large numbers can seem a useless or only theoretically interesting thing. However, a very important practical application was discovered in the $70$s: a public key crypto-system. The essence of the method is, that the encrypted message and the key are public, but in spite of this unauthorized persons are not able to read the message, because to do this it is needed to factorize a very large number, which is the product of two large primes (detailed discussion in [1]). If our "enemy" tries to discover the message, using a $250$ digit key, it takes years, even if he or she uses the best current supercomputers and algorithms.
HEJ, HU ISSN 1418-7108
Manuscript no.: ANM-000926-A
Articles Frontpage [*] [*]