Factorization

This section contains a description of most of the machinery provided in Magma for the factorization of integers. An account of the Number Field Sieve is deferred until later in the chapter.

In the first subsection the general-purpose Factorization function is described. It employs a combination of methods in an attempt to find the complete prime factorization of a given integer. Some control is possible over each of the methods, but in general default choices for the parameters would give good results for a wide range of arguments.

In the second subsection we describe functions that enable access to each of the factorization methods available in Magma. The user has control over parameters for these methods.

Factorization functions in Magma return a factorization sequence. This is a sequence of two-element tuples [ <p1, k1>, ..., <pr, kr>], with p1<p2< ... <pr distinct prime numbers and ki positive, which is used to represent integers in factored form: n is the product over i of piki. Although such sequences are printed like ordinary sequences, they form a separate category RngIntEltFact. Operations on such factorization sequences are described in the next online help node.

Contents

General Factorization

The general Factorization function is designed to give close to optimal performance for the factorization of integers that may be encountered in the course of daily computations. The strategy employed is as follows (the next subsection gives a more detailed description of the individual methods). First of all a compositeness test is used to ensure that the argument is composite; if not the primality proving algorithm is invoked (unless a flag is set to avoid this --- see below). See the previous node for compositeness testing and primality proving. This operation is repeated for any non-trivial factor (and cofactor) found along the way. Before any of the general factorization techniques is employed, it is checked whether |n| is of the special form bk∓ 1, in which case an intelligent database look-up is used which is likely to be successful if b and k are not too large. This is equivalent to the Cunningham function on b, k, ∓ 1, described in the next node. In the first true stage of factorization trial division is used to find powers of 2 and other small primes (by default up to 10000). After this it is checked whether the remaining composite number is the power of a positive integer; if so the appropriate root is used henceforth. After this Pollard's ρ method is applied (using 8191 iterations by default). The bound on trial division factors and the number of iterations for ρ can be set by the optional parameters TrialDivisionLimit and PollardRhoLimit. It is possible, from this point on, that several composite factors still need factorization. The description below applies to each of these.

The final two algorithms deployed are usually indicated by ECM (for Elliptic Curve Method) and MPQS (for Multiple Polynomial Quadratic Sieve). By default, ECM (which is likely to find `smaller' factors if they exist) is used with parameters that depend on the size of the remaining (composite) factors. After that, if a composite factor of at least 25 digits remains, MPQS is used; it is the best method available for factoring integers of more than about 40 decimal digits especially for products of two primes of roughly equal size. If the remaining composite is smaller than 25 digits, ECM is again invoked, now in an indefinite loop until a factor is found. The latter will also occur if the user, via a flag MPQSLimit indicates that MPQS should not be applied to numbers of the given size, and provided the user has not limited the number of ECM trials by setting the ECMLimit. Thus, unless both MPQSLimit and ECMLimit are set as optional parameters by the users, the algorithm will continue until the complete factorization has been completed.

Besides the limiting parameters just mentioned it also possible to avoid the use of primality proofs and receive probable primes, with a flag similar to that used on IsPrime; see the previous node.

A verbose flag can be set to obtain informative printing on progress in the various stages of factorization. Specific flags for ECM and MPQS may be used as well; they are described in the next subsection.

Since V2.27, parallel versions of the ECM and MPQS algorithms are also available and are selected by using the procedures SetNthreads and StartWorkers. See Subsection Integer Factorisation for more details.

SetVerbose("Factorization", v) : MonStgElt, RngIntElt ->
(Procedure.) Set the verbose printing level for all of the factorization algorithms to be v. Currently the legal values for v are true, false, 0 or 1 (false is the same as 0, and true is the same as 1). If the level is 1, information is printed at each stage of the algorithm as a number is factored.
Factorization(n) : RngIntElt -> RngIntEltFact, RngIntElt, SeqEnum
Factorisation(n) : RngIntElt -> RngIntEltFact, RngIntElt, SeqEnum
Factorization(n: parameters) : RngIntElt -> RngIntEltFact, RngIntElt, SeqEnum
Factorisation(n: parameters) : RngIntElt -> RngIntEltFact, RngIntElt, SeqEnum
A combination of algorithms (Cunningham, trial division, SQUFOF, Pollard ρ, ECM and MPQS) is used to attempt to find the complete factorization of |n|, where n is a non-zero integer. A factorization sequence is returned, representing the completely factored part of |n| (which is usually all of |n|). The second return value is 1 or (-1), reflecting the sign of n. If the factorization could not be completed, a third sequence is returned, containing composite factors that could not be decomposed with the given values of the parameters; this can only happen if both ECMLimit and MPQSLimit have been set. (Note that the third variable will remain unassigned if the full factorization is found.)

When a very large prime (more than 200 decimal digits say), appears in the factorization, proving its primality may dominate the running time.

There are 6 optional parameters.

     Proof: BoolElt                      Default: true
     Bases: RngIntElt                    Default: 20
The parameter Proof (Proof := true by default) can be set to false to indicate that the first sequence may contain probable primes (see also the previous node), in which case the parameter Bases indicates the number of tests used by Miller-Rabin (Bases := 20 by default).
     TrialDivisionLimit: RngIntElt       Default: 10000
The parameter TrialDivisionLimit can be used to specify an upper bound for the primes used in the trial division stage (default TrialDivisionLimit := 10000).
     SQUFOFLimit: RngIntElt              Default: 24
The parameter SQUFOFLimit can be used specify the maximum number of decimal digits for an integer to which SQUFOF should be applied; this is Shank's square form factorization method (default SQUFOFLimit := 24).
     PollardRhoLimit: RngIntElt          Default: 8191
The parameter PollardRhoLimit can be used to specify an upper bound on the number of iterations in the ρ method (default PollardRhoLimit := 8191).
     ECMLimit: RngIntElt                 Default:
This optional parameter can be used to limit the number of curves used by the ECM part of the factorization attempt. Setting ECMLimit := 0 prevents the use of ECM. The default value depends on the size of the input, and ranges from 2 for n with less than 37 digits to around 500 for n with 80 digits. The smoothness is incremented in each step to grow by default from 500 to 600 (for 37 digits and less), and from 500 to about 10000 for n having 80 digits. For the indefinite case of ECM (which applies when MPQS is disallowed) the initial smoothness is 500, the number of curves is infinite and the smoothness is incremented by 100 in each step.
     MPQSLimit: RngIntElt                Default: ∞
The parameter MPQSLimit can be used specify the maximum number of decimal digits for an integer to which MPQS should still be applied; MPQS will not be invoked on integers having less than (or sometimes equal) 25 decimal digits. Setting the parameter to anything less than 25 will therefore prevent MPQS from being used. Unless ECMLimit has been set, this will imply that ECM will be applied until the full factorization has been obtained.

Note that progress can be monitored by use of Verbose("Factorization", true).

Storing Potential Factors

Since V2.14 (October 2007), Magma now internally stores a list of factors found by the ECM and MPQS algorithms. Subsequently, when either of those algorithms are to be invoked by the Factorization function, the integers in the list are first tried to see whether factors can be easily found. One may also give prime factors to Magma to store in this list via the following procedure.

StoreFactor(n) : RngIntElt ->
StoreFactor(S) : [ RngIntElt ] ->
(Procedure.) Store the single integer n or the integers in the set/sequence S in the list of factors to be tried by the Factorization function. Each integer must be a positive prime.
GetStoredFactors() : -> [ RngIntElt ]
Return a sequence containing the currently stored integers.
ClearStoredFactors() : ->
Clear the list of stored integer factors.

Specific Factorization Algorithms

In this node we discuss how various factorization algorithms can be accessed individually. Generally these function should not be used for ordinary factorization (for that use Factorization discussed in the previous node), but they can be used for experimentation, or to build a personal factorization function with control over each of the methods used.

On some functions a little preprocessing is done to ensure that the argument is composite, that powers of 2 (and sometimes 3) are taken out and that the integer to be factored is not the power of an integer.

For each of these functions the Proof (default true) and Bases parameters can be used to indicate that primality of prime factors need not be rigorously proved, and how many bases should be used in the compositeness test, as discussed in the node on IsPrime.

SetVerbose("Cunningham", b) : MonStgElt, BoolElt ->
SetVerbose("ECM", b) : MonStgElt, Elt ->
SetVerbose("MPQS", b) : MonStgElt, Elt ->
Using this procedure to set either of the verbose flags "Cunningham", "ECM" or "MPQS", (which are false by default) enables the user to obtain progress information on attempts to factor integers using the `Cunningham' method, ECM or MPQS.
Cunningham(b, k, c) : RngIntElt, RngIntElt, RngIntElt -> SeqEnum
This function attempts to factor n=bk + c, where c∈{∓ 1} and b and k are not too big. This function uses R. Brent's factor algorithm [BtR92], which employs a combination of table-lookups and attempts at `algebraic' factorization (Aurifeuillian techniques). An error results if the tables, containing most of the known factors for numbers of this form (including the `Cunningham tables'), cannot be located by the system. The function will always return the complete prime factorization (in the form of a factorization sequence) of the number n (but it may take very long before it completes); it should be pointed out, however, that the primes appearing in the factorization are only probable primes and a rigorous primality prover has not been applied.
AssertAttribute(RngInt, "CunninghamStorageLimit", l) : Cat, MonStgElt, RngIntElt ->
This attribute is used to change the number of Cunningham factorizations which are stored in Magma. Normally, Magma stores a certain number of factorizations computed by the Cunningham intrinsic function so that commonly needed factorizations can be recalled quickly. When the stored list fills up, the factorization least recently accessed is removed from the list. Setting this attribute to zero ensures that no storage is done. The default value is 20.
TrialDivision(n) : RngIntElt -> RngIntEltFact, [ RngIntElt ]
TrialDivision(n, B) : RngIntElt, RngIntElt -> RngIntEltFact, [ RngIntElt ]
    Proof: BoolElt                      Default: true
    Bases: RngIntElt                    Default: 20
The integer n not=0 is subjected to trial division by primes up to a certain bound B (the sign of n is ignored). If only the argument n is given, B is taken to be 10000. The function returns a factorization sequence and a sequence containing an unfactored composite that remains.
PollardRho(n) : RngIntElt -> RngIntEltFact, [ RngIntElt ]
PollardRho(n, c, s, k) : RngIntElt, RngIntElt, RngIntElt, RngIntElt -> RngIntEltFact, [ RngIntElt ]
    Proof: BoolElt                      Default: true
    Bases: RngIntElt                    Default: 20
The ρ-method of Pollard is invoked by this function to find the factorization of an integer n>1. For this method a quadratic function x2 + c is iterated k times, with starting value x=s. If only n is used as argument to the function, the default values c=1, s=1, and k=8191 are selected. A speed-up to the original algorithm, due to R.P. Brent [Bre80], is implemented. The function returns two values: a factorization sequence and a sequence containing unfactored composite factors.
pMinus1(n, B1) : RngIntElt, RngIntElt -> RngIntElt
    x0: RngIntElt                       Default: 
    B2: RngIntElt                       Default: 
    k: RngIntElt                        Default: 
Given an integer n>1, an attempt to find a factor is made using Paul Zimmermann's GMP-ECM implementation of Pollard's p - 1 method. If a factor f with 1<f<n is found, then f is returned; otherwise 0 is returned.

The Step 1 bound B1 is given as the second argument B1. By default, the Step 2 bound B2 is optimally chosen, but may be given with the parameter B2 instead. By default, an optimal number of blocks is chosen for Step 2, but this may be overridden via the parameter k (see the function ECM). The base x0 is chosen randomly by default, but may instead be supplied via the parameter x0.

This method will return a prime factor p of n if p - 1 has all its prime factors less than or equal to the Step 1 bound B1, except for one factor which may be less than or equal to the Step 2 bound B2.

pPlus1(n, B1) : RngIntElt, RngIntElt -> RngIntElt
    x0: RngIntElt                       Default: 
    B2: RngIntElt                       Default: 
    k: RngIntElt                        Default: 
Given an integer n>1, an attempt to find a factor is made using Paul Zimmermann's GMP-ECM implementation of Williams' p + 1 method. If a factor f with 1<f<n is found, then f is returned; otherwise 0 is returned.

The Step 1 bound B1 is given as the second argument B1. By default, the Step 2 bound B2 is optimally chosen, but may be given with the parameter B2 instead. By default, an optimal number of blocks is chosen for Step 2, but this may be overridden via the parameter k (see the function ECM). The base x0 is chosen randomly by default, but may instead be supplied via the parameter x0.

This method may return a prime factor p of n if p + 1 has all its prime factors less than or equal to the Step 1 bound B1, except for one factor which may be less than or equal to the Step 2 bound B2. A base x0 is used, and not all bases will succeed: only half of the bases work (namely those where the Jacobi symbol of x02 - 4 and p is -1.) Unfortunately, since p is usually not known in advance, there is no way to ensure that this holds. However, if the base is chosen randomly, there is a probability of about 1/2 that it will give a Jacobi symbol of -1 (so that the factor p would be found assuming that p + 1 is smooth enough). A rule of thumb is to run pPlus1 three times with different random bases.

SQUFOF(n) : RngIntElt -> RngIntEltFact, [ RngIntElt ]
SQUFOF(n, k) : RngIntElt, RngIntElt -> RngIntEltFact, [ RngIntElt ]
    Proof: BoolElt                      Default: true
    Bases: RngIntElt                    Default: 20
This is a fast implementation of Shanks's square form factorization method that will only work for integers n>1 less than 22b - 2, where b is the number of bits in a long (which is either 32 or 64). The argument k may be used to specify the maximum number of iterations used to find the square; by default it is 200000. The expected number of iterations is O(N1/4).
ECM(n, B1) : RngIntElt, RngIntElt -> RngIntElt, RngIntElt
    Sigma: RngIntElt                    Default: 
    x0: RngIntElt                       Default: 
    B2: RngIntElt                       Default: 
    k: RngIntElt                        Default: 2
Given an integer n>1, an attempt is made to find a factor using the GMP-ECM implementation of the Elliptic Curve Method (ECM). If a factor f with 1<f<n is found, then f is returned together with the corresponding successful σ seed; otherwise 0 is returned.

The Step 1 bound B1 is given as the second argument B1. By default, the Step 2 bound B2 is optimally chosen, but may be given with the parameter B2 instead.

The elliptic curve used is defined by Suyama's parametrization and is determined by a parameter σ. By default, σ is chosen randomly with 0 < σ < 232, but an alternative positive integer may be supplied instead via the parameter Sigma. Let u = σ2 - 5, v = 4σ and a = (v - u)3(3u + v)/(4u3v) - 2. The starting point used is (x0:1), where by default x0=u3/v3, but x0 may instead be supplied via the parameter x0. Finally, the curve used is by2 = x3 + ax2 + x, where b = x03 + ax02 + x0.

Step 1 uses very little memory, but Step 2 may use a large amount of memory, especially for large B2, since its efficient algorithms use some large tables. To reduce the memory usage of Step 2, one may increase the parameter k, which controls the number of "blocks" used. Multiplying the default value of k by 4 will decrease the memory usage by a factor of 2. For example, with B2=1010 and a 155-digit number n, Step 2 requires about 96MB with the default k=2, but only 42MB with k=8. Increasing k does, however, slightly increase the time required for Step 2.

ECMSteps(n, L, U) : RngIntElt, RngIntElt, RngIntElt -> RngIntElt, RngIntElt
Given an integer n>1, an attempt to find a factor of n is made by repeated calls to {ECM}. The initial B1 bound is taken to be L, and subsequently B1 is replaced with B1 + ⌊√B1 ⌋ at each step. If a factor is found at any point, then this is returned with the corresponding successful σ seed; otherwise, if B1 becomes greater than the upper bound U, then 0 is returned.
MPQS(n) : RngIntElt -> RngIntEltFact, [ RngIntElt ]
    Proof: BoolElt                      Default: true
    Bases: RngIntElt                    Default: 20
This function can be used to drive Arjen Lenstra's implementation of the multiple polynomial quadratic sieve MPQS. Given an integer n>5.1024 an attempt is made to find the prime factorization of n using MPQS. The function returns two values: a factorization sequence and a sequence containing unfactored composite factors.

Factorization Related Functions

ECMOrder(p, s) : RngIntElt, RngIntElt -> RngIntElt
ECMFactoredOrder(p, s) : RngIntElt, RngIntElt -> RngIntElt
Suppose p is a prime factor found by the ECM algorithm and such that the σ value determining the successful curve was s. These functions compute the order of the corresponding elliptic curve. The first function returns the order as an integer, while the second function returns the factorization of the order. In general, this order will have been smooth with respect to the relevant bounds for the ECM algorithm to have worked, and these functions allow one to examine how small the prime divisors of the curve order really are.
PrimeBasis(n) : RngIntElt -> [RngIntElt]
PrimeDivisors(n) : RngIntElt -> [RngIntElt]
A sequence containing the distinct prime divisors of the positive integer |n|, given in increasing order.
Divisors(n) : RngIntElt -> [ RngIntElt ]
Divisors(f) : RngIntEltFact -> [ RngIntElt ]
Returns a sequence containing all divisors of the positive integer, including 1 and the integer itself, given in increasing order. The argument given must be either the integer n itself, or a factorization sequence f representing it.
CoprimeBasis(S) : [ RngIntElt ] -> [ RngIntElt ]
CoprimeBasis(S) : { RngIntElt } -> [ RngIntElt ]
Given a set or sequence S of integers, return a coprime basis of S in the form of a factorization sequence Q whose integer value is the same as the product of the elements of S but Q has coprime bases (i.e., the first components of tuples from Q are coprime).

Example RngInt_Perfect (H19E7)

In this example we use the Divisors function together with the &+ reduction of sequences to find the first few perfect numbers, that is, numbers n such that the sum of the divisors less than n equals n.
> { x : x in [2..1000] | &+Divisors(x) eq 2*x };
{ 6, 28, 496 }
> f := Factorization(496);
> f;
[ <2, 4>, <31, 1> ]
> Divisors(f);
[ 1, 2, 4, 8, 16, 31, 62, 124, 248, 496 ]
PartialFactorization(S) : [ RngIntElt ] -> [ RngIntEltFact ]
Given a sequence of non-zero integers S, return, for each integer S[i], two factorization lists Fi and Gi, such that S[i] = Facint(Fi) * Facint(Gi). All the divisors in Fi are square factors, and, for any i and j, the divisors in Gi and Gj are either equal or are pairwise coprime. In other terms, PartialFactorization(S) provides a partial decomposition of the integers in S in square and coprime factors. The interesting fact is that this factorization uses only gcd and exact integer division. This algorithm is due to J.E. Cremona.

Example RngInt_PartialFact (H19E8)

A partial factorization is shown.
> PartialFactorization([1380, 675, 3408, 654]);
[
  [
    [ <2, 2> ],
    [ <115, 1>, <3, 1> ]
  ],
  [
    [ <5, 2>, <3, 2> ],
    [ <3, 1> ]
  ],
  [
    [ <2, 4> ],
    [ <71, 1>, <3, 1> ]
  ],
  [
    [],
    [ <218, 1>, <3, 1> ]
  ]
]
V2.28, 13 July 2023