Lanczos: BoolElt Default: false
Given a sparse m x n matrix A, defined over the integer ring Z, and a positive integer M, compute a vector v such that v satisfies the equation v.Atr=0 modulo M. v will be non-zero with high probability.This function is designed for index-calculus-type algorithms where a large sparse linear system defined by the matrix A is first constructed and then a vector satisfying the above equation modulo M is required. For this reason it is natural that the transpose of A appears in this equation. The example H28E3 below illustrates such a situation in detail.
The first version of the function takes the actual integer M as the second argument given and so must be factored as part of the calculation, while the second version of the function takes the factorization sequence L of M. If possible, the solution vector is multiplied by a unit modulo M so that its first entry is 1.
The function uses the Structured Gaussian Elimination algorithm [LO91b, Sec. 5]. This reduces the linear system to be solved to a much smaller but denser system. By default, the function recurses on the smaller system until it is almost completely dense, and then this dense system is solved using the fast dense modular nullspace algorithm of Magma.
If the parameter Lanczos is set to true, then the Lanczos algorithm [LO91b, Sec. 3] will be used instead. This is generally very much slower than the default method (it is often 10 to 50 times slower), but it will take considerably less memory, so may be preferable in the case of extremely large matrices.
For typical matrices arising in index-calculus problems, and for most machines, the default method (reducing to a completely dense system) should solve a linear system of size roughly 100, 000 x 100, 000 using about 500MB of memory while a linear system of size roughly 200, 000 x 200, 000 should require about 1.5GB to 2.0GB of memory.
We present Magma code which performs the first stage of the basic linear sieve [COS86], [LO91a] for computing discrete logarithms in a prime finite field GF(p). This first stage determines most of the logarithms of the elements of the factor basis (which is the set of small primes up to a given limit) by using a sieve to compute a sparse linear system which is then solved modulo p - 1.
Even though the following sieving code is written in the Magma language and so is much slower than a serious C implementation, it is sufficiently powerful to be able to compute the first stage logarithms in less than a minute for fields GF(p) where p is about 1020 and (p - 1)/2 is prime. In comparison, the standard Pohlig-Hellman algorithm based on the Pollard-Rho method would take many hours for such fields. This Magma code can also be adapted for other index-calculus methods.
Suppose p is an odd prime and let K=GF(p). Let Q be the factor base, an ordered set consisting of all positive primes from 2 to a given limit qlimit. Fix a primitive element α of K which is also prime as an integer, so α is in Q. We wish to compute the discrete logarithms of most of the elements of Q with respect to α; i.e., we wish to compute lq with αlq=q for most q∈Q.
The algorithm uses a sieve to search for linear relations involving the logarithms of the elements of Q. Let H=⌊√p⌋ + 1 and J=H2 - p. We search for pairs of integers (c1, c2) with 1≤c1, c2 ≤ climit (where climit is a given limit which is much less than H) such that
[(H + c1)(H + c2)] mod p = J + (c1 + c2)H + c1 c2
is smooth with respect to Q (i.e., all of its prime factors are in Q). If we include these H + ci in the factor base, then this gives a linear relation modulo (p - 1) among the logarithms of the elements of the extended factor base.
Fix c1 with 1≤c ≤ climit and suppose we initialize a sieve array (to be indexed by c2) to have zero in each position. For each prime power qh with q∈Q and h sufficiently small, we compute
d = [(J + c1 H)(H + c1) - 1] mod qh.
Then for all c2 ≡ d mod (qh), it turns out that
(H + c1)(H + c2) ≡ 0 mod (qh).
So for each c2 with c1≤c2≤ climit and c2 ≡ d mod (qh), we add log(q) to the position of the sieve corresponding to c2. (Ensuring that c2≥c1 avoids redundant relations.) When we have done this for each q∈Q, we perform trial division to obtain relations for each of the c2 whose sieve values are beneath a suitable threshold.
We repeat this with a new c1 value until we have more relations than elements of the factor base (typically we make the ratio be 1.1 or 1.2), then we solve the corresponding linear system modulo M=p - 1 to obtain the desired logarithms. We ensure that α is the first element of Q so that when the vector is normalized modulo M (so that its first entry is 1), the logarithms will be with respect to α. For derivations of the above equations and for further details concerning the sieving, see [LO91a].
In practice, one first writes M=p - 1=M1.M2 where M1 contains the maximal powers of all the small primes dividing M (say, for primes ≤10000). The solution space modulo M1 will often have high dimension, so the logarithms modulo M1 usually cannot be correctly computed from the linear system alone. So we only compute the solution of the linear system modulo M2. It is still possible that some logarithms cannot be determined modulo M2 (e.g., if 2 unknowns occur only in one equation), but usually most of the logarithms will be correctly computed modulo M2. Then the logarithm of each factor basis element can be easily computed modulo M1 by the Pohlig-Hellman algorithm, and the Chinese Remainder Theorem can be used to combine these with the correct modulo-M2 logarithms to compute the logarithms modulo M of most elements of the factor basis.
Similar index-calculus-type algorithms should have techniques for handling small prime divisors of the modulus M when the solution of the linear system has high nullity modulo these small primes.
The following function Sieve has been developed by Allan Steel, based on code by Benjamin Costello. Its arguments are the field K=GF(p), the factor base prime limit qlimit, the c1, c2 range limit climit, and the desired stopping ratio of relations to unknowns. The function returns the sparse relation matrix A together with an indexed set containing the corresponding extended factor base (the small primes and the H + ci values which yield relations).
> function Sieve(K, qlimit, climit, ratio) > p := #K; > Z := Integers(); > H := Iroot(p, 2) + 1; > J := H^2 - p; > > // Get factor basis of all primes <= qlimit. > fb_primes := [p: p in [2 .. qlimit] | IsPrime(p)]; > > printf "Factor base has %o primes, climit is %o n", #fb_primes, climit; > > // Ensure that the primitive element of K is prime (as an integer). > a := rep{x: x in [2..qlimit] | IsPrime(x) and IsPrimitive(K!x)}; > SetPrimitiveElement(K,K!a); > > // Initialize extended factor base FB to fb_primes (starting with a). > FB := {@ Z!a @}; > for x in fb_primes do > Include(~FB, x); > end for; > > // Initialize A to 0 by 0 sparse matrix over Z. > A := SparseMatrix(); > > // Get logs of all factor basis primes. > log2 := Log(2.0); > logqs := [Log(q)/log2: q in fb_primes]; > > for c1 in [1 .. climit] do > > // Stop if ratio of relations to unknowns is high enough. > if Nrows(A)/#FB ge ratio then break; end if; > > if c1 mod 50 eq 0 then > printf "c1: %o, #rows: %o, #cols: %o, ratio: %o n", > c1, Nrows(A), #FB, RealField(8)!Nrows(A)/#FB; > end if; > > // Initialize sieve. > sieve := [z: i in [1 .. climit]] where z := Log(1.0); > den := H + c1; // denominator of relation > num := -(J + c1*H); // numerator > > for i := 1 to #fb_primes do > // For each prime q in factor base... > q := fb_primes[i]; > logq := logqs[i]; > > qpow := q; > while qpow le qlimit do > // For all powers qpow of q up to qlimit... > > if den mod qpow eq 0 then break; end if; > c2 := num * Modinv(den, qpow) mod qpow; > if c2 eq 0 then c2 := qpow; end if; > > nextqpow := qpow*q; > // Ensure c2 >= c1 to remove redundant relations. > while c2 lt c1 do > c2 +:= qpow; > end while; > > while c2 le #sieve do > // Add logq into sieve for c2. > sieve[c2] +:= logq; > > // Test higher powers of q if nextqpow is too large. > if nextqpow gt qlimit then > prod := (J + (c1 + c2)*H + c1*c2) mod p; > nextp := nextqpow; > while prod mod nextp eq 0 do > sieve[c2] +:= logq; > nextp *:= q; > end while; > end if; > c2 +:= qpow; > end while; > qpow := nextqpow; > end while; > end for; > > // Check sieve for full factorizations. > rel := den * (H + 1); // the relation > relinc := H + c1; // add to relation to get next relation > count := 0; > for c2 in [1 .. #sieve] do > n := rel mod p; > if Abs(sieve[c2] - Ilog2(n)) lt 1 then > fact, r := TrialDivision(n, qlimit); > if r eq [] and (#fact eq 0 or fact[#fact][1] lt qlimit) then > // Include each H + c_i in extended factor basis. > Include(~FB, H + c1); > Include(~FB, H + c2); > > // Include relation (H + c1)*(H + c2) = fact. > row := Nrows(A) + 1; > for t in fact do > SetEntry(~A, row, Index(FB, t[1]), t[2]); > end for; > if c1 eq c2 then > SetEntry(~A, row, Index(FB, H + c1), -2); > else > SetEntry(~A, row, Index(FB, H + c1), -1); > SetEntry(~A, row, Index(FB, H + c2), -1); > end if; > end if; > end if; > rel +:= relinc; > end for; > end for; > > // Check matrix by multiplying out relations in field. > assert {&*[(K!FB[j])^A[k, j]: j in Support(A, k)]: k in [1..Nrows(A)]} > eq {1}; > > return A, FB; > end function;We first apply the function to a trivial example to illustrate the main points. We let K be GF(103), and we make the factor basis include primes up to 35, the ci range be up to 27, and the stopping ratio be 1.1. We first compute the relation matrix A and the extended factor basis F.
> K := GF(103); > A, F := Sieve(K, 35, 27, 1.1); Factor base has 11 primes, climit is 27 > A; Sparse matrix with 33 rows and 30 columns over Integer RingWe examine a few rows of A and the extended factor basis F. Note that 5 is the smallest prime which is primitive in K, so it has been inserted first in F.
> A[1]; A[2]; A[30]; (1 0 0 0 0 1 0 0 0 0 0 -1 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0) (0 0 0 1 1 0 0 0 0 0 0 -1 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0) (0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 -1 0 0 0 0 0) > F; {@ 5, 2, 3, 7, 11, 13, 17, 19, 23, 29, 31, 12, 14, 15, 16, 21, 38, 20, 26, 35, 22, 33, 34, 24, 25, 28, 30, 37, 32, 36 @}
Next we compute a solution vector v such that v.Atr = 0 modulo M=#K - 1.
> Factorization(#K-1); [ <2, 1>, <3, 1>, <17, 1> ] > v := ModularSolution(A, #K - 1); > v; (1 44 39 4 61 72 70 80 24 86 57 25 48 40 74 43 22 0 1 5 3 100 12 69 2 92 84 93 16 64)Notice that 0 occurs in v, so the corresponding logarithm is not known. The vector is normalized so that the logarithm of 5 is 1, as desired. We finally compute the powers of the primitive element of K by each element of v and check that all of these match the entries of F except for the one missed entry. Note also that because M has small prime divisors, it is fortunate that the logarithms are all correct, apart from the missed one.
> a := PrimitiveElement(K); > a; 5 > Z := IntegerRing(); > [a^Z!x: x in Eltseq(v)]; [ 5, 2, 3, 7, 11, 13, 17, 19, 23, 29, 31, 12, 14, 15, 16, 21, 38, 1, 5, 35, 22, 33, 34, 24, 25, 28, 30, 37, 32, 36 ] > F; {@ 5, 2, 3, 7, 11, 13, 17, 19, 23, 29, 31, 12, 14, 15, 16, 21, 38, 20, 26, 35, 22, 33, 34, 24, 25, 28, 30, 37, 32, 36 @}
Finally, we take K=GF(p), where p is the smallest prime above 1020 such that (p - 1)/2 is prime. The sparse matrix constructed here is close to 1000 x 1000, but the solution vector is still found in less than a second.
> K := GF(100000000000000000763); > Factorization(#K - 1); [ <2, 1>, <50000000000000000381, 1> ] > time A, F := Sieve(K, 2200, 800, 1.1); Factor base has 327 primes, climit is 800 c1: 50, #rows: 222, #cols: 555, ratio: 0.4 c1: 100, #rows: 444, #cols: 738, ratio: 0.60162602 c1: 150, #rows: 595, #cols: 836, ratio: 0.71172249 c1: 200, #rows: 765, #cols: 921, ratio: 0.83061889 c1: 250, #rows: 908, #cols: 973, ratio: 0.9331963 c1: 300, #rows: 1014, #cols: 1011, ratio: 1.003 c1: 350, #rows: 1105, #cols: 1023, ratio: 1.0802 Time: 3.990 > A; Sparse matrix with 1141 rows and 1036 columns over Integer Ring > time v := ModularSolution(A, #K - 1); Time: 0.170We observe that the list consisting of the primitive element powered by the computed logarithms agrees with the factor basis for at least the first 30 elements.
> a := PrimitiveElement(K); > a; 2 > v[1], v[2]; 1 71610399209536789314 > a^71610399209536789314; 3 > P := [a^x: x in Eltseq(v)]; > [P[i]: i in [1 .. 30]]; [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113 ] > [F[i]: i in [1 .. 30]]; [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113 ]There are in fact 8 logarithms which could not be computed correctly.
> [i: i in [1..#F] | P[i] ne F[i]]; [ 671, 672, 673, 737, 738, 947, 1024, 1025 ]A zero value appearing in the place of a logarithm indicates that the nullity of the system was larger than 1, even considered modulo (#K - 1)/2.
> [v[i]: i in [1..#F] | P[i] ne F[i]]; [ 0, 22076788376647522787, 73252391663364176895, 0, 33553634905886614528, 42960107136526083388, 0, 57276316725691590267 ] > [P[i]: i in [1..#F] | P[i] ne F[i]]; [ 1, 7480000052677, 7960000056517, 1, 5220000041437, 8938028430619715763, 1, 11360000275636 ]However, we have found the correct logarithms for nearly all factor basis elements. As pointed out above, the Pollard-Rho method applied to an element of the factor basis for this field would take many hours!