OLD | NEW |
(Empty) | |
| 1 |
| 2 |
| 3 Bignum support (Fast Number Theoretic Transform or FNT): |
| 4 ======================================================== |
| 5 |
| 6 Bignum arithmetic in libmpdec uses the scheme for fast convolution |
| 7 of integer sequences from: |
| 8 |
| 9 J. M. Pollard: The fast Fourier transform in a finite field |
| 10 http://www.ams.org/journals/mcom/1971-25-114/S0025-5718-1971-0301966-0/home.html |
| 11 |
| 12 |
| 13 The transform in a finite field can be used for convolution in the same |
| 14 way as the Fourier Transform. The main advantages of the Number Theoretic |
| 15 Transform are that it is both exact and very memory efficient. |
| 16 |
| 17 |
| 18 Convolution in pseudo-code: |
| 19 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 20 |
| 21 fnt_convolute(a, b): |
| 22 x = fnt(a) # forward transform of a |
| 23 y = fnt(b) # forward transform of b |
| 24 z = pairwise multiply x[i] and y[i] |
| 25 result = inv_fnt(z) # backward transform of z. |
| 26 |
| 27 |
| 28 Extending the maximum transform length (Chinese Remainder Theorem): |
| 29 ------------------------------------------------------------------- |
| 30 |
| 31 The maximum transform length is quite limited when using a single |
| 32 prime field. However, it is possible to use multiple primes and |
| 33 recover the result using the Chinese Remainder Theorem. |
| 34 |
| 35 |
| 36 Multiplication in pseudo-code: |
| 37 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 38 |
| 39 _mpd_fntmul(u, v): |
| 40 c1 = fnt_convolute(u, v, P1) # convolute modulo prime1 |
| 41 c2 = fnt_convolute(u, v, P2) # convolute modulo prime2 |
| 42 c3 = fnt_convolute(u, v, P3) # convolute modulo prime3 |
| 43 result = crt3(c1, c2, c3) # Chinese Remainder Theorem |
| 44 |
| 45 |
| 46 Optimized transform functions: |
| 47 ------------------------------ |
| 48 |
| 49 There are three different fnt() functions: |
| 50 |
| 51 std_fnt: "standard" decimation in frequency transform for array lengths |
| 52 of 2**n. Performs well up to 1024 words. |
| 53 |
| 54 sixstep: Cache-friendly algorithm for array lengths of 2**n. Outperforms |
| 55 std_fnt for large arrays. |
| 56 |
| 57 fourstep: Algorithm for array lengths of 3 * 2**n. Also cache friendly |
| 58 in large parts. |
| 59 |
| 60 |
| 61 List of bignum-only files: |
| 62 -------------------------- |
| 63 |
| 64 Functions from these files are only used in _mpd_fntmul(). |
| 65 |
| 66 umodarith.h -> fast low level routines for unsigned modular arithmetic |
| 67 numbertheory.c -> routines for setting up the FNT |
| 68 difradix2.c -> decimation in frequency transform, used as the |
| 69 "base case" by the following three files: |
| 70 |
| 71 fnt.c -> standard transform for smaller arrays |
| 72 sixstep.c -> transform large arrays of length 2**n |
| 73 fourstep.c -> transform arrays of length 3 * 2**n |
| 74 |
| 75 convolute.c -> do the actual fast convolution, using one of |
| 76 the three transform functions. |
| 77 transpose.c -> transpositions needed for the sixstep algorithm. |
| 78 crt.c -> Chinese Remainder Theorem: use information from three |
| 79 transforms modulo three different primes to get the |
| 80 final result. |
| 81 |
| 82 |
| 83 |
OLD | NEW |