--- /dev/null Thu Jan 01 00:00:00 1970 +0000 |
+++ b/Modules/_decimal/libmpdec/literature/matrix-transform.txt Sat Mar 10 18:12:20 2012 +0100 |
@@ -0,0 +1,256 @@ |
+ |
+ |
+(* Copyright (c) 2011 Stefan Krah. All rights reserved. *) |
+ |
+ |
+The Matrix Fourier Transform: |
+============================= |
+ |
+In libmpdec, the Matrix Fourier Transform [1] is called four-step transform |
+after a variant that appears in [2]. The algorithm requires that the input |
+array can be viewed as an R*C matrix. |
+ |
+All operations are done modulo p. For readability, the proofs drop all |
+instances of (mod p). |
+ |
+ |
+Algorithm four-step (forward transform): |
+---------------------------------------- |
+ |
+ a := input array |
+ d := len(a) = R * C |
+ p := prime |
+ w := primitive root of unity of the prime field |
+ r := w**((p-1)/d) |
+ A := output array |
+ |
+ 1) Apply a length R FNT to each column. |
+ |
+ 2) Multiply each matrix element (addressed by j*C+m) by r**(j*m). |
+ |
+ 3) Apply a length C FNT to each row. |
+ |
+ 4) Transpose the matrix. |
+ |
+ |
+Proof (forward transform): |
+-------------------------- |
+ |
+ The algorithm can be derived starting from the regular definition of |
+ the finite-field transform of length d: |
+ |
+ d-1 |
+ ,---- |
+ \ |
+ A[k] = | a[l] * r**(k * l) |
+ / |
+ `---- |
+ l = 0 |
+ |
+ |
+ The sum can be rearranged into the sum of the sums of columns: |
+ |
+ C-1 R-1 |
+ ,---- ,---- |
+ \ \ |
+ = | | a[i * C + j] * r**(k * (i * C + j)) |
+ / / |
+ `---- `---- |
+ j = 0 i = 0 |
+ |
+ |
+ Extracting a constant from the inner sum: |
+ |
+ C-1 R-1 |
+ ,---- ,---- |
+ \ \ |
+ = | r**k*j * | a[i * C + j] * r**(k * i * C) |
+ / / |
+ `---- `---- |
+ j = 0 i = 0 |
+ |
+ |
+ Without any loss of generality, let k = n * R + m, |
+ where n < C and m < R: |
+ |
+ C-1 R-1 |
+ ,---- ,---- |
+ \ \ |
+ A[n*R+m] = | r**(R*n*j) * r**(m*j) * | a[i*C+j] * r**(R*C*n*i) * r**(C*m*i) |
+ / / |
+ `---- `---- |
+ j = 0 i = 0 |
+ |
+ |
+ Since r = w ** ((p-1) / (R*C)): |
+ |
+ a) r**(R*C*n*i) = w**((p-1)*n*i) = 1 |
+ |
+ b) r**(C*m*i) = w**((p-1) / R) ** (m*i) = r_R ** (m*i) |
+ |
+ c) r**(R*n*j) = w**((p-1) / C) ** (n*j) = r_C ** (n*j) |
+ |
+ r_R := root of the subfield of length R. |
+ r_C := root of the subfield of length C. |
+ |
+ |
+ C-1 R-1 |
+ ,---- ,---- |
+ \ \ |
+ A[n*R+m] = | r_C**(n*j) * [ r**(m*j) * | a[i*C+j] * r_R**(m*i) ] |
+ / ^ / |
+ `---- | `---- 1) transform the columns |
+ j = 0 | i = 0 |
+ ^ | |
+ | `-- 2) multiply |
+ | |
+ `-- 3) transform the rows |
+ |
+ |
+ Note that the entire RHS is a function of n and m and that the results |
+ for each pair (n, m) are stored in Fortran order. |
+ |
+ Let the term in square brackets be f(m, j). Step 1) and 2) precalculate |
+ the term for all (m, j). After that, the original matrix is now a lookup |
+ table with the mth element in the jth column at location m * C + j. |
+ |
+ Let the complete RHS be g(m, n). Step 3) does an in-place transform of |
+ length n on all rows. After that, the original matrix is now a lookup |
+ table with the mth element in the nth column at location m * C + n. |
+ |
+ But each (m, n) pair should be written to location n * R + m. Therefore, |
+ step 4) transposes the result of step 3). |
+ |
+ |
+ |
+Algorithm four-step (inverse transform): |
+---------------------------------------- |
+ |
+ A := input array |
+ d := len(A) = R * C |
+ p := prime |
+ d' := d**(p-2) # inverse of d |
+ w := primitive root of unity of the prime field |
+ r := w**((p-1)/d) # root of the subfield |
+ r' := w**((p-1) - (p-1)/d) # inverse of r |
+ a := output array |
+ |
+ 0) View the matrix as a C*R matrix. |
+ |
+ 1) Transpose the matrix, producing an R*C matrix. |
+ |
+ 2) Apply a length C FNT to each row. |
+ |
+ 3) Multiply each matrix element (addressed by i*C+n) by r**(i*n). |
+ |
+ 4) Apply a length R FNT to each column. |
+ |
+ |
+Proof (inverse transform): |
+-------------------------- |
+ |
+ The algorithm can be derived starting from the regular definition of |
+ the finite-field inverse transform of length d: |
+ |
+ d-1 |
+ ,---- |
+ \ |
+ a[k] = d' * | A[l] * r' ** (k * l) |
+ / |
+ `---- |
+ l = 0 |
+ |
+ |
+ The sum can be rearranged into the sum of the sums of columns. Note |
+ that at this stage we still have a C*R matrix, so C denotes the number |
+ of rows: |
+ |
+ R-1 C-1 |
+ ,---- ,---- |
+ \ \ |
+ = d' * | | a[j * R + i] * r' ** (k * (j * R + i)) |
+ / / |
+ `---- `---- |
+ i = 0 j = 0 |
+ |
+ |
+ Extracting a constant from the inner sum: |
+ |
+ R-1 C-1 |
+ ,---- ,---- |
+ \ \ |
+ = d' * | r' ** (k*i) * | a[j * R + i] * r' ** (k * j * R) |
+ / / |
+ `---- `---- |
+ i = 0 j = 0 |
+ |
+ |
+ Without any loss of generality, let k = m * C + n, |
+ where m < R and n < C: |
+ |
+ R-1 C-1 |
+ ,---- ,---- |
+ \ \ |
+ A[m*C+n] = d' * | r' ** (C*m*i) * r' ** (n*i) * | a[j*R+i] * r' ** (R*C*m*j) * r' ** (R*n*j) |
+ / / |
+ `---- `---- |
+ i = 0 j = 0 |
+ |
+ |
+ Since r' = w**((p-1) - (p-1)/d) and d = R*C: |
+ |
+ a) r' ** (R*C*m*j) = w**((p-1)*R*C*m*j - (p-1)*m*j) = 1 |
+ |
+ b) r' ** (C*m*i) = w**((p-1)*C - (p-1)/R) ** (m*i) = r_R' ** (m*i) |
+ |
+ c) r' ** (R*n*j) = r_C' ** (n*j) |
+ |
+ d) d' = d**(p-2) = (R*C) ** (p-2) = R**(p-2) * C**(p-2) = R' * C' |
+ |
+ r_R' := inverse of the root of the subfield of length R. |
+ r_C' := inverse of the root of the subfield of length C. |
+ R' := inverse of R |
+ C' := inverse of C |
+ |
+ |
+ R-1 C-1 |
+ ,---- ,---- 2) transform the rows of a^T |
+ \ \ |
+ A[m*C+n] = R' * | r_R' ** (m*i) * [ r' ** (n*i) * C' * | a[j*R+i] * r_C' ** (n*j) ] |
+ / ^ / ^ |
+ `---- | `---- | |
+ i = 0 | j = 0 | |
+ ^ | `-- 1) Transpose input matrix |
+ | `-- 3) multiply to address elements by |
+ | i * C + j |
+ `-- 3) transform the columns |
+ |
+ |
+ |
+ Note that the entire RHS is a function of m and n and that the results |
+ for each pair (m, n) are stored in C order. |
+ |
+ Let the term in square brackets be f(n, i). Without step 1), the sum |
+ would perform a length C transform on the columns of the input matrix. |
+ This is a) inefficient and b) the results are needed in C order, so |
+ step 1) exchanges rows and columns. |
+ |
+ Step 2) and 3) precalculate f(n, i) for all (n, i). After that, the |
+ original matrix is now a lookup table with the ith element in the nth |
+ column at location i * C + n. |
+ |
+ Let the complete RHS be g(m, n). Step 4) does an in-place transform of |
+ length m on all columns. After that, the original matrix is now a lookup |
+ table with the mth element in the nth column at location m * C + n, |
+ which means that all A[k] = A[m * C + n] are in the correct order. |
+ |
+ |
+-- |
+ |
+ [1] Joerg Arndt: "Matters Computational" |
+ http://www.jjj.de/fxt/ |
+ [2] David H. Bailey: FFTs in External or Hierarchical Memory |
+ http://crd.lbl.gov/~dhbailey/dhbpapers/ |
+ |
+ |
+ |