FFTs using dm_array_fft()

The library dm_array.c provides unified access to several different FFT routines through the routine dm_array_fft(). This routine can be trusted to preserve energy in both forward and inverse transforms (meaning that the value of the sum of squares of all amplitudes is not changed). This routine is just a ``wrapper'' for one of several underlying FFT routines. FFTW3 (www.fftw.org) is reputed to be the fastest of the single-processor FFT routines available, so it is one good choice. However, FFTW3 is not MPI-enabled as of yet, although an older version (FFTW 2.1.5) provides this capability and is still available. Apple has apparently used FFTW 2.1.5 as the basis for a very fast library for Xserve G5 clusters called dist_fft; this library is MPI-enabled and it also exploits the pipelining features of the altivec unit on G5 processors. The performance of Apple's dist_fft routines is discussed in a http://images.apple.com/acg/pdf/20040827_GigaFFT.pdfwhite paper which we have copied into GigaFFTonG5.pdf in CVS diffmic/doc/recon, and they are available from the http://developer.apple.com/samplecode/dist_fft/dist_fft.htmldist_fft web page. There are two or three steps to using dm_array_fft:

  1. You must first allocate memory for the complex array structure using the mechanism described in Sec. 1.4.5. (And remember that these routines treat 2D arrays as 3D arrays with $nz=1$). You must then create a ``plan'' (a set of precalculated parameters and radix choices) for forward and inverse transforms:
      MPI_Comm mpi_comm;
      dm_array_fft(&array_3d_cas,
                   (DM_ARRAY_CREATE_FFT_PLAN | DM_ARRAY_FFT_MEASURE),
                   mpi_comm);
    
    The act of creating a plan destroys any data in complex_array, so you should create your plan before you start to load data into complex_array.

    When creating a plan, you can trade off slow, exhaustive testing of the best optimization strategy so as to do the fastest FFTs (DM_ARRAY_FFT_PATIENT), a balanced mix of faster planning and somewhat slower FFT execution (DM_ARRAY_FFT_MEASURE), and fast planning with slower FFT execution (DM_ARRAY_FFT_ESTIMATE); the defalt type is DM_ARRAY_FFT_PATIENT.

  2. You can then happily continue to do in-place forward and inverse FFTs on the same array:
    dm_array_fft(&array_3d_cas,DM_ARRAY_FORWARD_FFT,mpi_comm);
    dm_array_fft(&array_3d_cas,DM_ARRAY_INVERSE_FFT,mpi_comm);

  3. If you change the array dimensions $\texttt{nx} \times \texttt{ny} \times \texttt{nz}$, you must destroy the plan, free and then reallocate the array memory using the mechanisms described in Sec. 1.4.5, and create a new plan for the new array size. The way to destroy an existing plan is
    dm_array_fft(&array_3d_cas,DM_ARRAY_DESTROY_FFT_PLAN,mpi_comm);

FFTW (non MPI) is used by dm_array_fft() as default , unless -DDIST_FFT is specified while compiling to use Apple DIST_FFT routines. Doing FFT with FFTW is tested in Cygwin environment. The routine takes care of renormalization and recentering array. Working with DIST_FFT is tested on Apple Cluster up to 32 processes (for details, please refer to Appendix B). The routine renormalizes FFT result, but dose not recenter array, which is shifted by n/2 along X, Y and Z directions.

Microscope User 2008-11-25