The file structure of CLUSTEREASY is the same as it is for LATTICEEASY, except for the addition of a file called mpiutil.cpp with functions for communicating between processors. All of the files have been modified in the parallel version, however, so you need to download the CLUSTEREASY files from the LATTICEEASY download page.
As noted above, CLUSTEREASY requires a cluster using MPI. On such a cluster there should typically be a script called mpiCC for compiling MPI C++ programs. The makefile that comes with CLUSTEREASY uses this command, so if your cluster is set up differently you may need to modify it accordingly.
You will also need a freely available set of Fourier Transform routines called FFTW. You can contact your system administrator if this is not already installed on your cluster. CLUSTEREASY calls FFTW for Fourier Transforms in 2D and 3D. FFTW doesn't have a 1D parallel FFT, so for 1D CLUSTEREASY uses a parallelized version of the FFTEASY routines that come with LATTICEEASY. Note that the parallel FFTEASY routine does not scale well, so large runs in 1D may become very slow.
Note that FFTW can be installed in three different ways: single-precision, double-precision, or both. When it is installed for both, all compiler flags for FFTW must include a letter ``s'' or ``d'' that specifies whether single or double precision is being used. By default CLUSTEREASY assumes that FFTW is installed for single and double precision. If it is installed for single precision only on your system you should remove the letter ``s'' from all the FFTW compiler flags in the makefile and FFTW header names in latticeeasy.h.
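If FFTW is installed for both precisions and you wish to do runs with double precision, you need to add the line
#define float double
to latticeeasy.h and ffteasy.cpp, just below the list of header files. As noted above, the FFTW compiler flags in the makefile and the FFTW header names in latticeeasy.h must then use the letter ``d'' in place of ``s''.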
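The effect of the #define float double line can be illustrated with a small standalone sketch. This is not CLUSTEREASY code: the names example_field_value, bytes_per_field_value, and check_double_precision are hypothetical and exist only for this illustration. Putting the definition below the list of header files presumably keeps it from altering declarations inside the system headers, since a macro only affects the text that follows it.

```cpp
#include <cassert>
#include <cstddef>

// Precision switch, placed below the headers so that only the code
// that follows it is affected.
#define float double

// Hypothetical stand-in for a field value; not a CLUSTEREASY variable.
// It is written as float but compiled as double.
float example_field_value = 0.0;

// Bytes used to store one field value under the current setting.
std::size_t bytes_per_field_value() {
    return sizeof(example_field_value);
}

// Sanity check that the switch took effect.
void check_double_precision() {
    assert(bytes_per_field_value() == sizeof(double));
}

// Removed again only so this sketch does not leak the macro into other
// code; the change described above has no such line.
#undef float
```

Calling check_double_precision() once at startup is one possible way to confirm that a build really uses double precision before starting a long run.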
If you have mpiCC and FFTW set up properly on your system then you
should be able to compile CLUSTEREASY simply by typing ``make.'' Note
that you do not specify the number of processors when you compile the
program, but rather on the command line when you run it. Your cluster
should have instructions for executing MPI programs, but on most
systems the command is
mpirun -np <number of processors> latticeeasy
Note that more processors does not always mean faster performance. If you use too many processors, the program will spend more time communicating between processors than evolving the fields. You may have to do some trial and error for your particular problems, but a good rule of thumb is that you probably won't get much benefit from using more processors than , where $N$ is the number of gridpoints along an edge. Also, you will get slightly better performance per processor if the number of processors is a factor of $N$, so that the processors can divide the lattice up evenly.
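The even-division condition is easy to check before submitting a job. The following is a minimal sketch, not part of CLUSTEREASY: the function names divides_evenly and largest_even_split are hypothetical, and the upper limit max_procs is supplied by the caller rather than taken from the rule of thumb above.

```cpp
#include <cassert>

// N is the number of gridpoints along an edge; nprocs is the number of
// MPI processes requested on the command line.
// Returns true when nprocs is a factor of N.
bool divides_evenly(int N, int nprocs) {
    assert(N > 0 && nprocs > 0);
    return N % nprocs == 0;
}

// Returns the largest process count not exceeding max_procs (and not
// exceeding N) that is a factor of N. Falls back to 1, which always
// divides N.
int largest_even_split(int N, int max_procs) {
    assert(N > 0 && max_procs > 0);
    int start = (max_procs < N) ? max_procs : N;
    for (int p = start; p > 1; --p) {
        if (N % p == 0) {
            return p;
        }
    }
    return 1;
}
```

For example, with 256 gridpoints along an edge and at most 12 processors available, largest_even_split(256, 12) returns 8, so running on 8 processors divides the lattice evenly while 12 would not.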