Parallelization of MLFMA
We present fast and accurate solutions of large-scale scattering problems involving three-dimensional closed conductors
with arbitrary shapes using the multilevel fast multipole algorithm (MLFMA). With an efficient parallelization of MLFMA,
scattering problems that are discretized with tens of millions of unknowns are easily solved on a cluster of computers. We
extensively investigate the parallelization of MLFMA, identify the bottlenecks, and provide remedial procedures to improve the
efficiency of the implementations. The accuracy of the solutions is demonstrated on a scattering problem involving a sphere of
radius 110 wavelengths discretized with 41,883,638 unknowns, the largest integral-equation problem solved to date. In addition to canonical problems, we also present the solution of real-life problems involving complicated targets with large dimensions.
Tree size and the number of near-field interactions for the solutions of the sphere problems using a top-down strategy to construct the multilevel tree.
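The top-down construction of the multilevel tree referred to above can be sketched as a recursive octree subdivision: a cubic box enclosing the whole object is divided into eight children, empty boxes are pruned, and subdivision stops at a prescribed smallest box size. The point-based discretization and all names below are illustrative assumptions, not the authors' implementation.

```python
# Minimal top-down octree clustering sketch (illustrative only).
# Points stand in for discretization elements (e.g., basis functions).

def build_octree(points, center, half_size, min_half_size):
    """Recursively subdivide a cubic box into 8 children, pruning empty boxes."""
    node = {"center": center, "half_size": half_size, "children": [], "points": points}
    if half_size <= min_half_size or len(points) <= 1:
        return node  # leaf box: smallest allowed size, or trivially populated
    # Assign each point to an octant by its position relative to the box center.
    octants = {}
    for p in points:
        key = tuple(int(p[d] >= center[d]) for d in range(3))
        octants.setdefault(key, []).append(p)
    # Only non-empty octants become children (empty boxes are pruned).
    for key, pts in octants.items():
        child_center = tuple(
            center[d] + (half_size / 2) * (1 if key[d] else -1) for d in range(3)
        )
        node["children"].append(
            build_octree(pts, child_center, half_size / 2, min_half_size)
        )
    return node

def count_levels(node):
    """Depth of the tree, i.e., the number of levels produced by the subdivision."""
    if not node["children"]:
        return 1
    return 1 + max(count_levels(c) for c in node["children"])
```

Because empty boxes are discarded during the descent, the resulting tree size depends on the geometry, which is why the table above reports it per problem.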
Major parts of MLFMA and their computational requirements.
Communications performed in each MVM to match the near-field and far-field partitioning schemes.
All-to-all communications performed at LoD to change the far-field partitioning scheme from the distributed levels to the shared levels.
Interpolations in the shared levels involving one-to-one communications.
Processor pairing for the translations in the distributed levels.
Anterpolations (transpose interpolations) in the shared levels involving one-to-one communications.
Communications required in the matrix-vector multiplications by parallel MLFMA.
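The processor pairing mentioned above can be illustrated with one common scheme for pairwise exchanges: at stage s, process p communicates with partner p XOR s, so that every pair of processes meets exactly once when the number of processes is a power of two. This is a generic scheme offered as a hedged sketch, not necessarily the pairing used in the paper.

```python
# Illustrative pairwise-exchange schedule for the translations in the
# distributed levels. At stage s, process p is paired with p XOR s.
# Valid as a perfect matching only when num_procs is a power of two.

def pairing_schedule(num_procs):
    """Return, per stage, the list of (p, partner) pairs with p < partner."""
    schedule = []
    for stage in range(1, num_procs):
        pairs = []
        for p in range(num_procs):
            q = p ^ stage  # XOR pairing: symmetric, so each pair appears once
            if p < q:
                pairs.append((p, q))
        schedule.append(pairs)
    return schedule
```

With P processes this yields P-1 stages, and in each stage all processes are busy simultaneously, which is the property that makes such schedules attractive for the one-to-one communications listed in the captions above.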
Parallelization efficiency for the solution of a scattering problem involving a sphere of radius 24 wavelengths discretized with 2,111,952 unknowns.
Processing time and parallelization efficiency for various categorized parts of the MVMs for the solution of a scattering problem involving a sphere of radius 24 wavelengths discretized with 2,111,952 unknowns.
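Parallelization efficiency in such measurements is commonly defined as speedup divided by the number of processes, taken relative to a reference run. The exact definition used in the paper may differ; the sketch below shows the standard relative form under that assumption.

```python
# Common definition of parallelization efficiency relative to a reference
# run with p_ref processes taking t_ref seconds (illustrative; the paper
# may use a specific variant of this definition).

def efficiency(t_ref, p_ref, t_p, p):
    """Efficiency of a run with p processes taking t_p seconds."""
    return (p_ref * t_ref) / (p * t_p)
```

An efficiency of 1.0 corresponds to ideal (linear) scaling; values below 1.0 reflect the communication and load-balancing overheads analyzed in the captions above.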
Solutions of large sphere problems with MLFMA parallelized into 16 processes.
Time diagrams for the solution of a scattering problem involving a sphere of radius 80 wavelengths discretized with 23,405,664 unknowns. (a) Overall time includes the input and clustering parts, calculation of the translation matrices, calculation of the near-field interactions, calculation of the radiation
and receiving patterns, and the iterative solution. (b) Matrix-vector multiplications include the near-field stage, aggregation in the distributed levels, all-to-all communications in the level of distribution (LoD), aggregation in the shared levels, translations without communications, translations with communications, disaggregation in the shared levels, and disaggregation in the
distributed levels followed by the receiving operation. In the diagrams, white areas correspond to waits before the operations that require synchronization.
For more information:
How to cite this paper:
- Please see the full article published in the IEEE Transactions on Antennas and Propagation in 2008.
- Ö. Ergül and L. Gürel, "Efficient parallelization of the multilevel fast multipole algorithm for the solution of large-scale scattering problems," IEEE Trans. Antennas Propagat., vol. 56, no. 8, pp. 2335-2345, Aug. 2008.