Research > Hierarchical Parallelization
Hierarchical Parallelization of MLFMA
We developed a novel hierarchical partitioning strategy for the efficient parallelization of the multilevel fast multipole algorithm (MLFMA) on distributed-memory architectures to solve large-scale problems in electromagnetics. Unlike previous parallelization techniques, the tree structure of MLFMA is distributed among processors by partitioning both clusters and samples of fields at each level. Due to the improved load-balancing, the hierarchical strategy offers a higher parallelization efficiency than previous approaches, especially when the number of processors is large. We demonstrate the improved efficiency on scattering problems discretized with millions of unknowns. In addition, we present the effectiveness of our algorithm by solving very large scattering problems involving a conducting sphere of radius 210 wavelengths and a complicated real-life target with a maximum dimension of 880 wavelengths. Both of the objects are discretized with more than 200 million unknowns.
Various strategies for partitioning of tree structure of MLFMA (a) Simple partitioning, where clusters are distributed in all levels (b, c) Hybrid partitioning with shared and distributed levels (d) Hierarchical partitioning
Distribution of a four-level tree structure among eight processors using the hierarchical partitioning strategy.
Aggregation operations from level 3 to level 4 for the partitioned tree structure.
Various strategies for partitioning of tree structure of MLFMA (a) Simple partitioning, where clusters are distributed in all levels (b, c) Hybrid partitioning with shared and distributed levels (d) Hierarchical partitioning.
Parallelization efficiency with respect to the number of processors for the solution of a scattering problem involving a sphere of radius 20 lambda discretized with 1,462,854 unknowns. (a) Overall efficiency including setup and iterations, when the solution is parallelized by using simple, hybrid, and hierarchical techniques. (b) Efficiencies for MLFMA stages, i.e., aggregation, translation, and disaggregation, using the hierarchical technique.
Parallelization efficiency for the solution of scattering problems involving (a) a sphere of radius 40 lambda discretized with 5,851,416 unknowns and (b) a sphere of radius 60 lambda discretized with 13,278,096 unknowns. Parallel efficiency is defined with respect to two and four processors, respectively.
For more information: