Separable projection integrals for higher-order correlators of the cosmic microwave sky: Acceleration by factors exceeding 100

Title: Separable projection integrals for higher-order correlators of the cosmic microwave sky: Acceleration by factors exceeding 100
Authors: Briggs, JP; Pennycook, SJ; Fergusson, JR; Jäykkä, J; Shellard, EPS
Publisher Information: Elsevier; https://doi.org/10.1016/j.jcp.2016.01.019
Publication Year: 2016
Collection: Apollo - University of Cambridge Repository
Subject Terms: 4006 Communications Engineering; 40 Engineering
Description: © 2016. We present a case study describing efforts to optimise and modernise "Modal", the simulation and analysis pipeline used by the Planck satellite experiment for constraining general non-Gaussian models of the early universe via the bispectrum (or three-point correlator) of the cosmic microwave background radiation. We focus on one particular element of the code: the projection of bispectra from the end of inflation to the spherical shell at decoupling, which defines the CMB we observe today. This code involves a three-dimensional inner product between two functions, one of which requires an integral, on a non-rectangular domain containing a sparse grid. We show that by employing separable methods this calculation can be reduced to a one-dimensional summation plus two integrations, reducing the overall dimensionality from four to three. The introduction of separable functions also solves the issue of the non-rectangular sparse grid. This separable method can become unstable in certain scenarios and so the slower non-separable integral must be calculated instead. We present a discussion of the optimisation of both approaches. We demonstrate significant speed-ups of ≈100×, arising from a combination of algorithmic improvements and architecture-aware optimisations targeted at improving thread and vectorisation behaviour. The resulting MPI/OpenMP hybrid code is capable of executing on clusters containing processors and/or coprocessors, with strong-scaling efficiency of 98.6% on up to 16 nodes. We find that a single coprocessor outperforms two processor sockets by a factor of 1.3× and that running the same code across a combination of both microarchitectures improves performance-per-node by a factor of 3.38×. By making bispectrum calculations competitive with those for the power spectrum (or two-point correlator) we are now able to consider joint analysis for cosmological science exploitation of new data.
This research is supported by an STFC consolidated grant ST/L000636/1, and funded in part by the Intel® ...
Document Type: article in journal/newspaper
File Description: application/pdf
Language: English
Relation: J. Briggs et al. Journal of Computational Physics (2016). DOI:10.1016/j.jcp.2016.01.019; https://www.repository.cam.ac.uk/handle/1810/253535
Availability: https://www.repository.cam.ac.uk/handle/1810/253535
Rights: Attribution 4.0 International ; https://creativecommons.org/licenses/by/4.0/
Accession Number: edsbas.97BC2FF5
Database: BASE
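
Illustrative note: the abstract's central idea, that a separable integrand lets a multi-dimensional integral collapse into products of one-dimensional integrals, can be sketched with a toy example. This is not the Modal code or the authors' algorithm. It assumes a simple rectangular domain [0, 1]^3 (the paper's domain is non-rectangular), a plain trapezoid rule, and a made-up two-term integrand; the names `TERMS`, `integrand`, `integrate_3d_direct` and `integrate_3d_separable` are hypothetical.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoid rule for a 1D integral of f over [a, b], n intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def integrate_3d_direct(f, a, b, n):
    """Brute-force tensor-product trapezoid rule on a cube: O(n^3) evaluations."""
    h = (b - a) / n
    pts = [a + i * h for i in range(n + 1)]
    w = [0.5 if i in (0, n) else 1.0 for i in range(n + 1)]  # endpoint weights
    total = 0.0
    for i, x in enumerate(pts):
        for j, y in enumerate(pts):
            for k, z in enumerate(pts):
                total += w[i] * w[j] * w[k] * f(x, y, z)
    return total * h ** 3

def integrate_3d_separable(terms, a, b, n):
    """Integrate sum_t c_t * X_t(x) * Y_t(y) * Z_t(z): three 1D integrals per term."""
    total = 0.0
    for c, fx, fy, fz in terms:
        total += (c * trapezoid(fx, a, b, n)
                    * trapezoid(fy, a, b, n)
                    * trapezoid(fz, a, b, n))
    return total

# Hypothetical separable expansion: (coefficient, X, Y, Z) per term.
TERMS = [
    (1.0, math.sin, math.cos, math.exp),
    (0.5, math.exp, math.sin, math.cos),
]

def integrand(x, y, z):
    """The same function written as an ordinary 3D integrand."""
    return sum(c * fx(x) * fy(y) * fz(z) for c, fx, fy, fz in TERMS)

direct = integrate_3d_direct(integrand, 0.0, 1.0, 40)
separable = integrate_3d_separable(TERMS, 0.0, 1.0, 40)
exact = 1.5 * (1.0 - math.cos(1.0)) * math.sin(1.0) * (math.e - 1.0)

print(f"direct={direct:.8f} separable={separable:.8f} exact={exact:.8f}")
```

On the cube the two routines agree to rounding error, because the tensor-product quadrature of a separable term factorises exactly; the separable version needs on the order of n function evaluations per term instead of n^3. The speed-ups reported in the abstract also depend on threading and vectorisation work that this sketch does not attempt to reproduce.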