|
[1]
|
Sarah Huber, Yasunori Futamura, Martin Galgon, Akira Imakura, Bruno Lang, and
Tetsuya Sakurai.
Flexible subspace iteration with moments for an effective contour
integration-based eigensolver.
Numer. Linear Algebra Appl., 29(6):e2447:1--15, December 2022.
[ DOI ]
Contour integration schemes are a valuable tool for
the solution of difficult interior eigenvalue
problems. However, the solution of many large linear
systems with multiple right hand sides may prove a
prohibitive computational expense. The number of right
hand sides, and thus, computational cost may be
reduced if the projected subspace is created using
multiple moments. In this work, we explore heuristics
for the choice and application of moments with respect
to various other important parameters in a contour
integration scheme. We provide evidence for the
expected performance, accuracy, and robustness of
various schemes, showing that good heuristic choices
can provide a scheme featuring good properties in all
three of these measures.
|
|
[2]
|
Christie L. Alappat, Andreas Alvermann, Achim Basermann, Holger Fehske,
Yasunori Futamura, Martin Galgon, Georg Hager, Sarah Huber, Akira Imakura,
Masatoshi Kawai, Moritz Kreutzer, Bruno Lang, Kengo Nakajima, Melven
Röhrig-Zöllner, Tetsuya Saturai, Faisal Shahzad, Jonas Thies, and
Gerhard Wellein.
ESSEX: Equipping sparse solvers for exascale.
In Hans-Joachim Bungartz, Severin Reiz, Benjamin Uekermann, Philipp
Neumann, and Wolfgang E. Nagel, editors, Software for Exascale Computing
-- SPPEXA 2016--2019, volume 136 of LNCSE, pages 143--187. Springer
Nature Switzerland AG, Cham, 2020.
[ DOI ]
The ESSEX project has investigated programming
concepts, data structures, and numerical algorithms
for scalable, efficient, and robust sparse eigenvalue
solvers on future heterogeneous exascale systems.
Starting without the burden of legacy code, a holistic
performance engineering process could be deployed
across the traditional software layers to identify
efficient implementations and guide sustainable
software development.
At the basic building blocks level, a flexible MPI+X
programming approach was implemented together with a
new sparse data structure (SELL-C-σ) to
support heterogeneous architectures by design.
Furthermore, ESSEX focused on hardware-efficient
kernels for all relevant architectures and efficient
data structures for block vector formulations of the
eigensolvers. The algorithm layer addressed standard,
generalized, and nonlinear eigenvalue problems and
provided some widely usable solver implementations
including a block Jacobi–Davidson algorithm,
contour-based integration schemes, and filter
polynomial approaches. Adding to the highly efficient
kernel implementations, algorithmic advances such as
adaptive precision, optimized filtering coefficients,
and preconditioning have further improved time to
solution. These developments were guided by quantum
physics applications, especially from the field of
topological insulator- or graphene-based systems. For
these, ScaMaC, a scalable matrix generation framework
for a broad set of quantum physics problems, was
developed. As the central software core of ESSEX, the
PHIST library for sparse systems of linear equations
and eigenvalue problems has been established. It
abstracts algorithmic developments from low-level
optimization. Finally, central ESSEX software
components and solvers have demonstrated scalability
and hardware efficiency on up to 256 K cores using
million-way process/thread-level parallelism.
|
|
[3]
|
Andreas Alvermann, Achim Basermann, Hans-Joachim Bungartz, Christian Carbogno,
Dominik Ernst, Holger Fehske, Yasunori Futamura, Martin Galgon, Georg Hager,
Sarah Huber, Thomas Huckle, Akihiro Ida, Akira Imakura, Masatoshi Kawai,
Simone Köcher, Moritz Kreutzer, Pavel Kus, Bruno Lang, Hermann Lederer,
Valeriy Manin, Andreas Marek, Kengo Nakajima, Lydia Nemec, Karsten Reuter,
Michael Rippl, Melven Röhrig-Zöllner, Tetsuya Sakurai, Matthias
Scheffler, Christoph Scheurer, Faisal Shahzad, Danilo Simoes Brambila,
Jonas Thies, and Gerhard Wellein.
Benefits from using mixed precision computations in the ELPA-AEO
and ESSEX-II eigensolver projects.
Japan J. Indust. Appl. Math., 36(2):699--717, 2019.
[ DOI ]
We first briefly report on the status and recent
achievements of the ELPA-AEO (Eigenvalue Solvers for
Petaflop Applications--Algorithmic Extensions and
Optimizations) and ESSEX II (Equipping Sparse Solvers
for Exascale) projects. In both collaboratory efforts,
scientists from the application areas, mathematicians,
and computer scientists work together to develop and
make available efficient highly parallel methods for
the solution of eigenvalue problems. Then we focus on
a topic addressed in both projects, the use of mixed
precision computations to enhance efficiency. We give
a more detailed description of our approaches for
benefiting from either lower or higher precision in
three selected contexts and of the results thus
obtained.
|
|
[4]
|
Martin Galgon, Sarah Huber, and Bruno Lang.
Mixed precision in subspace iteration-based eigensolvers.
Proc. Appl. Math. Mech., 18:e201800334, December 2018.
[ DOI ]
We consider the use of mixed precision in three
subspace iteration-based eigensolvers. We show that
for several algorithmic steps, computation in single
precision will temporarily limit convergence.
|
|
[5]
|
Lukas Krämer and Bruno Lang.
Convergence of integration-based methods for the solution of standard
and generalized Hermitian eigenvalue problems.
Electron. Trans. Numer. Anal., 48:183--201, 2018.
[ DOI ]
Recently, methods based on spectral projection and
numerical integration have been proposed in the
literature as candidates for reliable high-performance
eigenvalue solvers. The key ingredients of this type
of eigenvalue solver are a Rayleigh-Ritz process and a
routine to compute an approximation to the desired
eigenspace. The latter computation can be performed by
numerical integration of the resolvent. In this
article we investigate the progress of the
Rayleigh-Ritz process and the achievable quality of
the computed eigenpairs for the case that an upper
bound for the normwise difference between the
currently used subspace and the desired eigenspace is
available. Then, such bounds are derived for the
Gauß-Legendre rule and the trapezoidal rule.
|
|
[6]
|
Martin Galgon, Lukas Krämer, and Bruno Lang.
Improving projection-based eigensolvers via adaptive techniques.
Numer. Linear Algebra Appl., 25(1):e2124, January 2018.
[ DOI ]
We consider subspace iteration (or projection-based)
algorithms for computing those eigenvalues (and
associated eigenvectors) of a Hermitian matrix that
lie in a prescribed interval. For the case that the
projector is approximated with polynomials, we present
an adaptive strategy for selecting the degree of these
polynomials such that convergence is achieved with
near-to-optimum overall work without detailed a priori
knowledge about the eigenvalue distribution. The idea
is then transferred to the approximation of the
projector by numerical integration, which corresponds
to FEAST algorithm proposed by E. Polizzi in 2009.
[E. Polizzi: Density-matrix-based algorithm for
solving eigenvalue problems. Phys. Rev. B 2009;
79:115112]. Here, our adaptation controls the number
of integration nodes. We also discuss the interaction
of the method with search space reduction methods.
|
|
[7]
|
Martin Galgon, Lukas Krämer, Bruno Lang, Andreas Alvermann, Holger
Fehske, Andreas Pieper, Georg Hager, Moritz Kreutzer, Faisal Shahzad, Gerhard
Wellein, Achim Basermann, Melven Röhrig-Zöllner, and Jonas Thies.
Improved coefficients for polynomial filtering in ESSEX.
In Tetsuya Sakurai, Shao-Liang Zhang, Toshiyuki Imamura, Yusaku
Yamamoto, Yoshinobu Kuramashi, and Takeo Hoshi, editors, Eigenvalue
Problems: Algorithms, Software and Applications in Petascale Computing.
Proc. EPASA 2015, Tsukuba, Japan, September 2015, volume 117 of
LNCSE. Springer International Publishing, Cham, 2017.
[ DOI ]
The ESSEX project is an ongoing effort to provide
exascale-enabled sparse eigensolvers, especially for
quantum physics and related application areas. In this
paper we first briefly summarize some key achievements
that have been made within this project.
Then we focus on a projection-based eigensolver with
polynomial approximation of the projector. This
eigensolver can be used for computing hundreds of
interior eigenvalues of large sparse matrices. We
describe techniques that allow using lower-degree
polynomials than possible with standard Chebyshev
expansion of the window function and kernel smoothing.
With these polynomials, the degree, and thus the
number of matrix--vector multiplications, typically
can be reduced by roughly one half, resulting in
comparable savings in runtime.
|
|
[8]
|
Moritz Kreutzer, Jonas Thies, Melven Röhrig-Zöllner, Andreas
Pieper, Faisal Shahzad, Martin Galgon, Achim Basermann, Holger Fehske, Georg
Hager, and Gerhard Wellein.
GHOST: Building blocks for high performance sparse linear algebra
on heterogeneous systems.
Int. J. Parallel Prog., October 2016.
[ http ]
While many of the architectural details of future
exascale-class high performance computer systems are
still a matter of intense research, there appears to
be a general consensus that they will be strongly
heterogeneous, featuring “standard” as well as
“accelerated” resources. Today, such resources are
available as multicore processors, graphics processing
units (GPUs), and other accelerators such as the Intel
Xeon Phi. Any software infrastructure that claims
usefulness for such environments must be able to meet
their inherent challenges: massive multi-level
parallelism, topology, asynchronicity, and
abstraction. The “General, Hybrid, and Optimized
Sparse Toolkit” (GHOST) is a collection of building
blocks that targets algorithms dealing with sparse
matrix representations on current and future
large-scale systems. It implements the “MPI+X”
paradigm, has a pure C interface, and provides
hybrid-parallel numerical kernels, intelligent
resource management, and truly heterogeneous
parallelism for multicore CPUs, Nvidia GPUs, and the
Intel Xeon Phi. We describe the details of its design
with respect to the challenges posed by modern
heterogeneous supercomputers and recent algorithmic
developments. Implementation details which are
indispensable for achieving high efficiency are
pointed out and their necessity is justified by
performance measurements or predictions based on
performance models. We also provide instructions on
how to make use of GHOST in existing software
packages, together with a case study which
demonstrates the applicability and performance of
GHOST as a component within a larger software stack.
The library code and several applications are
available as open source.
|
|
[9]
|
Andreas Pieper, Moritz Kreutzer, Andreas Alvermann, Martin Galgon, Holger
Fehske, Georg Hager, Bruno Lang, and Gerhard Wellein.
High-performance implementation of Chebyshev filter diagonalization
for interior eigenvalue computations.
J. Comput. Phys., 325:226--243, 2016.
[ DOI ]
We study Chebyshev filter diagonalization as a tool
for the computation of many interior eigenvalues of
very large sparse symmetric matrices. In this
technique the subspace projection onto the target
space of wanted eigenvectors is approximated with
filter polynomials obtained from Chebyshev expansions
of window functions. After the discussion of the
conceptual foundations of Chebyshev filter
diagonalization we analyze the impact of the choice of
the damping kernel, search space size, and filter
polynomial degree on the computational accuracy and
effort, before we describe the necessary steps towards
a parallel high-performance implementation. Because
Chebyshev filter diagonalization avoids the need for
matrix inversion it can deal with matrices and problem
sizes that are presently not accessible with rational
function methods based on direct or iterative linear
solvers. To demonstrate the potential of Chebyshev
filter diagonalization for large-scale problems of
this kind we include as an example the computation of
the 102 innermost eigenpairs of a topological
insulator matrix with dimension 109 derived from
quantum physics applications.
|
|
[10]
|
Moritz Kreutzer, Jonas Thies, Andreas Pieper, Andreas Alvermann, Martin Galgon,
Melven Röhrig-Zöllner, Faisal Shahzad, Achim Basermann, Alan R.
Bishop, Holger Fehske, Georg Hager, Bruno Lang, and Gerhard Wellein.
Performance engineering and energy efficiency of building blocks for
large, sparse eigenvalue computations on heterogeneous supercomputers.
In Hans-Joachim Bungartz, Philipp Neumann, and Wolfgang E. Nagel,
editors, Software for Exascale Computing -- SPPEXA 2013--2015, volume
113 of LNCSE, pages 317--338. Springer, Switzerland, 2016.
[ DOI ]
Numerous challenges have to be mastered as
applications in scientific computing are being
developed for post-petascale parallel systems. While
ample parallelism is usually available in the
numerical problems at hand, the efficient use of
supercomputer resources requires not only good
scalability but also a verifiably effective use of
resources on the core, the processor, and the
accelerator level. Furthermore, power dissipation and
energy consumption are becoming further optimization
targets besides time-to-solution. Performance
Engineering (PE) is the pivotal strategy for
developing effective parallel code on all levels of
modern architectures. In this paper we report on the
development and use of low-level parallel building
blocks in the GHOST library (“General, Hybrid, and
Optimized Sparse Toolkit”). We demonstrate the use of
PE in optimizing a density of states computation using
the Kernel Polynomial Method, and show that reduction
of runtime and reduction of energy are literally the
same goal in this case. We also give a brief overview
of the capabilities of GHOST and the applications in
which it is being used successfully.
|
|
[11]
|
Jonas Thies, Martin Galgon, Faisal Shahzad, Andreas Alvermann, Moritz Kreutzer,
Andreas Pieper, Melven Röhrig-Zöllner, Achim Basermann, Holger
Fehske, Georg Hager, Bruno Lang, and Gerhard Wellein.
Towards an exascale enabled sparse solver repository.
In Hans-Joachim Bungartz, Philipp Neumann, and Wolfgang E. Nagel,
editors, Software for Exascale Computing -- SPPEXA 2013--2015, volume
113 of LNCSE, pages 295--316. Springer, Switzerland, 2016.
[ DOI ]
As we approach the exascale computing era, disruptive
changes in the software landscape are required to
tackle the challenges posed by manycore CPUs and
accelerators. We discuss the development of a new
`exascale enabled’ sparse solver repository (the
ESSR) that addresses these challenges---from
fundamental design considerations and development
processes to actual implementations of some
prototypical iterative schemes for computing
eigenvalues of sparse matrices. Key features of the
ESSR include holistic performance engineering, tight
integration between software layers and mechanisms to
mitigate hardware failures.
|
|
[12]
|
Martin Galgon, Lukas Krämer, Jonas Thies, Achim Basermann, and Bruno
Lang.
On the parallel iterative solution of linear systems arising in the
FEAST algorithm for computing inner eigenvalues.
Parallel Comput., 49:153--163, 2015.
[ DOI ]
Methods for the solution of sparse eigenvalue problems
that are based on spectral projectors and contour
integration have recently attracted more and more
attention. Such methods require the solution of many
shifted linear systems of full size. In most of the
literature concerning these eigenvalue solvers, only
few words are said on the solution of the linear
systems, but they turn out to be very hard to solve by
iterative linear solvers in practice. In this work we
identify a row projection method for the solution of
the inner linear systems encountered in the FEAST
algorithm and introduce a novel hybrid parallel and
fully iterative implementation of the eigenvalue
solver. Our approach ultimately aims at achieving
extreme parallelism by exploiting the algorithm's
potential on several levels. We present numerical
examples where graphene modeling is one of the target
applications. In this application, several hundred or
even thousands of eigenvalues from the interior of the
spectrum are required, which is a big challenge for
state-of-the-art numerical methods.
|
|
[13]
|
Martin Galgon, Lukas Krämer, Bruno Lang, Andreas Alvermann, Holger
Fehske, and Andreas Pieper.
Improving robustness of the FEAST algorithm and solving eigenvalue
problems from graphene nanoribbons.
Proc. Appl. Math. Mech., 14(1):821--822, December 2014.
[ DOI ]
We consider the FEAST eigensolver, introduced by
Polizzi in 2009 [5]. We describe an improvement
concerning the reliability of the algorithm and
discuss an application in the solution of eigenvalue
problems from graphene modeling.
|
|
[14]
|
Andreas Alvermann, Achim Basermann, Holger Fehske, Martin Galgon, Georg Hager,
Moritz Kreutzer, Lukas Krämer, Bruno Lang, Andreas Pieper, Melven
Röhrig-Zöllner, Faisal Shahzad, Jonas Thies, and Gerhard Wellein.
ESSEX: Equipping sparse solvers for exascale.
In Luís Lopes et al., editors, Euro-Par 2014: Parallel
Processing Workshops, volume 8806 of LNCS, pages 577--588. Springer,
2014.
[ DOI ]
The ESSEX project investigates computational issues
arising at exascale for large-scale sparse eigenvalue
problems and develops programming concepts and
numerical methods for their solution. The project
pursues a coherent co-design of all software layers
where a holistic performance engineering process
guides code development across the classic boundaries
of application, numerical method, and basic kernel
library. Within ESSEX the numerical methods cover
widely applicable solvers such as classic Krylov,
Jacobi-Davidson, or the recent FEAST methods, as well
as domain-specific iterative schemes relevant for the
ESSEX quantum physics application. This report
introduces the project structure and presents selected
results which demonstrate the potential impact of
ESSEX for efficient sparse solvers on highly scalable
heterogeneous supercomputers.
|