10th International Workshop on Parallel Matrix Algorithms and Applications
June 27-29, 2018 // ETH Zurich // Zurich, Switzerland


The scientific program of PMAA18 consists of plenary speakers, participant-organized minisymposia, and contributed presentations.


A minisymposium consists of four (or eight) talks with a common technical theme. Minisymposium organizers are responsible for securing the participation of their speakers and for collecting and forwarding initial speaker and talk title information, on which minisymposium selection will be based. Once minisymposia are approved, speakers may use the same web-based automated submission procedure as other presenters to enter their finalized titles, abstracts, author and affiliation data, and speaker designation data (in the case of multiply-authored papers).

Please see these instructions for more details on minisymposium proposal submissions.

Contributed Presentations

A contributed talk is a 25-minute presentation, which will be grouped by the organizers with other contributed talks into the two-hour parallel sessions each morning and afternoon of the conference.

Presenters are asked to submit their title, abstract (600 words or less), author and affiliation data, and speaker designation data (in the case of multiply-authored papers) using the abstract submission procedure.

The deadline for fullest consideration is March 30, 2018. Submitters will be notified of acceptance no later than April 20, 2018. If an earlier notification is required for visa purposes for participants from countries with restricted access to Switzerland, please e-mail your submission to as soon as possible, and special attention will be given.

Keynote Speakers

Dr. Chao Yang

Lawrence Berkeley National Laboratory, USA

Scalable Eigensolver with Applications in Computational Physics and Chemistry

Solving the quantum many-body problem efficiently and accurately is one of the biggest challenges in computational physics and chemistry. There are broadly two approaches to seeking an approximate solution to this high-dimensional eigenvalue problem. One relies on projecting the many-body Hamiltonian onto a carefully chosen subspace of many-body basis functions. The other relies on constructing an effective mean-field model to capture the essential many-body physics that governs the interaction among different particles. These approaches yield algebraic eigenvalue problems that have different characteristics. Developing efficient computational schemes to tackle these problems on massively parallel computers requires choosing appropriate data structures to represent both the discretized Hamiltonian and the eigenvector to be computed, mapping such data structures onto a distributed-memory multi-core processor grid, exploiting multiple threads within a computational node, and improving the scalability of the computation by generating multiple levels of concurrency and reducing communication overhead. In this talk, I will give an overview of recent progress in these areas and point out the remaining challenges.

Prof. Dr. Andreas Stathopoulos

College of William & Mary, Williamsburg VA, USA

Does machine learning need the power of iterative methods for the SVD?

Machine learning has emerged as one of the primary clients for large-scale singular value calculations. Applications include clustering, recommendation systems, factor models in econometrics, and large-scale kernel methods. The matrices can be very large, although they are not always sparse. In some cases, one or two of the largest or smallest singular triplets are needed, while in other cases, a low-rank approximation to the matrix is needed. One particular difference between these applications and traditional PDE applications is that low accuracy (typically 1-2 relative digits) is sufficient.
To solve the SVD problem on large matrices, practitioners have traditionally turned to iterative methods such as Lanczos bidiagonalization or the restarted and preconditioned variants based on Davidson and LOBPCG. But for the specific requirements in machine learning, two different classes of methods are becoming increasingly popular. One class comprises randomized SVD methods, which focus on choosing an appropriately sized initial space that yields the desired space with minimal iterations. The second class comprises streaming methods, where the matrix is accessed in its entirety but only once.
In this talk we address the question of what problems are best suited for what type of methods, and present a unified view of randomized and iterative methods that is helpful both for developing and for using SVD software.

Prof. Dr. Frédéric Nataf

Université Pierre et Marie Curie, Paris, France

Domain Decomposition Methods: Theory and Applications

Domain decomposition methods are a popular way to solve large linear systems on parallel architectures. These methods are based on a divide-and-conquer strategy. At each step of the algorithm, a problem is solved concurrently in each subdomain and then interface data are exchanged between neighboring subdomains. These are coarse-grain algorithms since they are based on local volume computation and only surface data movements. Thanks to their very good ratio of local computation to data movement, they are naturally well adapted to modern computer architectures.
However, the original Schwarz method is slow. When implemented with a minimal overlap, it amounts to a block Jacobi method. Its convergence rate can be improved by using more generous overlaps and modifying the local blocks. In order to reach scalability when the number of subdomains is large, a second level is introduced. At each step of the algorithm, a coarse problem with one or a few unknowns per subdomain is used to further coordinate the solution between the subdomains globally. Theoretical results and numerical investigations (over a billion unknowns) for porous media flows and linear elasticity equations confirm robustness with respect to heterogeneous coefficients, automatic (non-regular) partitions into subdomains, and nearly incompressible behavior. Numerical results for large-scale harmonic wave propagation phenomena will be shown. These results are obtained via an implementation in a Domain Specific Language devoted to the finite element method.
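
For readers unfamiliar with the remark that a Schwarz method with minimal overlap amounts to block Jacobi, the following is a minimal illustrative sketch and is not taken from the talk. It assumes the 1D Poisson matrix tridiag(-1, 2, -1) as a model problem and splits the unknowns into contiguous, non-overlapping subdomains. The function names (`thomas_solve`, `block_jacobi_poisson`, `residual_norm`) are hypothetical. Each sweep solves every subdomain problem independently, using only the neighbors' interface values from the previous sweep.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def block_jacobi_poisson(b, num_blocks, sweeps):
    """Block Jacobi sweeps for A x = b with A = tridiag(-1, 2, -1)."""
    n = len(b)
    # Contiguous, non-overlapping subdomains [start, end).
    bounds = [(k * n // num_blocks, (k + 1) * n // num_blocks)
              for k in range(num_blocks)]
    x = [0.0] * n
    for _ in range(sweeps):
        x_new = [0.0] * n
        for start, end in bounds:
            # Local right-hand side: b minus the coupling to neighboring
            # subdomains, evaluated with the previous iterate (interface data).
            rhs = b[start:end]
            if start > 0:
                rhs[0] += x[start - 1]
            if end < n:
                rhs[-1] += x[end]
            m = end - start
            # Independent local solve; these could run concurrently.
            x_new[start:end] = thomas_solve([-1.0] * m, [2.0] * m,
                                            [-1.0] * m, rhs)
        x = x_new
    return x


def residual_norm(x, b):
    """Euclidean norm of b - A x for A = tridiag(-1, 2, -1)."""
    n = len(b)
    total = 0.0
    for i in range(n):
        ax = 2.0 * x[i]
        if i > 0:
            ax -= x[i - 1]
        if i < n - 1:
            ax -= x[i + 1]
        total += (b[i] - ax) ** 2
    return total ** 0.5
```

In this sketch the residual shrinks only slowly as the number of subdomains grows, which is consistent with the abstract's motivation for larger overlaps and a second (coarse) level.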

Prof. Dr. Edgar Solomonik

University of Illinois at Urbana-Champaign, USA

Scalable Tensor Algorithms for Scientific Computing

Matrix and tensor eigenvalue computations consist of sequences of rectangular QR factorizations and (sparse) tensor contractions (matrix products). We present results in improving the communication cost of these building blocks and show that they are optimal with respect to lower bounds. These algorithmic techniques show performance improvement across a range of parallel architectures. We make available distributed-memory implementations of these sparse and dense tensor algebra routines via the Cyclops library. We highlight the application of this C++/Python library to high-accuracy chemistry, quantum circuit simulation, and graph analysis.


A minisymposium consists of four (25+5)-minute presentations, meaning a talk of 25 minutes with an additional five minutes for discussion after each presentation.

Prospective minisymposium organizers should submit a short proposal for the minisymposium by email to (Deadline for minisymposium proposal: April 8, 2018), and each accepted minisymposium speaker should submit a 600-word abstract (in a text file) through the abstract submission web page (Deadline for abstract submission April 20, 2018).

These will be reviewed by the scientific committee. The number of minisymposia may be limited to retain an acceptable level of parallelism in the conference sessions.

Minisymposia proposals should be e-mailed to by April 8, 2018 to receive fullest consideration. Selections will be announced within two days after this deadline.

Proposals consist of:

  1. over-arching title,
  2. organizer(s) with affiliations and e-mails,
  3. abstract (600 words or less) of overall minisymposium,
  4. list of speakers with affiliations and e-mails,
  5. tentative titles of each speaker's talk.

Proposals should be in uncompressed plaintext, PDF, or DOC format. Minisymposia are allowed two or four hours each, which corresponds to four or eight (25+5)-minute talks.

Accepted Minisymposia

  • Parallelization aspects of SVD and EVD computations (5 speakers)
    M. Vajteršic, G. Okša
  • Efficient dense eigensolvers - Methods and applications (8)
    Th. Huckle, T. Imamura, B. Lang
  • High performance accurate computing (9)
    H. Hasegawa
  • Parallel eigenvalue solvers for large scale problems (8)
    E. Di Napoli, P. Arbenz
  • Scalable communication-reducing Krylov subspace methods (4)
    S. Cools
  • Recent advances in parallel sparse direct solvers (4)
    E. G. Ng
  • Resilience in scientific computing (4)
    E. Agullo, L. Giraud, K. Teranishi
  • Krylov and regularization methods for large scale inverse problems (4)
    N. Schenkels, W. Vanroose
  • Task-based programming for scientific computing (8)
    E. Agullo, A. Buttari, G. Bosilca
  • Parallel-in-time methods for HPC (4)
    M. Bolten, R. Krause, R. Speck