Reproducibility and Accuracy for High-Performance Computing - Université Pierre et Marie Curie
Conference paper, Year: 2015

Reproducibility and Accuracy for High-Performance Computing

Abstract

On modern multi-core, many-core, and heterogeneous architectures, floating-point computations, especially reductions, can become non-deterministic, and therefore non-reproducible, mainly because floating-point operations are non-associative. We introduce an approach that computes correctly rounded sums of large floating-point vectors accurately and efficiently, achieving deterministic results by construction. Our multi-level algorithm consists of two main stages: a filtering stage that relies on fast vectorized floating-point expansions, and an accumulation stage based on superaccumulators in a high-radix carry-save representation. We extend this approach to the dot product and to matrix-matrix multiplication. In this talk, I will present the reproducible and accurate (rounded to nearest) algorithms for summation, dot product, and matrix-matrix multiplication, as well as their implementations in parallel environments such as Intel server CPUs, Intel Xeon Phi, and both NVIDIA and AMD GPUs. I will show that the performance of our algorithms is comparable to that of the standard implementations.
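The two-stage idea described in the abstract can be illustrated with a minimal, sequential Python sketch. It is not the authors' implementation: the function names (`two_sum`, `to_fixed`, `reproducible_sum`), the expansion size of 4, and the use of a Python arbitrary-precision integer as a stand-in for the high-radix carry-save superaccumulator are all assumptions made for illustration. The sketch also assumes finite inputs, no intermediate overflow, and CPython's correctly rounded integer division.

```python
import math
import random

SCALE_BITS = 1074  # 2**-1074 is the smallest positive double (assumes IEEE 754 binary64)


def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    return s, (a - a_virtual) + (b - b_virtual)


def naive_sum(values):
    """Plain left-to-right accumulation; the result depends on the order of the values."""
    total = 0.0
    for x in values:
        total += x
    return total


def to_fixed(x):
    """Convert a finite double to an exact integer multiple of 2**-SCALE_BITS."""
    n, d = x.as_integer_ratio()  # d is a power of two no larger than 2**1074
    return n * ((1 << SCALE_BITS) // d)


def reproducible_sum(values, expansion_size=4):
    """Order-independent, correctly rounded sum (illustrative sketch only)."""
    expansion = [0.0] * expansion_size  # filtering stage: a small floating-point expansion
    superacc = 0                        # accumulation stage: exact fixed-point integer
    for x in values:
        # Push x through the expansion; each TwoSum keeps the rounded part
        # in the expansion and passes the exact rounding error onward.
        for i in range(expansion_size):
            expansion[i], x = two_sum(expansion[i], x)
            if x == 0.0:
                break
        if x != 0.0:
            # The expansion could not absorb x: flush the residual exactly.
            superacc += to_fixed(x)
    for term in expansion:
        superacc += to_fixed(term)
    # Integer true division performs the single, final rounding to nearest.
    return superacc / (1 << SCALE_BITS)


if __name__ == "__main__":
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False: addition is not associative
    values = [1e16, 1.0, -1e16, 1.0]
    print(naive_sum(values))         # 1.0
    print(reproducible_sum(values))  # 2.0
    random.shuffle(values)
    print(reproducible_sum(values) == math.fsum(values))  # same result in any order
```

Because every `two_sum` call is error-free and the fixed-point accumulator is exact, the only rounding happens once at the end, which is why the result does not depend on the summation order. A real implementation would replace the Python integer with a fixed-size superaccumulator and vectorize the expansion updates.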
raim2015.pdf (1.22 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01140531, version 1 (10-04-2015)

Identifiers

  • HAL Id: hal-01140531, version 1

Cite

Roman Iakymchuk, Caroline Collange, David Defour, Stef Graillat. Reproducibility and Accuracy for High-Performance Computing. RAIM: Rencontres Arithmétiques de l’Informatique Mathématique, Apr 2015, Rennes, France. ⟨hal-01140531⟩
