Cavelan, Aurélien and Fang, Aiman and Chien, Andrew A. and Robert, Yves. (2017) Resilient N-Body Tree Computations with Algorithm-Based Focused Recovery: Model and Performance Analysis. In: High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation. PMBS 2017, 10724.
PDF - Accepted Version (5Mb)
Official URL: http://edoc.unibas.ch/56464/
Abstract
This paper presents a model and performance study for Algorithm-Based Focused Recovery (ABFR) applied to N-body computations, subject to latent errors. We make a detailed comparison with the classical Checkpoint/Restart (CR) approach. While the model applies to general frameworks, the performance study is limited to perfect binary trees, due to the inherent difficulty of the analysis. With ABFR, the crucial parameter is the detection interval, which bounds the error latency. We show that the detection interval has a dramatic impact on the overhead, and that optimally choosing its value leads to significant gains over the CR approach.
Faculties and Departments: | 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > High Performance Computing (Ciorba) |
---|---|
UniBasel Contributors: | Cavelan, Aurélien and Ciorba, Florina M. |
Item Type: | Conference or Workshop Item, refereed |
Conference or workshop item Subtype: | Conference Paper |
Publisher: | Springer |
ISBN: | 978-3-319-72970-1 |
e-ISBN: | 978-3-319-72971-8 |
Series Name: | Lecture Notes in Computer Science |
ISSN: | 0302-9743 |
Note: | Publication type according to Uni Basel Research Database: Conference paper |
Language: | English |
Last Modified: | 18 May 2018 12:53 |
Deposited On: | 18 May 2018 12:52 |