Benoit, Anne and Cavelan, Aurélien and Ciorba, Florina M. and Le Fèvre, Valentin and Robert, Yves. (2018) Combining Checkpointing and Replication for Reliable Execution of Linear Workflows. In: 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops. pp. 793-802.
Full text not available from this repository.
Official URL: https://edoc.unibas.ch/68674/
Abstract
This report combines checkpointing and replication for the reliable execution of linear workflows. While both methods have been studied separately, their combination has not yet been investigated despite its promising potential to minimize the execution time of linear workflows in failure-prone environments. The combination raises new problems: for each task, we have to decide whether to checkpoint and/or replicate it. We provide an optimal dynamic programming algorithm of quadratic complexity to solve both problems. This dynamic programming algorithm has been validated through extensive simulations that reveal the conditions in which checkpointing only, replication only, or the combination of both techniques leads to improved performance.
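The abstract does not give the recurrence, so the following is only an illustrative sketch of what a quadratic dynamic program over a linear chain can look like, not the algorithm from the paper. It rests on several assumptions that are not in this record: failures are exponentially distributed with rate `lam`; a segment that ends with a checkpoint is restarted from the previous checkpoint after each failure, using the common closed form e^(lam*rec) / lam * (e^(lam*(work + ckpt)) - 1) for its expected time; and replication is chosen per checkpointed segment (the paper decides per task), modeled by the hypothetical parameters `slowdown` (work multiplier) and `lam_rep` (residual failure rate). All function and parameter names are invented for illustration.

```python
import math


def segment_time(work, ckpt, rec, lam):
    """Expected time to complete `work` plus one checkpoint of cost `ckpt`.

    Assumed model: exponential failures of rate `lam`; every failure triggers
    a recovery of cost `rec` followed by a restart of the whole segment.
    """
    if lam == 0:
        return work + ckpt  # failure-free limit of the closed form
    return math.exp(lam * rec) / lam * math.expm1(lam * (work + ckpt))


def plan(weights, ckpt, rec, lam, lam_rep, slowdown):
    """Return (expected makespan, segments) for a chain of task weights.

    Each segment is (first_task, end_task_exclusive, replicated).
    best[j] is the optimal expected time to finish tasks 0..j-1 with a
    checkpoint taken right after task j-1; trying every previous checkpoint
    position i < j gives O(n^2) time overall.
    """
    n = len(weights)
    prefix = [0.0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    best = [0.0] + [math.inf] * n
    choice = [None] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            work = prefix[j] - prefix[i]
            plain = segment_time(work, ckpt, rec, lam)
            repl = segment_time(slowdown * work, ckpt, rec, lam_rep)
            cost, replicated = min((plain, False), (repl, True))
            if best[i] + cost < best[j]:
                best[j] = best[i] + cost
                choice[j] = (i, replicated)

    # Walk the recorded choices backwards to recover the segments.
    segments = []
    j = n
    while j > 0:
        i, replicated = choice[j]
        segments.append((i, j, replicated))
        j = i
    segments.reverse()
    return best[n], segments
```

Under these assumptions, `plan([10, 10, 10, 10], ckpt=1, rec=1, lam=0.1, lam_rep=0.1, slowdown=2)` checkpoints after every task and never replicates, whereas setting `lam_rep=0` and `slowdown=1.5` makes a single replicated segment cheaper. This mirrors, in a toy setting only, the abstract's point that the best strategy depends on the operating conditions.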
Faculties and Departments: | 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > High Performance Computing (Ciorba) |
---|---|
UniBasel Contributors: | Cavelan, Aurélien and Ciorba, Florina M. |
Item Type: | Conference or Workshop Item, refereed |
Conference or workshop item Subtype: | Conference Paper |
Publisher: | IEEE Computer Society |
Note: | Publication type according to Uni Basel Research Database: Conference paper |
Last Modified: | 12 Jun 2020 09:34 |
Deposited On: | 12 Jun 2020 09:34 |