Welcome to the Reading Group in Fault Tolerance for High Performance Computing. This space has a few goals:
The reading group meets every other Wednesday at 4:00pm in room 3102 (Siebel Center).
- September 15, 2010
- October 6, 2010
- October 20, 2010
- October 27, 2010
- November 10, 2010
- March 09, 2011
- March 23, 2011
- April 27, 2011
- May 11, 2011
- May 25, 2011
|Building Fault Survivable MPI Programs with FT-MPI Using Diskless Checkpointing||Z. Chen, G.E. Fagg, E. Gabriel, J. Langou, T. Angskun, G. Bosilca
|A large-scale study of failures in high-performance computing systems||B. Schroeder, G. A. Gibson||PDF
|Predicting Node Failure in High Performance Computing Systems from Failure and Usage Logs||Nithin Nakka, Ankit Agrawal, Alok Choudhary
If you are in charge of presenting a paper for the next session, please follow these steps:
- Prepare a short presentation (5-7 slides) about the paper. Include a summary of the main ideas, highlighting the strong and weak points. Pay attention to the clarity of the description, the contribution, the quality of the experiments, and the impact on future work.
- Go to the "Sessions" page by clicking the link at the bottom of this page.
- Create a page named after the date of the session.
- Copy the template from the "Meeting Page Template" page (linked at the bottom of this page).
- Fill in all the fields of the template.
- Add a link to the page below the title "Sessions" on this page.
Most of the templates for the pages on this website were extracted from the "Parallel Reading Group" website.