Abstract
Time-varying behavior in GPU program vulnerability could be exploited to reduce the overhead of fault-tolerant designs. However, the inherent parallelism of GPU programs and the performance overhead of massive fault injection (FI) have hindered such assessments using FI. NVBitFI, a GPU FI tool featuring high performance and good compatibility, allows time-varying vulnerability to be evaluated with FI within a reasonable time. We extended NVBitFI to control FI tests along the temporal dimension. We present a scalable workflow that characterizes the time-varying vulnerability of GPU programs at two granularities, and we propose a convenient approach to profiling vulnerability against actual GPU time. Results obtained from 60K fault injections demonstrate the feasibility of the proposed methodologies. A case study exploring the improved instruction-level grouping is also presented. More than 340K faults are injected into the vectorAdd kernel to show that the time-varying behavior observed with smaller inputs can potentially be generalized to realistic workloads with large inputs.
| Original language | English |
|---|---|
| Title of host publication | 35th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2022 |
| Volume | 2022- |
| DOIs | |
| State | Published - 2022 |