Toward predictable, efficient, system-level tolerance of transient faults
Conference Paper