All-Programmable System-on-Chips (APSoCs) are a compelling option for deploying applications in radiation environments thanks to their high computing performance and power efficiency. Despite these advantages, APSoCs are sensitive to radiation like any other electronic device. Processors embedded in APSoCs therefore have to be adequately hardened against ionizing radiation to make them a viable design choice for harsh environments. This paper proposes a novel lockstep-based approach to harden the dual-core ARM Cortex-A9 processor in the Xilinx Zynq-7000 APSoC against radiation-induced soft errors by coupling it with a MicroBlaze TMR subsystem in the programmable logic (PL) layer of the Zynq. The proposed technique combines checkpointing with roll-back and roll-forward mechanisms at the software level (i.e. software redundancy) and processor replication and checker circuits at the hardware level (i.e. hardware redundancy). Results of fault injection experiments show that the proposed approach achieves high levels of protection against soft errors, mitigating around 98% of bit-flips injected into the register files of both ARM cores while keeping the timing performance overhead as low as 25% if block and application sizes are adjusted appropriately. Furthermore, incorporating the roll-forward recovery operation in addition to the roll-back operation improves the Mean Workload Between Failures (MWBF) of the system by up to ≈19%, depending on the nature of the running application: when a fault occurs, the application can proceed faster under roll-forward than under roll-back. Thus, relatively more data can be processed before the next error occurs in the system.
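To make the difference between the two recovery operations concrete, the following toy model may help. It is only a sketch under stated assumptions, not the paper's implementation: the workload is split into blocks, two simulated "cores" execute each block from a shared checkpoint, and a checker compares their results. On a mismatch, roll-back re-runs the block on both cores, whereas roll-forward is modelled here as a single fault-free arbiter execution that identifies the correct result so that execution can continue from it. The names `step` and `run_lockstep`, the integer "register state", and the fault-free arbiter are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

MASK = 0xFFFFFFFF  # toy 32-bit "register state"


def step(state: int, x: int) -> int:
    """One block of a toy workload: fold x into the running state."""
    return (state * 31 + x) & MASK


def run_lockstep(
    blocks: List[int],
    faults: Optional[Dict[Tuple[int, int], int]] = None,
    roll_forward: bool = True,
) -> Tuple[int, int]:
    """Run blocks on two simulated cores with a checkpoint after each block.

    faults maps (block_index, core_id) to a bit position that is flipped
    once in that core's result (a simulated soft error).
    Returns (final_state, number_of_block_executions).
    """
    pending = dict(faults or {})
    checkpoint = 0   # last state both cores agreed on
    executions = 0   # total block executions, a proxy for time spent
    i = 0
    while i < len(blocks):
        results = []
        for core in (0, 1):
            r = step(checkpoint, blocks[i])
            executions += 1
            bit = pending.pop((i, core), None)
            if bit is not None:
                r ^= 1 << bit  # inject a single bit-flip
            results.append(r)

        if results[0] == results[1]:
            checkpoint = results[0]  # checker agrees: commit new checkpoint
            i += 1
        elif roll_forward:
            # Assumption: one extra, fault-free arbiter run picks the good core.
            ref = step(checkpoint, blocks[i])
            executions += 1
            if ref in results:
                checkpoint = ref     # copy the good state and move on
                i += 1
            # otherwise fall through and retry the block from the checkpoint
        # roll-back: leave i unchanged so both cores redo the block
    return checkpoint, executions


if __name__ == "__main__":
    blocks = [5, 7, 11, 13]
    fault = {(1, 0): 3}  # flip bit 3 of core 0's result in block 1
    print(run_lockstep(blocks))
    print(run_lockstep(blocks, fault, roll_forward=False))
    print(run_lockstep(blocks, fault, roll_forward=True))
```

In this model, a fault-free run of four blocks costs 8 block executions, a single fault handled by roll-back costs 10, and the same fault handled by roll-forward costs 9, while the final state is identical in all three cases. That smaller recovery cost is the intuition behind the MWBF improvement reported above; the actual figures depend on the application and on the real hardware checker.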
Bibliographical note
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Funder
UK Engineering and Physical Sciences Research Council through grants EP/P017487/1, EP/R02572X/1 and EP/V000462/1
Keywords
- ARM Cortex-A processor
- Fault tolerance
- MicroBlaze processor
- Soft error mitigation
- Zynq APSoC
ASJC Scopus subject areas
- Electronic, Optical and Magnetic Materials
- Atomic and Molecular Physics, and Optics
- Condensed Matter Physics
- Safety, Risk, Reliability and Quality
- Surfaces, Coatings and Films
- Electrical and Electronic Engineering