The purpose of the project is to create methodologies and tools to increase the dependability of digital systems.
- Duration: 2001-2003
- Coordinator: Politecnico di Torino
- Partners: Politecnico di Torino, Tallinn Technical University
- Funded by: Italian Ministry of Foreign Affairs
In space applications, digital systems work in a critical environment, where the radiation level is several times higher than on Earth. Under the effects of radiation, the probability of transient and permanent faults occurring is not negligible. Therefore, high dependability and availability are two of the main requirements in space applications. Moreover, high dependability is required in all applications where a failure can have catastrophic consequences.
In the past, radiation-hardened components have been widely used to address radiation effects, but these devices are much more expensive than non-hardened components and have limited availability. In addition, hardened devices are typically two or three generations behind non-hardened devices in terms of performance, and as the number of manufacturers of radiation-hardened devices decreases, very few such components will be available in the future. Therefore, alternative methods that do not have these drawbacks need to be explored.
The need for low-cost, state-of-the-art, high-performance computing systems has been pushing researchers to investigate new fault-tolerance techniques. To overcome the drawbacks of radiation-hardened components, space system designers have recently considered using unhardened Commercial-Off-The-Shelf (COTS) components, which offer the state-of-the-art technology available on the market at a much lower cost than radiation-hardened devices.
Unfortunately, commercial components are designed to work in an environment different from that of military and aerospace systems. They usually have limited fault avoidance and error detection capabilities. If commercial components have to be used for critical applications with no change in hardware, fault tolerance should be provided through software techniques. Software Implemented Hardware Fault Tolerance (SIHFT) detects or tolerates hardware faults by means of software, without any special hardware for error detection or fault tolerance. The benefit of employing SIHFT is that the availability of the system can be improved at low cost using existing hardware designs available on the market.
The goal of the project is the definition and application of purely software-based methodologies able to detect and correct hardware faults caused by environmental radiation. Since the cost of rewriting the software of a system by hand is too high, the project will lead to the implementation of a set of software tools able to automatically transform a program into a new, functionally equivalent version able to detect and tolerate hardware faults.
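To make the idea of a functionally equivalent, fault-detecting version more concrete, here is a minimal sketch of one common software technique, variable duplication with consistency checks. It is an illustration under stated assumptions, not the project's actual transformation rules: the names `plain_sum`, `hardened_sum`, `FaultDetected`, and the `inject_fault` parameter are hypothetical, and the transformation is written by hand here, whereas the project aims at tools that would apply such a rewrite automatically.

```python
class FaultDetected(Exception):
    """Raised when the two copies of a duplicated variable disagree."""


def plain_sum(values):
    """Original, unprotected version of the program."""
    total = 0
    for v in values:
        total += v
    return total


def hardened_sum(values, inject_fault=False):
    """Hand-transformed version: every variable is kept in two copies,
    every operation is performed on both, and the copies are compared.

    inject_fault is a hypothetical test hook that flips one bit in the
    second copy to imitate a radiation-induced upset.
    """
    total0 = 0
    total1 = 0  # duplicate of total0
    for v in values:
        total0 += v
        total1 += v
        if inject_fault:
            total1 ^= 1 << 3  # simulated single-bit flip
        if total0 != total1:  # consistency check after the update
            raise FaultDetected("copies of 'total' differ")
    return total0
```

In the fault-free case `hardened_sum` returns the same result as `plain_sum`; when a copy is corrupted, the mismatch is reported instead of silently producing a wrong sum. This sketch only detects the fault; tolerating it would need an additional recovery step, which is not shown.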
The project will allow the implementation of a dependable digital system without any modifications to the existing hardware, without any changes to the existing design flow, and without requiring software designers to have any knowledge of the dependability field. The proposed techniques will focus on the integrity of software data and on the correct execution of the program's control flow.
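Control-flow checking can be sketched in a similar spirit. The example below is an assumption-laden illustration rather than the project's method: each basic block of a small function is given a label, and a hypothetical `FlowChecker` object verifies on entry to each block that the previously executed block is a legal predecessor. The block labels, the `ALLOWED_PREDECESSORS` table, and the `count_down` function are all invented for this example.

```python
class ControlFlowError(Exception):
    """Raised when a block is entered from an illegal predecessor."""


# Hypothetical control-flow graph of count_down: for each block,
# the set of blocks that may legally execute immediately before it.
ALLOWED_PREDECESSORS = {
    "entry": {None},             # entered first, nothing before it
    "loop": {"entry", "loop"},   # from entry, or from a previous iteration
    "exit": {"entry", "loop"},   # loop skipped (n == 0) or loop finished
}


class FlowChecker:
    """Tracks the last executed block and validates each transition."""

    def __init__(self):
        self.last_block = None

    def enter(self, block):
        if self.last_block not in ALLOWED_PREDECESSORS[block]:
            raise ControlFlowError(
                f"illegal jump from {self.last_block!r} to {block!r}"
            )
        self.last_block = block


def count_down(n, checker):
    """Counts loop iterations, with a check at the start of each block."""
    checker.enter("entry")
    steps = 0
    while n > 0:
        checker.enter("loop")
        n -= 1
        steps += 1
    checker.enter("exit")
    return steps
```

A correct run passes every check, while a fault that diverts execution into the middle of the function, for example entering `"loop"` without having passed through `"entry"`, raises `ControlFlowError`.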