Fault Tolerant ARM Microprocessor.
- Duration: 2001-2003
- Coordinator: Politecnico di Torino
- Partners: Politecnico di Torino, Alenia Spazio SPA, Yogitech
- Funded by: ASI (Agenzia Spaziale Italiana)
FARM will research and demonstrate solutions for the design of System on Chip (SoC) ICs that tolerate soft errors, by developing circuit and software designs that recognise and correct faults during operation.
FARM will use intelligent techniques to target commercial applications without compromising performance or escalating cost through massive use of hardware or software redundancy or non-standard process technologies.
Noise and radiation cause soft errors by disturbing the state of memory bits or registers inside semiconductor devices. Deep Sub-Micron (DSM) CMOS device geometries of 0.18 um and below, operating at a Vdd of 1.5 volts or below, are becoming highly susceptible to soft errors because less energy is required to change the state of each flip-flop and memory cell. SoCs contain many millions of flip-flops, not just in memory but also throughout the Register Transfer Logic (RTL), so the likelihood of soft errors becoming noticeable increases with device complexity and is significant when a single device contains 50 million transistors (12.5 million gates) or more. Such devices are common among contemporary SoCs employing multiple embedded processor cores, DSPs, IP blocks and on-chip SRAM.
Soft Error Rate (SER) problems in Integrated Circuits have been recognised for some time by the space, nuclear and high-energy physics communities, where various techniques are used to protect and compensate circuits operating in radiation environments. Electronics for satellites, space missions, nuclear plants or high-energy physics experiments all have to function in the presence of a high radiation background. So-called Single Event Upsets (SEUs) in RTL may cause digital circuits to crash frequently, so techniques involving redundancy at circuit level (triple flip-flops with majority voting) or system level (dual processor with watchdog) are employed. Alternatively, hardened process libraries have been employed, but these tend to be available only on older and less dense technologies (e.g. 0.7 um). Any of these techniques can double the circuit’s size, which, taking yield curves into account, may triple device cost. In addition, performance is significantly impaired by slower system clocks, increased gate count and the inclusion of redundant software. A more sophisticated approach will therefore be required for commercial applications where cost and performance are important factors. Research work in the US and presentations at recent conferences indicate that this issue is increasingly recognised and may become a barrier to progress for large SoCs fabricated on 100 nm DSM processes for commercial products.
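The circuit-level redundancy mentioned above (triple flip-flops with majority voting) can be illustrated in software. The following is a minimal sketch only, not the project's circuit: the function names `majority_vote` and `flip_bit` are hypothetical, and a register is modelled as a plain integer held in three copies.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies of a register.

    Each output bit takes the value held by at least two of the three inputs,
    so a single upset copy is outvoted.
    """
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single event upset by inverting one bit of a stored word."""
    return word ^ (1 << bit)


# Illustrative run: one of the three copies suffers an upset in bit 4.
stored = 0b1011_0010
copies = [stored, flip_bit(stored, 4), stored]
assert majority_vote(*copies) == stored  # the upset copy is masked
```

The sketch also shows where the cost comes from: every protected register is stored three times and needs a voter, which is the kind of area overhead the paragraph above describes.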
The FARM project will develop and combine techniques that both protect circuits against SEUs and allow circuits to recognise and compensate for soft errors. The resulting Intellectual Property can then be applied at commercial foundries from 2005 onwards, when 0.13 um and 100 nm processes that may be susceptible to radiation go into commercial production. FARM will research, develop and demonstrate:
- A Fault tolerant ARM microprocessor IP (FARM IP) block with operating system interface to detect an SEU in the processor or system bus and restore execution of the application.
- A fault tolerant Embedded SRAM with memory bits less susceptible to SEUs and automatic error detection and correction.
- A digital library with gates having a modified layout that achieves a level of tolerance against ionising dose effects.
The ARM series microprocessor is chosen because it is the most widely used core in Europe. The circuitry to detect SEUs can be connected via the existing debug interfaces available to designers on the ARM9 core and the Advanced Microcontroller Bus Architecture (AMBA). This circuitry will itself have to be made fault resistant, and will co-operate with the Operating System to provide an Application Programming Interface that can restore and re-execute code when an SEU occurs in the system. This approach is necessary because the best ARM cores are ‘hard’ macros inserted into device designs at GDSII level; they are not open to internal modification and are unavailable in hardened process libraries. Open source OS and C compilers will be used in the project, but the techniques and the IP block API will be suitable for transfer to proprietary software later.
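The restore-and-re-execute behaviour described above can be sketched as a checkpoint and rollback loop. This is an illustration under stated assumptions, not the FARM API: `run_with_rollback`, `UpsetDetected` and the dictionary-based `State` are hypothetical names, and the exception stands in for whatever signal the detection circuitry would raise.

```python
import copy
from typing import Callable, Dict

State = Dict[str, int]


class UpsetDetected(Exception):
    """Stand-in for the signal a (hypothetical) SEU monitor would raise."""


def run_with_rollback(task: Callable[[State], None],
                      state: State,
                      max_retries: int = 3) -> int:
    """Run `task` on `state`, re-executing from a checkpoint after an upset.

    Returns the number of re-executions that were needed.
    """
    checkpoint = copy.deepcopy(state)          # snapshot taken before the task runs
    for attempt in range(max_retries + 1):
        try:
            task(state)
            return attempt
        except UpsetDetected:
            # Discard possibly corrupted state and restore the snapshot in place.
            state.clear()
            state.update(copy.deepcopy(checkpoint))
    raise RuntimeError("upset persisted after all retries")
```

A caller would wrap each recoverable unit of work in `run_with_rollback`. The retry limit is an assumption made here so that a persistent (hard) fault does not loop forever.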
The SRAM will be formed using an SRAM compiler to output an embedded memory core (ESRAM) with higher bit density than is produced with logic gates. Error Correction Circuit (ECC) logic, which must itself be fault resistant, will monitor the SRAM in the background and restore any bit patterns that are corrupted by an SEU. The ESRAM ECC logic will be under the control of the FARM IP, which will also monitor fault correction activity to determine fault probability.
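Background correction of this kind can be sketched with a single-error-correcting code and a scrub pass that counts its repairs. The text does not say which code the ESRAM uses; the Hamming(7,4) code below is an assumption chosen for brevity, and `encode`, `syndrome`, `decode` and `scrub` are hypothetical names.

```python
from typing import List


def encode(nibble: int) -> int:
    """Encode 4 data bits as a 7-bit Hamming codeword (bit p-1 holds position p)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    bits = [0] * 8                              # indices 1..7 are used
    bits[3], bits[5], bits[6], bits[7] = d      # data positions
    bits[1] = bits[3] ^ bits[5] ^ bits[7]       # parity over positions 1,3,5,7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]       # parity over positions 2,3,6,7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]       # parity over positions 4,5,6,7
    return sum(bits[p] << (p - 1) for p in range(1, 8))


def syndrome(word: int) -> int:
    """XOR of the positions of all set bits: 0 if valid, else the flipped position."""
    s = 0
    for p in range(1, 8):
        if (word >> (p - 1)) & 1:
            s ^= p
    return s


def decode(word: int) -> int:
    """Extract the 4 data bits from a (corrected) codeword."""
    d = [(word >> (p - 1)) & 1 for p in (3, 5, 6, 7)]
    return d[0] | (d[1] << 1) | (d[2] << 2) | (d[3] << 3)


def scrub(memory: List[int]) -> int:
    """One background pass: repair single-bit errors in place, return the count."""
    corrected = 0
    for addr, word in enumerate(memory):
        s = syndrome(word)
        if s:
            memory[addr] = word ^ (1 << (s - 1))  # flip the faulty bit back
            corrected += 1
    return corrected
```

The value returned by `scrub` is the sort of correction activity a supervising block could accumulate over time to estimate fault probability. This code corrects only one flipped bit per word; two upsets in the same word between scrub passes would be miscorrected, which is one reason scrubbing has to run often enough.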
The tolerant process library can be developed to provide a range of transistors, gates and other functions wherein unusual physical topologies are employed for total ionising dose tolerance. This library will be used to develop ICs for applications in radiation environments such as commercial satellites. It may be combined with an ARM core, the FARM IP and FARM ESRAM to build fault tolerant SoCs. Thereafter it can be applied to other logic or mixed-signal circuits. The library will be characterised and models for the most common commercial simulation, synthesis and layout tools will be produced.
The FARM project will combine all of these techniques to produce an SoC that is demonstrably fault resistant and tolerant. In addition, a second SoC developed using the ‘tolerant’ library will be tailored to radiation environments, where the higher risk of SEUs is coupled with the need for ionising radiation tolerance. The results will be made available to customers wishing to design and apply such devices. In this way FARM will develop understanding of, and solutions for, the SER/SEU issues that will confront an increasing number of DSM SoC designs.