Feedback-Based Low-Power Soft-Error-Tolerant Design for Dual-Modular Redundancy


Triple-modular redundancy (TMR), which consists of three identical modules and a voting circuit, is a common architecture for soft-error tolerance. However, the original TMR suffers from two major drawbacks: the large area overhead and the vulnerability of the voter. To overcome these drawbacks, we propose a new complementary dual-modular redundancy (CDMR) scheme for mitigating the effect of soft errors. Inspired by Markov random field (MRF) theory, CDMR implements a two-stage voting system comprising a first-stage optimal MRF structure and a second-stage high-performance merging unit. The CDMR scheme can reduce the voting circuit area by 20% while saving the area of one redundant module, achieving at least 26% error-rate reduction at an ultralow supply voltage of 0.25 V with 8.33% faster timing compared to previous voter designs.


  • Modelsim
  • Xilinx 14.2


Triple-modular redundancy (TMR) was first proposed by Von Neumann et al. and has since been adopted as a technique to improve error tolerance at the cost of increased circuit area. TMR can tolerate soft errors only when the probability of two or three modules failing simultaneously is much lower than that of a single module failing. One obvious drawback is the increased area overhead, so partial TMR (PTMR) was proposed to reduce the area overhead by trading off reliability. A dual-modular redundancy (DMR) scheme has also been presented that uses a three-module structure with self-feedback. Robust C-elements and multiplexers are used, respectively, to form voters in two different DMR designs. An algorithmic noise-tolerant (ANT) technique was proposed to solve the problem of soft errors caused by voltage overscaling. Algorithmic soft-error tolerance (ASET) and fine-grain soft-error tolerance (FGSET) designs are both extended ANT designs.

These designs suffer from two drawbacks. First, they still incur a large area overhead. Second, soft errors in the voting design cause a loss of reliability. Redundancy schemes, including estimator-based ones, work well only when voters never fail, which might be an unrealistic assumption if the circuits are designed in a deep-submicron technology or operate at an ultralow supply voltage. Under such conditions, a failure is likely to occur in the voting circuit, which is a main cause of TMR failure. For a multistage design, three identical voters could be used in each stage to tolerate errors that occur in one of the TMR voters, but this would add undesirable overhead to the design. Some approaches, such as generalized modular redundancy, approximate TMR, and a simulation-based synthesis scheme, improve on the original TMR, but they offer only an optimal implementation strategy or trade off accuracy.
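The dependence of TMR on a reliable voter can be illustrated with a small probability sketch. It is not taken from the paper: it assumes independent module failures with probability `p_module`, and a voter that inverts its output with probability `p_voter`. Both names and the failure model are illustrative assumptions.

```python
def majority(a: int, b: int, c: int) -> int:
    """2-of-3 majority vote on 0/1 values, the function a TMR voter computes."""
    return (a & b) | (a & c) | (b & c)


def tmr_failure_probability(p_module: float, p_voter: float = 0.0) -> float:
    """Probability that the TMR output is wrong.

    Assumptions (illustrative only): the three modules fail independently
    with probability p_module, and the voter independently inverts its
    output with probability p_voter.
    """
    # The majority is wrong when two or three modules are wrong.
    p_majority_wrong = 3 * p_module**2 * (1 - p_module) + p_module**3
    # The output is wrong when exactly one of (majority, voter) is faulty.
    return p_voter * (1 - p_majority_wrong) + (1 - p_voter) * p_majority_wrong


if __name__ == "__main__":
    p = 0.01
    print("single module:      ", p)
    print("TMR, ideal voter:   ", tmr_failure_probability(p))
    print("TMR, p_voter = 0.01:", tmr_failure_probability(p, 0.01))
```

Under these assumptions, an ideal voter lowers the failure probability well below that of a single module, whereas a voter that is as unreliable as a module cancels the benefit. This is the motivation for hardening the voter itself.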

A number of error-tolerant methods, such as Markov random field (MRF), differential cascode voltage switch (DCVS), and DCVS-MRF, have been proposed. In these designs, the basic elements include feedback loops that help them achieve high soft-error tolerance. However, these implementations require a higher area overhead than traditional structures. To solve soft-error issues in the voter and reduce the area overhead, we propose a new complementary DMR (CDMR) scheme, as shown in Fig. 1. The CDMR scheme maintains significant soft-error tolerance even in the voting circuit. This is achieved by processing one module (M1) through a structure with a stable logic “1” as output (referred to as structure A in Fig. 1), and separately processing another identical module (M2) through a structure with a stable logic “0” as output (shown in Fig. 1 as structure B). A second-stage feedback structure (shown in Fig. 1 as structure C) is then used to merge the stable logic “1” and stable logic “0” outputs from the first stage, ensuring the best performance from the first stage. The CDMR scheme outperforms existing designs in two key aspects: 1) it tolerates many soft errors propagated to the voting circuit and 2) it reduces the area overhead.


  • Larger area overheads are present
  • Soft errors are not reduced


MRF-Inspired Two-Stage Feedback Design

The second stage in Fig. 2 can compensate for the loss of error tolerance in g2 in the first stage through its latching property. The proposed structure benefits from the presence of stage 2 to improve its reliability, which is a feature that TMR, DMR, and other designs lack. Let us extend the single-error assumption for stage 1 by assuming that only one error can emerge from one of the complementary propagation chains at a time. In other words, when an error emerges from stage 1, the g3–g4 latch structure in stage 2 does not propagate it. With respect to our proposed CDMR, the two redundant inputs to the voter must be complementary.
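The hold behavior of the g3–g4 latch can be sketched with a purely behavioral model. The sketch below is an assumption-laden illustration, not the paper's circuit: it models g3 and g4 as a cross-coupled NAND pair, assumes that xd carries the complement of the input and xe the true value, samples one bit at a time, and ignores all timing. The bit stream (x0∼x4 = 0, x5∼x9 = 1, with 0-to-1 flips at x7 and x9 on xd and at x1 and x2 on xe) follows the example discussed with Fig. 2. The function names are hypothetical.

```python
from typing import List, Tuple


def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 values."""
    return 1 - (a & b)


def latch_step(xd: int, xe: int, q: int, q_bar: int) -> Tuple[int, int]:
    """Settle the cross-coupled NAND pair (g3, g4) for one input sample."""
    for _ in range(4):  # a few passes are enough for the loop to settle
        new_q = nand(xd, q_bar)      # g3 (assumed wiring)
        new_q_bar = nand(xe, new_q)  # g4 (assumed wiring)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


def run_voter(xd_stream: List[int], xe_stream: List[int]) -> List[int]:
    """Feed sampled stage-1 outputs through the latch and collect xf."""
    q, q_bar = 0, 1  # assumed initial state
    out = []
    for xd, xe in zip(xd_stream, xe_stream):
        q, q_bar = latch_step(xd, xe, q, q_bar)
        out.append(q)
    return out


XA = [0] * 5 + [1] * 5        # ideal input: x0..x4 = 0, x5..x9 = 1
XD = [1 - bit for bit in XA]  # assumed: xd carries the complement
XE = list(XA)                 # assumed: xe carries the true value

# Inject the four 0 -> 1 flips: x7 and x9 on xd, x1 and x2 on xe.
XD_NOISY = list(XD)
XE_NOISY = list(XE)
for i in (7, 9):
    XD_NOISY[i] = 1
for i in (1, 2):
    XE_NOISY[i] = 1

XF = run_voter(XD_NOISY, XE_NOISY)
```

In this model, every flipped “0” drives both latch inputs to “1”, which is the hold condition of a NAND latch, so `XF` reproduces the ideal stream `XA`. The model does not cover simultaneous errors on both rails, which the extended single-error assumption excludes.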

Table I: Values of g3–g4 Feedback

The two redundant signals propagate through stages 1 and 2 as complementary signals in the absence of errors. For example, an ideal input bit stream for xa (with xa = xb) is {x0∼x4 = 0 and x5∼x9 = 1}. Four bits, x7 and x9 of xd and x1 and x2 of xe, are flipped by noise, as circled in Fig. 2. Their corresponding bits in the other branch are a robust “1” because of the high tolerance of a noisy input bit “0” in both NAND gates g1 and g2. This is why we only consider the cases where errors occur in a weak “0” in xd or xe. This condition causes the second stage g3–g4 to remain in the hold state in Table I, acting as an RS latch, and thus the previous correct outputs protect the final output from the error bits in xd and xe. We adopted the widely used double-exponential current source to simulate the above cases, where a charged or ionizing particle hits the output “0” of the stage 1 circuit.

where Qtotal is the total charge caused by the particle strike, and τr and τf are the rising time constant and the falling time constant, respectively. τr and τf are generally set to 50 and 164 ps, respectively, for different process technologies, and we used a current source with Qtotal = 70 fC in our simulation. Regardless of whether xa and xb are both high or both low, when a charged particle strikes xd or xe, a single peak appears in the output xf, as shown in Fig. 2. Compared with the much longer pulse at the output of a TMR voter when an error hits one of its inner branches, this peak can be regarded as less harmful in the proposed voter after sampling, because the error is too short to be sampled multiple times. The results in Fig. 2 confirm the error tolerance that we deduced from the proposed structure in Fig. 2. Under the extended single-error condition, the output of our module is correct as long as the two inner complementary signals are not in error at the same time.
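The particle-strike pulse can be sketched numerically. The code below assumes the commonly used double-exponential form I(t) = Qtotal / (τf − τr) · (exp(−t/τf) − exp(−t/τr)), which integrates to Qtotal; whether the simulation used exactly this normalization is an assumption. The parameter values are those quoted in the text, and the function names are hypothetical.

```python
import math


def double_exp_current(t: float, q_total: float = 70e-15,
                       tau_r: float = 50e-12, tau_f: float = 164e-12) -> float:
    """Current in amperes at time t (seconds) for the assumed pulse form."""
    if t < 0:
        return 0.0
    scale = q_total / (tau_f - tau_r)
    return scale * (math.exp(-t / tau_f) - math.exp(-t / tau_r))


def injected_charge(t_end: float = 5e-9, steps: int = 5000) -> float:
    """Trapezoidal integral of the default pulse from 0 to t_end (coulombs)."""
    h = t_end / steps
    total = 0.0
    for k in range(steps):
        left = double_exp_current(k * h)
        right = double_exp_current((k + 1) * h)
        total += 0.5 * h * (left + right)
    return total


def peak_time(tau_r: float = 50e-12, tau_f: float = 164e-12) -> float:
    """Time of the current maximum, from setting dI/dt = 0."""
    return tau_r * tau_f / (tau_f - tau_r) * math.log(tau_f / tau_r)


if __name__ == "__main__":
    t_peak = peak_time()
    print("peak time (s):   ", t_peak)
    print("peak current (A):", double_exp_current(t_peak))
    print("charge (C):      ", injected_charge())
```

Integrating the pulse numerically is a quick sanity check that the waveform injects the intended total charge before it is used as a stimulus in a circuit simulator.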

We see that the proposed design has better soft-error tolerance. Therefore, the proposed voting circuit has both higher modular soft-error tolerance and higher reliability than TMR. For multistage logic, the voter is concatenated in each stage to improve the overall system reliability, as shown in Fig. 4(a)–(d). The original TMR, FGSET, and DMR voters for multistage logic are simply duplicated [refer to Fig. 4(a)–(c)]. However, the proposed voter has enclosed feedback loops and two outputs, without voter duplication between two stages, as shown in Fig. 4(d). Note that this design has two complementary outputs as references for error correction. Overall, the area overhead is reduced by at least 50% compared to the designs used in TMR and DMR. We consider a 4-bit ripple-carry adder (RCA) as a case study for the proposed voter in Fig. 4. The proposed design requires a differential input; thus, we redesigned the full adder (FA). We present two design schemes for adders. Scheme 1 (S1) in Fig. 4 is designed for a single unit with DMR, in which the outputs of the two modules are connected to a voter. Scheme 2 (S2) in Fig. 4 is implemented as a multistage design by adding a voter at every stage.
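One way to obtain complementary outputs from two identical adder modules can be sketched as follows. A full adder is self-dual: complementing all of its inputs complements both its sum and its carry. The sketch uses this property to drive a second, identical 4-bit RCA with complemented operands and a complemented carry-in. This is an illustrative assumption and not necessarily how the paper's redesigned FA works; all function names are hypothetical.

```python
from typing import List, Tuple


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """Return (sum, carry-out) for one bit position."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_carry_add(a_bits: List[int], b_bits: List[int],
                     cin: int) -> Tuple[List[int], int]:
    """Chain full adders, least significant bit first."""
    sums = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


def to_bits(value: int, width: int = 4) -> List[int]:
    """Unsigned integer to a little-endian bit list."""
    return [(value >> i) & 1 for i in range(width)]


def dual_rail_add(a: int, b: int, width: int = 4):
    """Run two identical RCAs: one on true inputs, one on complements."""
    a_bits, b_bits = to_bits(a, width), to_bits(b, width)
    true_rail = ripple_carry_add(a_bits, b_bits, 0)
    comp_rail = ripple_carry_add([1 - x for x in a_bits],
                                 [1 - x for x in b_bits], 1)
    return true_rail, comp_rail


def rail_value(sums: List[int], cout: int) -> int:
    """Interpret (sum bits, carry-out) as an unsigned integer."""
    return sum(bit << i for i, bit in enumerate(sums)) + (cout << len(sums))
```

For every pair of 4-bit operands, the true rail of `dual_rail_add` equals `a + b` and the second rail is its bitwise complement, which is the kind of complementary signal pair the proposed voter expects at its inputs.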


  • Area overhead is reduced
  • Soft errors are reduced


Yan Li, Yufeng Li, Han Jie, Jianhao Hu, Fan Yang, Xuan Zeng, Bruce Cockburn, and Jie Chen, “Feedback-Based Low-Power Soft-Error-Tolerant Design for Dual-Modular Redundancy,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2018.
