Designing Reliable Analog Circuits in a Less-Than-Reliable World
Analog touches everything. If not for the multitude of analog and mixed-signal components that vendors have developed in the past decades, the digital “revolution” that has made so many industry sectors leap forward would not be possible.
Nor could these advances take place without an equal dedication to reliability, defined as the probability of a system or system element performing its intended function under stated conditions without failure for a designated period of time.
The semiconductor industry has no choice but to improve the reliability of what it designs and manufactures, particularly in safety-critical markets. As an example, consider that some automakers now require chips to last 18 years with zero defects in cars and trucks that represent some of the harshest application conditions. And in some industrial applications, where replacing sensors is difficult and part-level repair must be eliminated, chips need to last 20 years or more.
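To put numbers on the definition of reliability given above, the sketch below assumes a constant failure rate (the exponential model). That is a simplification this article does not itself state, and real parts also show early-life and wear-out behavior. It converts a failure rate in FIT (failures per billion device-hours) into the probability of surviving a service life such as the 18 years mentioned above. The function names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days x 24 hours, ignoring leap years


def reliability(failure_rate_fit: float, years: float) -> float:
    """Survival probability over `years`, ASSUMING a constant failure rate.

    failure_rate_fit is in FIT: failures per 1e9 device-hours.
    """
    failures_per_hour = failure_rate_fit * 1e-9
    return math.exp(-failures_per_hour * years * HOURS_PER_YEAR)


def max_fit_for_target(target_reliability: float, years: float) -> float:
    """Largest constant failure rate (in FIT) that still meets the target."""
    return -math.log(target_reliability) / (years * HOURS_PER_YEAR) * 1e9


if __name__ == "__main__":
    # Illustrative inputs only: a 10 FIT part over an 18-year service life.
    print(f"R(18 years) at 10 FIT: {reliability(10, 18):.6f}")
    print(f"FIT budget for 99.9% survival: {max_fit_for_target(0.999, 18):.2f}")
```

Under this model, even a single-digit FIT rate leaves a measurable failure probability over 18 years, which is why long service lives translate into very tight failure-rate budgets.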
Design for Reliability
Chief causes of reliability issues include design complexity, error-prone processes, and verification that is too limited to assess robustness. The failure mechanisms involved can be physical, chemical, electrical, thermal, or other processes.
One culprit is advanced scaling of CMOS technology, which adds functionality in smaller and faster chips that require less power. While scaling facilitates the design of highly integrated mixed-signal systems, it also poses reliability challenges: scaling does not apply to analog circuits in the same way as it does to digital ICs. Generally speaking, the most advanced nodes do not have a major impact on the transistors in the analog portion of the design.
In this article we will examine reliability from the analog design and protection control IC perspective. Analog requires a different approach, one that the designer has to address without a heavy assist from design tools. The relative lack of automation in analog design puts more of the burden of achieving quality and performance on the experience of individual designers and design teams.
It has become accepted practice that reliability must be woven into the total development cycle. This is known as Design for Reliability (DFR), and it is typically employed from early in the concept stage all the way through to product obsolescence--to ensure that customer expectations are fully met throughout the life of the product. DFR can design out or mitigate potential failure modes prior to production release, based on testing to discover issues and statistical analysis methods for reliability prediction.
Reliability engineering during this phase seeks to increase system robustness through measures such as redundancy, built-in test, and advanced diagnostics. Discrete analog component parameters tend to drift over time, and environmental effects such as corrosion, vibration, and temperature are also problematic. Reliability testing can be performed at the component, subsystem, and system level throughout the product or system lifecycle.
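The effect of redundancy on robustness can be estimated with a short calculation. The sketch below assumes that the redundant units fail independently and that any one working unit is enough for the system to function. Both are idealizations, since shared stresses such as temperature can cause common-mode failures. The function name is hypothetical.

```python
def parallel_reliability(unit_reliability: float, n_units: int) -> float:
    """Reliability of n redundant units where any single survivor suffices.

    ASSUMES independent failures: the system fails only if every unit fails.
    """
    if not 0.0 <= unit_reliability <= 1.0:
        raise ValueError("unit_reliability must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    probability_all_fail = (1.0 - unit_reliability) ** n_units
    return 1.0 - probability_all_fail


if __name__ == "__main__":
    # Illustrative value only: a unit with 90% reliability over the mission.
    for n in (1, 2, 3):
        print(n, "unit(s):", round(parallel_reliability(0.90, n), 6))
```

Each added unit cuts the remaining failure probability by the same factor, so the first redundant unit buys the most improvement.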
System Protection
Faults occur within an electrical network because of equipment failures or outside disturbances. Proper circuit protection in your system can prevent disasters by, for instance, opening a circuit within the network to isolate the fault. These protection systems sense the voltages and currents that the fault produces.
Analog output circuit protection covers a variety of situations. Understanding the environment and the application helps engineers decide what, and how much, should be done to protect the analog output.
To protect against common circuit faults (such as forward or reverse voltage and current faults), the MAX17608/09/10 adjustable overvoltage and overcurrent protection devices are well suited: they protect systems against positive and negative input voltage faults. The adjustable input overvoltage protection range is 5.5V to 60V, and the adjustable input undervoltage protection range is 4.5V to 59V. The input overvoltage-lockout (OVLO) and undervoltage-lockout (UVLO) thresholds are set using external resistors.
The devices feature programmable current-limit protection up to 1A, which controls the inrush current at startup while large output capacitances charge.
MAX17608 and MAX17610 block current flowing from OUT to IN, whereas MAX17609 allows current flow in the reverse direction. The devices feature thermal shutdown protection against excessive power dissipation. They are available in a small, 12-pin (3mm x 3mm) TDFN-EP package.
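Setting a lockout threshold with external resistors usually comes down to a resistor-divider calculation. The sketch below shows the generic form, assuming the divided-down input is compared against an internal reference. The reference value, resistor values, and function names are placeholders for illustration and are not taken from the MAX17608/09/10 data sheet, which should be consulted for the actual equations. Only the 5.5V to 60V OVLO range comes from the text above.

```python
V_REF_ASSUMED = 1.5  # volts; PLACEHOLDER comparator reference, not a data sheet value
OVLO_RANGE = (5.5, 60.0)  # adjustable OVLO range quoted in the article, in volts


def divider_threshold(v_ref: float, r_top: float, r_bottom: float) -> float:
    """Input voltage at which the divider tap reaches v_ref (generic divider)."""
    return v_ref * (1.0 + r_top / r_bottom)


def top_resistor_for(v_threshold: float, v_ref: float, r_bottom: float) -> float:
    """Top resistor that places the trip point at v_threshold."""
    return r_bottom * (v_threshold / v_ref - 1.0)


def in_range(value: float, limits: tuple) -> bool:
    """True if value lies within the inclusive (low, high) limits."""
    low, high = limits
    return low <= value <= high


if __name__ == "__main__":
    # Hypothetical target: trip at 36 V with a 100 kilohm bottom resistor.
    r_top = top_resistor_for(36.0, V_REF_ASSUMED, 100e3)
    v_trip = divider_threshold(V_REF_ASSUMED, r_top, 100e3)
    print(f"R_top = {r_top / 1e6:.2f} Mohm, trip = {v_trip:.1f} V, "
          f"within OVLO range: {in_range(v_trip, OVLO_RANGE)}")
```

In practice the nearest standard resistor values and their tolerances shift the trip point, so the achieved threshold should be recomputed from the parts actually chosen.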
Reliability Monitoring
To ensure failure rates are kept to a minimum, the reliability of devices representative of those shipped from key wafer fab and assembly processes should be monitored under accelerated conditions. Sample sizes can vary but are typically a couple of hundred devices per family (or more) divided between various environmental stresses. The results can then be updated in a regularly published report.
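One common way to turn accelerated-stress results into a field failure-rate estimate is the Arrhenius temperature model combined with a chi-squared upper bound. The sketch below is an illustration under stated assumptions, not a description of any vendor's monitoring program. The activation energy, temperatures, sample size, stress duration, and confidence level are all illustrative choices, and the bound shown covers only the zero-failure case, where the chi-squared term has a closed form.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Acceleration factor of a stress temperature relative to a use temperature."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_stress_k))


def fit_upper_bound_zero_failures(devices: int, stress_hours: float,
                                  acceleration: float,
                                  confidence: float = 0.60) -> float:
    """Upper bound on failure rate (FIT) when the test saw ZERO failures.

    With zero failures the chi-squared quantile (2 degrees of freedom)
    equals -2 * ln(1 - confidence), so the bound is -ln(1 - CL) / hours.
    """
    equivalent_hours = devices * stress_hours * acceleration
    return -math.log(1.0 - confidence) / equivalent_hours * 1e9


if __name__ == "__main__":
    # ASSUMED inputs: 0.7 eV activation energy, 55 C use, 125 C stress,
    # 200 devices stressed for 1000 hours with no failures.
    af = arrhenius_acceleration(0.7, 55.0, 125.0)
    fit = fit_upper_bound_zero_failures(200, 1000.0, af)
    print(f"acceleration factor: {af:.1f}, FIT upper bound: {fit:.1f}")
```

The result depends strongly on the assumed activation energy, which differs by failure mechanism, so a published figure is only meaningful alongside the assumptions behind it.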
Packaging used to protect and interconnect the IC must also be part of the reliability assurance chain. Packages are typically made with a conductive alloy lead frame that holds the IC die (or dies), with some mechanism (such as flip-chip, direct die attach, or metal wires) connecting the die to the lead frame. The lead-frame assemblies are encapsulated with various materials, such as epoxy or ceramic, to protect the assembly and provide mechanical stability.
Package integrity testing verifies the incoming quality of products against currently qualified specifications. For example, ultrasound testing is a sensitive nondestructive technique that can delineate package voids, internal interface separations, and cracks. Standards such as JEDEC J-STD-020 can be used to identify the moisture sensitivity classification level of non-hermetic, solid-state surface mount devices (SMDs). Handling devices according to that classification helps prevent damage from moisture-induced stress during soldering operations.
Other preconditioning stresses are used to simulate typical device performance at elevated temperatures and maximum operating voltages. Every failure should be analyzed to determine its cause. The results of this analysis can be used to establish corrective actions that eliminate the failure mechanisms found.
Devices are electrically tested at various read points in order to determine, under accelerated conditions, how the device might perform initially and during long-term life. Devices should be required to satisfy all data sheet specifications over full voltage and temperature ranges.
Maxim monitors the reliability of devices representative of those shipped from production. To find out more about the company's Reliability Monitor Program (RMP), go to: https://www.maximintegrated.com/en/support/qa-reliability/reliability/reliability-monitor-program.html