In a previous posting, I addressed two types of failure: early life and random. By definition, early life failures occur early in the product lifecycle, whereas random failures can occur at any time. Yet another type of failure appears near the end of the product lifecycle: wear-out failure.
The intention of most designs is to create a product that provides value and delays or avoids wear out long enough to permit that value to be delivered. However, most materials change (degrade) over time. Water, oxygen, physical wear, and many other factors tend to erode a product’s ability to function. There are many ways in which products wear out. The rate of use, the operating temperature, and many other local operating and environmental factors contribute to the rate of wear out. Wear-out failures include the following: metal fatigue (due to crack formation), corrosion (from a chemical change), abrasive wear (e.g., brake pads), and polymer loss of elasticity (from crazing).
Once the risks of failure from the supply chain, manufacturing, and overstress are minimized, the remaining risk is wear out. A design that does not account for this source of failure may experience premature failure of all products placed in service. This may increase warranty claims and damage the brand image. Fortunately, by focusing on the failure mechanisms discovered or known, we can reasonably estimate how long a product will last before it succumbs to wear-out failure.
There are a few common means to estimate when a product will fail. One has to keep in mind the amount of variation present in and among individual products and in when, where, and how they are used. The assumptions behind the estimate are often as important as how well the failure mechanisms are known and modeled. Various methods are used to select the stresses applied in a life estimate; these include using nominal or worst-case stress conditions or using Monte Carlo methods.
One approach is to estimate the worst-case set of stresses and apply those to the failure mechanisms most likely to occur to form a basis for the prediction. This is a conservative, yet practical approach. A similar approach is to use nominal conditions, but this is not as conservative and so is rarely used.
Another approach is to use a random set of stress conditions drawn from the known set of stress condition distributions and apply those to life models of the dominant failure mechanisms. Repeating the selection of conditions and projecting the time to failure via appropriate life models provides an estimate of the life distribution, not just a point estimate.
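As a rough illustration of how the nominal, worst-case, and Monte Carlo estimates differ, here is a minimal Python sketch. Everything in it is an assumption made for the example: the Arrhenius-type life model, its activation energy and scale constant, the normal distribution of use temperatures, and the lognormal unit-to-unit variation. A real analysis would use the life model and stress distributions appropriate to the failure mechanism at hand.

```python
import math
import random
import statistics

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def life_hours(temp_c, activation_energy_ev=0.7, scale_hours=1e-6):
    """Hypothetical Arrhenius-type life model: hotter operation gives shorter life."""
    temp_k = temp_c + 273.15
    return scale_hours * math.exp(activation_energy_ev / (BOLTZMANN_EV_PER_K * temp_k))


def simulate_lives(n_trials=10_000, mean_temp_c=40.0, sd_temp_c=8.0, seed=1):
    """Draw random stress conditions and project a time to failure for each draw."""
    rng = random.Random(seed)
    lives = []
    for _ in range(n_trials):
        temp_c = rng.gauss(mean_temp_c, sd_temp_c)   # assumed use-temperature distribution
        unit_factor = rng.lognormvariate(0.0, 0.3)   # assumed unit-to-unit variation
        lives.append(unit_factor * life_hours(temp_c))
    return lives


nominal = life_hours(40.0)     # point estimate at the assumed nominal temperature
worst_case = life_hours(60.0)  # point estimate at the assumed worst-case temperature

lives = sorted(simulate_lives())
b10 = lives[len(lives) // 10]  # time by which roughly 10% of simulated units have failed
median = statistics.median(lives)

print(f"nominal point estimate:    {nominal:,.0f} h")
print(f"worst-case point estimate: {worst_case:,.0f} h")
print(f"Monte Carlo B10 life:      {b10:,.0f} h")
print(f"Monte Carlo median life:   {median:,.0f} h")
```

The two point estimates each give a single number, while the simulation returns a full set of lives from which any percentile of interest can be read.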
For individual failure mechanisms, or for a product expected to have a single dominant failure mechanism, focusing on the life model of that failure mechanism is appropriate. Not everything is temperature driven and modeled by the Arrhenius rate equation. Thermal cycling may cause solder fatigue, changes in temperature and humidity may cause CMOS electromigration, and there are hundreds of models specific to individual failure mechanisms.
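To make the contrast concrete, the sketch below computes acceleration factors from two commonly cited life models: the Arrhenius relationship for temperature-driven mechanisms and a Coffin-Manson style power law for thermal cycling fatigue. The function names, the activation energy of 0.7 eV, and the exponent of 2.0 are illustrative assumptions, not values from this posting; the right parameters depend on the specific mechanism and materials.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_af(use_temp_c, stress_temp_c, activation_energy_ev):
    """Acceleration factor between a use temperature and a higher test temperature."""
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    return math.exp(
        activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / use_k - 1.0 / stress_k)
    )


def coffin_manson_af(use_delta_t, stress_delta_t, exponent):
    """Acceleration factor between a use temperature swing and a larger test swing."""
    return (stress_delta_t / use_delta_t) ** exponent


# Assumed example conditions: 40 C use vs. 85 C test, 30 C vs. 100 C temperature swings.
print(f"Arrhenius AF:     {arrhenius_af(40.0, 85.0, 0.7):.1f}")
print(f"Coffin-Manson AF: {coffin_manson_af(30.0, 100.0, 2.0):.1f}")
```

The two models respond to different stresses: raising the steady test temperature accelerates the first, while widening the temperature swing accelerates the second.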
The term “physics of failure” implies that the model of the failure mechanism reaches down to the physics (or chemistry) level.
It is beyond the scope of this posting to address all the means to characterize the time-to-failure behavior of a failure mechanism, yet there are many good references on the subject. Testing may focus on samples, components, subsystems, or full products and may include normal use rate and conditions or accelerated use rate and/or stress conditions. The focus is on the failure mechanism, and using a known model from the literature or internal experimentation enables one to understand how long the product is likely to last. This estimate is then compared to the reliability goal.
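One way to picture the comparison against a reliability goal is with a Weibull life distribution, which is often fitted to wear-out test data. The sketch below is only an illustration: the characteristic life, shape parameter, target duration, and goal are all made-up values standing in for results that would come from testing or the literature.

```python
import math


def weibull_reliability(t, eta, beta):
    """Probability of surviving to time t for a two-parameter Weibull distribution."""
    return math.exp(-((t / eta) ** beta))


# Assumed values for illustration only.
target_hours = 5 * 8760  # five years of continuous operation
eta_hours = 120_000      # characteristic life (63.2% of units have failed by this time)
beta = 3.0               # shape parameter; values above 1 indicate wear-out behavior
goal = 0.90              # required probability of surviving the target duration

estimate = weibull_reliability(target_hours, eta_hours, beta)
meets_goal = estimate >= goal

print(f"estimated reliability at {target_hours} h: {estimate:.3f}")
print("meets goal" if meets_goal else "falls short of goal")
```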
Besides the focus on failure mechanism models, a widespread practice for estimating the reliability of electronic products is to use a parts-count prediction method. There are standards that offer a listing of failure rates for components. These documents provide a means to tally up expected failure rates and predict the product failure rate relatively quickly. One should, however, treat the results of such approaches with due skepticism, as they are rarely accurate and may be off by more than 100%.
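The arithmetic behind a parts-count tally is simple, as the sketch below shows. The bill of materials and the failure rates are invented for the example and do not come from any standard. Rates are expressed in FIT (failures per billion device-hours), and the tally assumes that every part has a constant failure rate and that any single part failure fails the product, which is one reason such predictions say little about wear out.

```python
# Hypothetical bill of materials: (part type, quantity, failure rate per part in FIT).
bom = [
    ("ceramic capacitor", 40, 0.5),
    ("resistor", 60, 0.2),
    ("microcontroller", 1, 25.0),
    ("connector", 4, 5.0),
]

# Product failure rate is the sum of quantity times per-part failure rate.
total_fit = sum(quantity * fit for _, quantity, fit in bom)

# With a constant failure rate, MTBF is the reciprocal of that rate.
mtbf_hours = 1e9 / total_fit

print(f"predicted failure rate: {total_fit:.1f} FIT")
print(f"predicted MTBF:         {mtbf_hours:,.0f} h")
```

Removing parts or choosing parts with lower listed rates lowers the total, which is how the method nudges a design toward fewer components.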
Parts-count prediction, like engineering judgment, does play a role in estimating product life; it assists the design team in making decisions. Parts-count predictions also encourage reducing the parts count within a product and keeping the temperature low across the components. These are good outcomes and assist in the creation of a reliable product.
Minimizing and controlling supply chain and manufacturing sources of failure and designing the product to withstand the expected variation in the stresses and strengths involved provide a solid platform for obtaining a reliable product. The decisions made during design, including material, component, and assembly details, will affect the time until the onset of wear out. If designed properly, a product may have a long and useful life and provide value. Understanding the failure mechanisms allows us to estimate the likelihood of that product surviving for the duration of its expected life.
Bio:
Fred Schenkelberg is an experienced reliability engineering and management consultant with his firm FMS Reliability. His passion is working with teams to create cost-effective reliability programs that solve problems, create durable and reliable products, increase customer satisfaction, and reduce warranty costs. If you enjoyed this article, consider subscribing to the ongoing series Musings on Reliability and Maintenance Topics.