#90 – RELIABILITY TESTING CONSIDERATIONS – FRED SCHENKELBERG

Reliability testing to determine what will fail or when the failures will occur can be expensive. Organizations invest in the development of a product and attempt, through the design process, to create a product that is reliable. The design process has many unknowns, though. These include uncertainties about materials, design margins, use environments, loads, and aging effects. Using the best practices of design for reliability will reduce this list of risks to product reliability, yet it will not resolve all the uncertainty.

Reliability testing is expensive, yet it is an investment in knowledge about a design. Reliability testing done well reduces uncertainty and risk. Planning a useful reliability testing program entails consideration of a number of factors. Asking the right questions is the best way to explore these factors.

How Critical Is the Product?

Is the product important to customers? If an unexpected failure would shut down the customer’s business or cause significant damage, we may want to understand reliability very well. In contrast, if a failure would have little to no consequence, the motivation to conduct reliability testing is less.

Is the product critical to the business? Here ‘critical’ may include brand promise or market share or profitability. A product that has less than expected reliability may have significant impact on profitability and operations. If the failure of the product, even at a high rate, would have little impact on the business, again there is less motivation for testing.

Are Safety and Reliability a Concern?

Related to criticality is safety. If product failure leads to dangerous conditions, then understanding the failure mechanisms and time to failure is important. Some products inherently only fail safely, reducing the need for testing.

Does the Customer Need a Certain Level of Reliability?

All customers have some level of reliability expectation. Some provide specific requirements. If specified, are the requirements clear and appropriate? For unspecified requirements, do we have a good estimate of customer expectations?

In either case, the bounds for reliability testing are often set by customer expectations. Creating test plans that will evaluate a product’s ability to meet customer expectations is not new. For reliability, it may take a little work to interpret expectations and convert them into meaningful reliability objectives, yet the results of testing then provide relevant information.

How Mature Is the Design?

Reliability testing often takes time. For example, a thermal cycling test for solder joint fatigue failures may require four to six months to complete. If the nature of the electronic packaging is not yet known, we have the choice of testing the possible options (which can be expensive) or waiting until a choice is made (which reduces the time available for testing before a decision on reliability suitability is due).

Another consideration is maturity in the market and application. If we are using common technologies that have a suitable track record in a range of applications, there is less need for reliability testing. It is about risk. In immature designs or technologies, it is the unknown that creates risk and the need for reliability testing.

Are New Technologies or Processes Involved?

The word ‘new’ comes with a red flag for attention. What do we know about the new technology or process? ‘New’ often means we have a lot to learn, and reliability testing is one appropriate method. A good practice, time permitting, is to fully evaluate the reliability aspects of a new technology or process offline from a specific product or application. Once these are vetted and understood, then we can add the ‘new’ to a product development program. This is not often done and provides a source of significant risk in the product development process.

How Complex Is the Product?

Simple products are easy to understand and evaluate, and complex products are not. Simple products have fewer failure mechanisms and interactions, making design reliability testing straightforward. Complex products are a group of simple products and include the interactions among all the simpler elements.

Consider the following example: A reliability professional working with business desktop computers has found that, for reasons not fully understood, some models of hard drives would not operate well with some models of power supplies. All the individual elements worked well with other models of hard drives or power supplies and all were within required operating specifications. Yet, something caused certain pairings to fail. This might be termed the ‘white space problem,’ referring to the unstated and real interactions that occur within complex products.

What Are the Environmental Extremes Involved?

The concepts of stress and strength apply here. How much margin does the design have for the expected range of environmental stress that it will experience? This is not just temperature and humidity but also shock load during transportation, the number of times a handheld device will be dropped and from how high, the range of chemicals and concentrations, the use rate and loads, etc. One has to consider the full range of environmental exposure and combinations that may significantly accelerate failures or accumulate damage, thus shortening product life.
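The stress and strength idea can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that both the applied stress and the product strength are independent and normally distributed; real distributions are often skewed and would need to be estimated from data. The function names and the numbers are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def interference_failure_probability(
    stress_mean: float,
    stress_sd: float,
    strength_mean: float,
    strength_sd: float,
) -> float:
    """Probability that stress exceeds strength.

    Assumes stress and strength are independent normal variables,
    so the margin (strength - stress) is also normal.
    """
    margin_mean = strength_mean - stress_mean
    margin_sd = math.sqrt(stress_sd**2 + strength_sd**2)
    return normal_cdf(-margin_mean / margin_sd)


# Hypothetical values in arbitrary, consistent units (for example, degrees C).
p_fail = interference_failure_probability(
    stress_mean=60.0, stress_sd=8.0, strength_mean=100.0, strength_sd=6.0
)
print(f"Estimated probability of failure: {p_fail:.2e}")
```

The result is sensitive to both the gap between the means and the spread of each distribution, which is why understanding the full range of environmental exposure matters as much as the nominal condition.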

What Is the Budget for Testing?

As stated earlier, reliability testing is expensive. If the budget for reliability testing is insufficient, the risks associated with the lack of knowledge remain. The balance between knowledge and investment must be taken into account with every test plan. Having a clear understanding of the budgetary constraints allows a proper discussion on the right amount of testing.

Are Equipment and Facilities Available to Perform the Tests?

With enough money and time, we can create test facilities to evaluate nearly any risk. If these are readily available, then testing can proceed. If not, then the question is whether the knowledge gained by testing is worth the extra investment.

For example, NASA has large vacuum chambers that permit the creation of all the conditions of outer space (except low gravity). The chambers can replicate solar radiation, temperature, vacuum, etc., allowing researchers to evaluate how materials and systems operate in nearly the same conditions as experienced in space. Because spacecraft are expensive projects and system failures can have devastating consequences, such elaborate testing facilities are warranted.

How Many Items Are Available for Testing?

Early planning and a large budget help here, yet it is rare that we have sufficient samples to make testing results statistically significant. This is another situation in which a balance must be achieved, here between sample size and risk. One of the contributors to failure is the naturally occurring variations among products. Some are stronger, and some are weaker. Testing just a few samples does not adequately reflect the range of (often) unknown important variations.
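To see why sample counts are so often insufficient, consider a rough sketch of the zero-failure (success-run) sample size calculation. It assumes a simple pass/fail test in which every unit is exercised for the equivalent of the full life of interest, units are independent, and no failures are allowed. The function name is hypothetical and the reliability and confidence targets are only examples.

```python
import math


def success_run_sample_size(reliability: float, confidence: float) -> int:
    """Units that must all pass to demonstrate `reliability`
    at `confidence`, under a zero-failure binomial model."""
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# Example targets at 90% confidence.
for target in (0.90, 0.95, 0.99):
    n = success_run_sample_size(target, confidence=0.90)
    print(f"R = {target:.2f}: {n} units with zero failures")
```

Under these assumptions the required sample size grows quickly as the reliability target rises, which illustrates the balance between sample size and risk.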

If possible, one can create devices at the extremes of the range of variation. For example, it is expensive to create prototype integrated circuits, so getting 100 samples that adequately represent the range of variation induced by the fabrication process is cost prohibitive. If the circuit timing is important, one possibility is to create a few samples with slow and fast circuit properties. This enables testing at the edges of the expected variation at less overall cost.

Given a fixed number of samples, one should be cognizant of what can and cannot be learned through reliability testing.
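As a rough illustration of what a fixed number of samples can support, the sketch below computes the lower bound on reliability demonstrated when every tested unit survives. It assumes a zero-failure, pass/fail binomial model with independent units tested to the life of interest; the function name and the sample counts are hypothetical.

```python
def demonstrated_reliability(sample_size: int, confidence: float) -> float:
    """Lower confidence bound on reliability when all
    `sample_size` units pass (zero-failure binomial model)."""
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return (1.0 - confidence) ** (1.0 / sample_size)


# Example sample counts at 90% confidence.
for n in (3, 5, 10, 30):
    bound = demonstrated_reliability(n, confidence=0.90)
    print(f"{n:>2} units, zero failures: reliability of at least {bound:.3f}")
```

With only a handful of units, a clean test result supports a fairly modest reliability claim, so such tests are usually better suited to finding failure mechanisms than to proving a high reliability number.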

What Is the Existing Design Reliability?

A common goal is to make a new design as good as or better than the previous version. The ability to conduct comparison testing may simplify reliability testing and reduce the sample size required. If the current design is already reliable enough, that may reduce the need to test the new design as extensively. We can focus reliability testing on the new elements rather than on the overall product.

Testing is not the only way to determine when or how a product will fail, yet it is often the most effective approach. The best test is the one done by customers as they use the product. Since we would like to know the results before the customer discovers them, we do reliability testing. Sometimes it is possible to replicate customer conditions. Sometimes we use very controlled conditions that focus on one failure mechanism. In all cases the testing attempts to provide insight into how the design will perform over time.

There are many other considerations, yet the above list provides a good start. Thinking ahead is one of the factors that most affects whether a reliability test will be useful. Early in a program, before all the significant risks are understood, we should consider the possibility of reliability testing, which includes setting aside the needed budget, resources, and time for testing. Reliability is important, and measuring the ability of a design to meet the reliability goals becomes an essential element of feedback to the design team.

Bio:

Fred Schenkelberg is an experienced reliability engineering and management consultant with his firm FMS Reliability. His passion is working with teams to create cost-effective reliability programs that solve problems, create durable and reliable products, increase customer satisfaction, and reduce warranty costs. If you enjoyed this article, consider subscribing to the ongoing series Musings on Reliability and Maintenance Topics.

CERM ® RISK INSIGHTS
