Computer reliability has been near the top of the list of concerns for system developers since the dawn of the computer. Reliability has improved many times over through the years, making computing dependable enough for even the most critical applications. However, system developers still want to predict the expected failure rate of computing systems more accurately, so that they can design in appropriate operating margins that maximize reliability while keeping costs competitive. Logistics and systems engineers use failure rate predictions for a myriad of purposes, including reliability analysis, cost trade studies, availability analysis, spares planning, redundancy modeling, scheduled maintenance planning, product warranties, and guarantees. The traditional tools available for reliability prediction have not kept up with the steep technology curve of electronics. Members of VITA saw the deficiencies and banded together to address them, launching the VITA 51 working group.
One of the most popular tools for determining the reliability of electronic equipment is MIL-HDBK-217, “Reliability Prediction of Electronic Equipment.” In 1994, William Perry, U.S. Secretary of Defense, published his pivotal memorandum titled “Specifications & Standards – A New Way of Doing Business.” This memo, and the changes in military acquisition that followed, caused many military standards to be cancelled in favor of commercial standards and practices. A consequence of this memo was the DoD no longer updating MIL-HDBK-217, but looking to industry organizations to provide updated reliability prediction methods.
In 2008, Naval Surface Warfare Center (NSWC) Crane gathered an industry working group to develop a revision to MIL-HDBK-217, just about the same time that the ANSI/VITA 51.1 work was completed. The data gathered for VITA 51 was reused in the NSWC Crane effort. Several organizations participated in developing new guidelines for reliability prediction.
The Reliability Information Analysis Center (RIAC) gathers, stores, and manages the data brought in by industry contributors. RIAC is a DoD information analysis center sponsored by the Defense Technical Information Center. As an active member of the working group and an unbiased, third-party DoD agency, the RIAC serves as the repository for any data contributed in support of this effort. In this capacity, the RIAC can provide Nondisclosure Agreements (NDAs) to protect any data that a contributor wishes to keep proprietary. The RIAC ensures that all submitted data is properly sanitized, so that no proprietary information reaches the at-large MIL-HDBK-217 working group.
Another key contributor is the Aerospace Vehicle Systems Institute (AVSI), which addresses issues that impact the aerospace community through international cooperative research and collaboration among industry, government, and academia. AVSI reports findings from its current reliability initiatives to the -217 working group.
A draft, MIL-HDBK-217G, was completed in 2010 and released for public review, but it was quickly retracted pending internal discussions in the DoD about reliability policy. On March 21, 2011, the Under Secretary of Defense for Acquisition, Technology, and Logistics issued a memorandum, “Directive-Type Memorandum (DTM) 11-003 – Reliability Analysis, Planning, Tracking, and Reporting,” stating a desire to immediately enhance reliability in the acquisition process and to improve the efficiency of the Department of Defense acquisition system (see www.dtic.mil/whs/directives/corres/pdf/DTM-11-003.pdf). In October 2011, the DoD showed renewed interest in MIL-HDBK-217 revisions, especially in adding Physics of Failure (PoF) methods. Time is running out, though: the memorandum is scheduled to expire at the end of 2012.
Enter the VITA 51 standards
ANSI/VITA 51.0-2008 (R2012)
Reliability Prediction – This document is the base specification, building a framework for reliability predictions and setting the ground rules for subsidiary specifications (Figure 1). It provides a failure rate prediction standard for electronics, with background information on reliability prediction methodologies. The limitations of existing prediction practices are addressed by a series of subsidiary specifications that contain the industry’s “best practices” for performing electronics failure rate predictions. ANSI/VITA 51.0 and its subsidiary specifications were developed to make Mean Time Between Failure (MTBF) calculations consistent and repeatable. The specification requires a disclosure statement that documents the modeling assumptions, defaults, and exceptions to the methodologies used in establishing the prediction models. ANSI/VITA 51 also establishes a Community of Practice that can further develop the body of work. This specification was most recently revised in 2012.
ANSI/VITA 51.1-2008
Reliability Prediction MIL-HDBK-217 Subsidiary Specification – This specification provides standard defaults and methods to adjust the models in MIL-HDBK-217F Notice 2. This is not a revision or replacement of MIL-HDBK-217F Notice 2 but a standardization of the inputs to the MIL-HDBK-217F Notice 2 calculations to give more consistent results. Not all component families from MIL-HDBK-217 are covered within ANSI/VITA 51.1.
One of the most significant adjustments within ANSI/VITA 51.1 is the change to the quality factor piQ for commercial quality components. MIL-HDBK-217F Notice 2 originally set piQ to 10 for these components; ANSI/VITA 51.1 changes it to 1 for many of the components covered. Because piQ multiplies the predicted failure rate, the change lowers the prediction for the affected parts tenfold, all else being equal. The adjustment reflects the fact that commercial grade components have become much more reliable since the last revision of MIL-HDBK-217.
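To make the effect of the piQ change concrete, here is a minimal sketch in the style of the MIL-HDBK-217 parts count method, where the board failure rate is the sum of quantity times generic failure rate times quality factor. The part quantities and generic failure rates are hypothetical placeholders, not values from MIL-HDBK-217 or ANSI/VITA 51.1, and applying one piQ to every part is a simplification (the text notes that only many, not all, components are adjusted).

```python
# Parts-count style prediction: lambda_total = sum(N_i * lambda_g_i * pi_Q).
# All quantities and generic failure rates below are hypothetical placeholders.

def board_failure_rate(parts, pi_q):
    """Return failures per 10^6 hours for a list of (quantity, generic_rate) pairs."""
    return sum(qty * generic_rate * pi_q for qty, generic_rate in parts)


def mtbf_hours(failures_per_million_hours):
    """Convert failures per 10^6 hours to MTBF in hours (constant failure rate assumed)."""
    return 1e6 / failures_per_million_hours


# (quantity, generic failure rate in failures per 10^6 hours) -- illustrative only
PARTS = [(40, 0.002), (25, 0.004), (4, 0.05)]

rate_old = board_failure_rate(PARTS, pi_q=10.0)  # original commercial-grade factor
rate_new = board_failure_rate(PARTS, pi_q=1.0)   # adjusted factor described above

print(f"piQ=10: {rate_old:.2f} failures/1e6 h, MTBF {mtbf_hours(rate_old):,.0f} h")
print(f"piQ=1:  {rate_new:.2f} failures/1e6 h, MTBF {mtbf_hours(rate_new):,.0f} h")
```

Whatever the individual part values are, a uniform change of piQ from 10 to 1 cuts the predicted failure rate by a factor of 10 and raises the predicted MTBF by the same factor, which is why a standardized default matters so much for comparing predictions between suppliers.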
ANSI/VITA 51.1 references sources used for making the adjustments called out in the specification. It uses “engineering judgment” as a source for some factors plus field and test data for resistor and capacitor factors. All the source data was provided to NSWC Crane and was used in the MIL-HDBK-217 Rev G effort.
Because of its similarity to the Rev G draft, some electronics suppliers are using ANSI/VITA 51.1-2008. The specification is currently undergoing its five-year update within the VITA 51 working group.
ANSI/VITA 51.2-2011
Physics of Failure (PoF) Reliability Predictions – This specification provides standard processes, instructions, and default parameters for using the PoF approach for modeling the reliability of electronic products where the models are based on physical properties of the materials used in the product. It includes a discussion of the philosophy, context for use, definitions, models for key failure mechanisms, definition of the input data required, default values if technically feasible, or the typical range of values as a guideline. It defines how modeling results are interpreted and used. It requires the documentation of modeling inputs, assumptions made during the analysis, modifications to the models, and rationale for the analysis. ANSI/VITA 51.2 establishes uniform practices for board level, packaging, and component models as well as setting guidelines for PoF program planning.
In developing the specification, the standard’s editors leveraged current industry developments, drawing on research results and expertise from AVSI, AMSAA, CALCE, and DfR Solutions. Modeling for lead-free solder fatigue is included in the specification to help with predictions in this still-evolving area. Overall, ANSI/VITA 51.2 clarifies expectations for reliability prediction providers and customers. Work continues on this specification as the working group gathers more data.
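As a rough illustration of what a physics-based fatigue model looks like, the sketch below uses the widely cited Coffin-Manson form for thermal cycling, in which the acceleration between test and field conditions scales as a power of the ratio of temperature swings. This is an assumption chosen for illustration, not the model or the defaults defined in ANSI/VITA 51.2; the function names, exponent, and temperature swings are hypothetical.

```python
# Simplified Coffin-Manson acceleration factor for thermal-cycling fatigue.
# The exponent and temperature swings are hypothetical, not ANSI/VITA 51.2 defaults.

def coffin_manson_acceleration(delta_t_test, delta_t_field, exponent):
    """Return AF = (delta_t_test / delta_t_field) ** exponent."""
    if delta_t_test <= 0 or delta_t_field <= 0:
        raise ValueError("temperature swings must be positive")
    return (delta_t_test / delta_t_field) ** exponent


def equivalent_field_cycles(test_cycles, delta_t_test, delta_t_field, exponent):
    """Estimate how many field cycles a number of survived test cycles represents."""
    return test_cycles * coffin_manson_acceleration(
        delta_t_test, delta_t_field, exponent
    )


# Hypothetical case: 100 degC swing in test, 50 degC swing in the field, exponent 2.
af = coffin_manson_acceleration(delta_t_test=100.0, delta_t_field=50.0, exponent=2.0)
field_cycles = equivalent_field_cycles(1000, 100.0, 50.0, 2.0)
print(af, field_cycles)
```

The point of documenting inputs such as the exponent is visible even in this toy case: the result is very sensitive to that one material-dependent parameter, so two analysts with different undocumented defaults would reach very different life estimates.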
ANSI/VITA 51.3-2010
Qualification and Environmental Stress Screening in Support of Reliability Predictions – This standard provides rules, permissions, and observations to assure that cost-effective qualification and Environmental Stress Screening (ESS) support valid reliability predictions and enhance electronics reliability. It includes a discussion of the systems engineering relationships between qualification, Environmental Stress Screening, and reliability.
Qualification durability environment verification is conducted in such a way as to validate the underlying assumptions of the reliability analyses. Qualification testing for durability environments is not the same as “reliability development testing,” which is used to improve reliability by eliminating failure mechanisms identified during testing. Qualification testing as described in this specification is intended to verify that an item has sufficient durability to survive the specified lifetime in the specified environment. On the bathtub curve, this means verifying that the leading edge of the rise in failure rate (that is, the wearout period) will not occur during the specified life. Qualification doesn’t change the shape of the bathtub curve, which means it doesn’t change a product’s reliability. Instead, it establishes whether the failure rate can be assumed constant during the product’s life (Figure 2).
ESS is a screen used to eliminate “infant mortality” failures. By applying a controlled amount of stress, failures associated with manufacturing defects are forced to occur before the product is delivered. This means the remaining failures will more likely be dominated by mechanisms that occur randomly, and thus place the product in the “flat” part of the bathtub curve. This doesn’t improve reliability, in the sense that it doesn’t change the height of the flat portion of the bathtub curve, but it provides a better customer experience because infant mortality failures are eliminated prior to product delivery.
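The bathtub curve and the effect of ESS can be sketched numerically. The example below assumes a common textbook construction, the sum of three Weibull hazard rates (decreasing, constant, and increasing); that model, all parameter values, and the `screened_hours` argument are illustrative assumptions, not content of ANSI/VITA 51.3. ESS is modeled crudely as giving each delivered unit a head start along the infant-mortality term only.

```python
# Bathtub curve as a sum of three Weibull hazard rates (illustrative assumption).
# All shape/scale values are hypothetical and chosen only to produce the shape.

def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t, screened_hours=0.0):
    """Total hazard at field age t (hours) for a unit screened for screened_hours."""
    infant = weibull_hazard(t + screened_hours, shape=0.5, scale=2.0e6)  # decreasing
    random_part = weibull_hazard(t, shape=1.0, scale=1.0e5)              # constant
    wearout = weibull_hazard(t, shape=4.0, scale=1.5e5)                  # increasing
    return infant + random_part + wearout


for hours in (10, 1_000, 50_000, 150_000):
    unscreened = bathtub_hazard(hours)
    screened = bathtub_hazard(hours, screened_hours=500.0)
    print(f"t={hours:>7} h  unscreened={unscreened:.3e}  screened={screened:.3e}")
```

In this sketch the screened unit starts with a much lower early failure rate, while the height of the flat middle portion is essentially unchanged, which matches the distinction drawn above between removing infant mortality and improving reliability.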
ANSI/VITA 51.3 specifies the qualification and ESS considerations, grounded in fatigue and durability analysis, that are needed to deliberately design a product to operate in the intended portion of the bathtub curve.
The Reliability Community
The Reliability Community is a collaborative effort by VITA members to develop a series of standards and guidelines to establish reliability practices for the critical embedded computing industry (Figure 3). The community comprises representatives from electronics suppliers, system integrators, and the Department of Defense (DoD). These members have developed community of practice documents that define electronics failure rate prediction methodologies and standards.
A working group was formed to investigate and develop industry standards to address electronics failure rate prediction and assessment. Where applicable, these standards provide adjustment factors to existing standards. As new electronics technology is developed, new methods will be developed, documented, and added to future releases of these standards and subsidiary specifications. The purpose of the Reliability Community is to establish an ecosystem of interested parties that promotes and creates reliability practices.
The Reliability Community addresses the limitations of existing prediction practices, with a series of subsidiary specifications that contains the “best practices” within industry for performing electronics failure rate predictions. The Reliability Community recognizes there are many industry reliability methods, each with a custodian and acceptable practices to calculate electronics failure rate predictions. If such a method is identified as requiring additional standards for use by electronics module suppliers, a new subsidiary specification will be considered by the Reliability Community working group.
Join the Reliability Community on LinkedIn at http://opsy.st/TAiY9z.