Tuesday, April 24, 2007

Failure Mode and Effect Analysis (FMEA)

Failure Mode and Effects Analysis (FMEA) and Failure Modes, Effects and Criticality Analysis (FMECA) are methodologies designed to identify potential failure modes for a product or process, to assess the risk associated with those failure modes, to rank the issues in terms of importance and to identify and carry out corrective actions to address the most serious concerns.

Although the purpose, terminology and other details can vary according to type (e.g. Process FMEA, Design FMEA, etc.), the basic methodology is similar for all. This document presents a brief general overview of FMEA / FMECA analysis techniques and requirements.

FMEA / FMECA Overview
In general, Failure Modes, Effects and Criticality Analysis (FMEA / FMECA) requires the identification of the following basic information:

  • Item(s)
  • Function(s)
  • Failure(s)
  • Effect(s) of Failure
  • Cause(s) of Failure
  • Current Control(s)
  • Recommended Action(s)
  • Plus other relevant details
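As a rough illustration, the basic information listed above could be captured as one record per worksheet row. The Python sketch below is only an assumption about how such a record might look: the class name `FmeaRow`, its field names and the sample values are hypothetical and do not come from any published standard.

```python
from dataclasses import dataclass, field


@dataclass
class FmeaRow:
    """One worksheet row; field names are illustrative, not from a standard."""
    item: str
    function: str
    failure: str
    effects: list[str] = field(default_factory=list)
    causes: list[str] = field(default_factory=list)
    current_controls: list[str] = field(default_factory=list)
    recommended_actions: list[str] = field(default_factory=list)
    notes: str = ""  # "plus other relevant details"


# Hypothetical sample row, for illustration only.
row = FmeaRow(
    item="Mounting bolt",
    function="Clamp bracket to frame",
    failure="Cracking",
    effects=["Noise", "Degraded performance"],
    causes=["Improper torque applied"],
    current_controls=["Torque check at assembly"],
)
print(row.failure, "->", row.effects)
```

Keeping effects, causes and controls as lists reflects that one failure can have several of each; a real worksheet may need a different shape.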

Most analyses of this type also include some method to assess the risk associated with the issues identified during the analysis and to prioritize corrective actions. Two common methods include:

  • Risk Priority Numbers (RPNs)
  • Criticality Analysis (FMEA with Criticality Analysis = FMECA)

Published Standards and Guidelines
There are a number of published guidelines and standards for the requirements and recommended reporting format of failure mode and effects analyses. Some of the main published standards for this type of analysis include SAE J1739, AIAG FMEA-3 and MIL-STD-1629A. In addition, many industries and companies have developed their own procedures to meet the specific requirements of their products/processes. Figure 1 shows a sample Process FMEA in the Automotive Industry Action Group (AIAG) FMEA-3 format.

Figure 1: Process FMEA (PFMEA) in AIAG FMEA-3 format

Basic Analysis Procedure for FMEA or FMECA
The basic steps for performing a Failure Mode and Effects Analysis (FMEA) or Failure Modes, Effects and Criticality Analysis (FMECA) include:

  • Assemble the team.
  • Establish the ground rules.
  • Gather and review relevant information.
  • Identify the item(s) or process(es) to be analyzed.
  • Identify the function(s), failure(s), effect(s), cause(s) and control(s) for each item or process to be analyzed.
  • Evaluate the risk associated with the issues identified by the analysis.
  • Prioritize and assign corrective actions.
  • Perform corrective actions and re-evaluate risk.
  • Distribute, review and update the analysis, as appropriate.

The process for conducting an FMEA is straightforward. The basic steps are outlined below.

  1. Describe the product/process and its function. It is important to have a clearly articulated understanding of the product or process under consideration. This understanding simplifies the analysis by helping the engineer identify which product/process uses fall within the intended function and which fall outside it. It is important to consider both intentional and unintentional uses, since product failure often ends in litigation, which can be costly and time consuming.
  2. Create a Block Diagram of the product or process. This diagram shows major components or process steps as blocks connected by lines that indicate how the components or steps are related. It shows the logical relationships of components and establishes a structure around which the FMEA can be developed. Establish a Coding System to identify system elements. The block diagram should always be included with the FMEA form.
  3. Complete the header on the FMEA Form worksheet: Product/System, Subsys./Assy., Component, Design Lead, Prepared By, Date, Revision (letter or number), and Revision Date. Modify these headings as needed.

    [Figure: Failure Modes and Effects Analysis (FMEA) Example 2]
  4. Use the diagram prepared above to begin listing items or functions. If items are components, list them in a logical manner under their subsystem/assembly based on the block diagram.
  5. Identify Failure Modes. A failure mode is defined as the manner in which a component, subsystem, system, process, etc. could potentially fail to meet the design intent. Examples of potential failure modes include:
    • Corrosion
    • Hydrogen embrittlement
    • Electrical Short or Open
    • Torque Fatigue
    • Deformation
    • Cracking

  6. A failure mode in one component can serve as the cause of a failure mode in another component. Each failure should be listed in technical terms. Failure modes should be listed for the functions of each component or process step. At this point, the failure mode should be identified whether or not the failure is likely to occur. Looking at similar products or processes and the failures that have been documented for them is an excellent starting point.
  7. Describe the effects of those failure modes. For each failure mode identified, the engineer should determine what the ultimate effect will be. A failure effect is defined as the result of a failure mode on the function of the product/process as perceived by the customer. Effects should be described in terms of what the customer might see or experience should the identified failure mode occur. Keep in mind the internal as well as the external customer. Examples of failure effects include:
    • Injury to the user
    • Inoperability of the product or process
    • Improper appearance of the product or process
    • Odors
    • Degraded performance
    • Noise

  8. Establish a numerical ranking for the severity of the effect. A common industry standard scale uses 1 to represent no effect and 10 to indicate a very severe effect, with the failure affecting system operation and safety without warning. The intent of the ranking is to help the analyst determine whether a failure would be a minor nuisance or a catastrophic occurrence to the customer. This enables the engineer to prioritize the failures and address the most serious issues first.
  9. Identify the causes for each failure mode. A failure cause is defined as a design weakness that may result in a failure. The potential causes for each failure mode should be identified and documented. The causes should be listed in technical terms and not in terms of symptoms. Examples of potential causes include:
    • Improper torque applied
    • Improper operating conditions
    • Contamination
    • Erroneous algorithms
    • Improper alignment
    • Excessive loading
    • Excessive voltage

  10. Enter the Probability factor. A numerical weight should be assigned to each cause that indicates how likely that cause is (the probability of the cause occurring). A common industry standard scale uses 1 to represent not likely and 10 to indicate inevitable.
  11. Identify Current Controls (design or process). Current Controls (design or process) are the mechanisms that prevent the cause of the failure mode from occurring or that detect the failure before it reaches the Customer. The engineer should now identify testing, analysis, monitoring, and other techniques that can be or have been used on the same or similar products/processes to detect failures. Each of these controls should be assessed to determine how well it is expected to identify or detect failure modes. After a new product or process has been in use, previously undetected or unidentified failure modes may appear. The FMEA should then be updated and plans made to address those failures and eliminate them from the product/process.
  12. Determine the likelihood of Detection. Detection is an assessment of the likelihood that the Current Controls (design and process) will detect the Cause of the Failure Mode or the Failure Mode itself, thus preventing it from reaching the Customer. Based on the Current Controls, consider the likelihood of Detection using the Detection ranking table below for guidance.
  13. Review Risk Priority Numbers (RPN). The Risk Priority Number is the mathematical product of the numerical Severity, Probability, and Detection ratings:
    RPN = (Severity) x (Probability) x (Detection)
    The RPN is used to prioritize items that require additional quality planning or action.
  14. Determine Recommended Action(s) to address potential failures that have a high RPN. These actions could include specific inspection, testing or quality procedures; selection of different components or materials; de-rating; limiting environmental stresses or operating range; redesign of the item to avoid the failure mode; monitoring mechanisms; performing preventative maintenance; and inclusion of back-up systems or redundancy.
  15. Assign Responsibility and a Target Completion Date for these actions. This makes responsibility clear-cut and facilitates tracking.
  16. Indicate Actions Taken. After these actions have been taken, re-assess the severity, probability and detection and review the revised RPNs. Are any further actions required?
  17. Update the FMEA as the design or process changes, the assessment changes or new information becomes known.
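
The RPN review and re-assessment steps above can be sketched in a few lines of Python. This is a minimal sketch, not a prescribed method: the function name `rpn`, the `ACTION_THRESHOLD` cut-off of 100 and all ratings are assumptions made for illustration, since the text does not define what counts as a high RPN.

```python
def rpn(severity: int, probability: int, detection: int) -> int:
    """RPN = Severity x Probability x Detection, each rated 1 to 10."""
    for name, value in (("severity", severity),
                        ("probability", probability),
                        ("detection", detection)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * probability * detection


ACTION_THRESHOLD = 100  # assumed cut-off; the text sets no specific value

# Hypothetical ratings before the corrective action.
before = rpn(severity=7, probability=5, detection=6)

# Hypothetical re-assessment after adding an inspection step:
# detection improves, severity and probability stay the same.
after = rpn(severity=7, probability=5, detection=2)

print(before, after)               # 210 70
print(after >= ACTION_THRESHOLD)   # False: below the assumed cut-off
```

In this made-up case the action only improves detection, so the revised RPN drops while the severity of the effect is unchanged.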

Risk Evaluation Methods
A typical failure modes and effects analysis incorporates some method to evaluate the risk associated with the potential problems identified through the analysis. The two most common methods, Risk Priority Numbers and Criticality Analysis, are described next.

Risk Priority Numbers
To use the Risk Priority Number (RPN) method to assess risk, the analysis team must:

  • Rate the severity of each effect of failure.
  • Rate the likelihood of occurrence for each cause of failure.
  • Rate the likelihood of prior detection for each cause of failure (i.e. the likelihood of detecting the problem before it reaches the end user or customer).

Severity

| Effect | Criteria | Ranking |
| --- | --- | --- |
| Hazardous without warning | Very high severity ranking when a potential failure mode affects safe system operation without warning | 10 |
| Hazardous with warning | Very high severity ranking when a potential failure mode affects safe system operation with warning | 9 |
| Very High | System inoperable with destructive failure without compromising safety | 8 |
| High | System inoperable with equipment damage | 7 |
| Moderate | System inoperable with minor damage | 6 |
| Low | System inoperable without damage | 5 |
| Very Low | System operable with significant degradation of performance | 4 |
| Minor | System operable with some degradation of performance | 3 |
| Very Minor | System operable with minimal interference | 2 |
| None | No effect | 1 |

Probability

| Probability of Failure | Failure Probability | Ranking |
| --- | --- | --- |
| Very High: Failure is almost inevitable | >1 in 2 | 10 |
| | 1 in 3 | 9 |
| High: Repeated failures | 1 in 8 | 8 |
| | 1 in 20 | 7 |
| Moderate: Occasional failures | 1 in 80 | 6 |
| | 1 in 400 | 5 |
| | 1 in 2,000 | 4 |
| Low: Relatively few failures | 1 in 15,000 | 3 |
| | 1 in 150,000 | 2 |
| Remote: Failure is unlikely | <1 in … | 1 |

Detectability

| Detection | Likelihood of Detection by Design Control | Ranking |
| --- | --- | --- |
| Absolute Uncertainty | Design control cannot detect potential cause/mechanism and subsequent failure mode | 10 |
| Very Remote | Very remote chance the design control will detect potential cause/mechanism and subsequent failure mode | 9 |
| Remote | Remote chance the design control will detect potential cause/mechanism and subsequent failure mode | 8 |
| Very Low | Very low chance the design control will detect potential cause/mechanism and subsequent failure mode | 7 |
| Low | Low chance the design control will detect potential cause/mechanism and subsequent failure mode | 6 |
| Moderate | Moderate chance the design control will detect potential cause/mechanism and subsequent failure mode | 5 |
| Moderately High | Moderately high chance the design control will detect potential cause/mechanism and subsequent failure mode | 4 |
| High | High chance the design control will detect potential cause/mechanism and subsequent failure mode | 3 |
| Very High | Very high chance the design control will detect potential cause/mechanism and subsequent failure mode | 2 |
| Almost Certain | Design control will detect potential cause/mechanism and subsequent failure mode | 1 |

Source : http://www.isixsigma.com

  • Calculate the RPN by obtaining the product of the three ratings:

RPN = Severity x Occurrence x Detection

The RPN can then be used to compare issues within the analysis and to prioritize problems for corrective action. This risk assessment method is commonly associated with Failure Mode and Effects Analysis (FMEA).
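
To make the comparison concrete, the short Python sketch below ranks a few issues by RPN. The failure mode names are taken from the examples earlier in this post, but the ratings are invented for illustration and the tuple layout is an assumption, not part of any standard.

```python
# (failure mode, severity, occurrence, detection): hypothetical ratings
issues = [
    ("Corrosion", 6, 4, 5),
    ("Electrical short", 9, 2, 3),
    ("Cracking", 8, 3, 7),
]

# Compute RPN = Severity x Occurrence x Detection and sort, highest first.
ranked = sorted(
    ((name, s * o * d) for name, s, o, d in issues),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, value in ranked:
    print(f"{name}: RPN {value}")
# Cracking: RPN 168
# Corrosion: RPN 120
# Electrical short: RPN 54
```

With these made-up numbers, the failure mode with the highest severity ends up last, because the product weighs all three ratings equally.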

Criticality Analysis
The MIL-STD-1629A document describes two types of criticality analysis: quantitative and qualitative. To use the quantitative criticality analysis method, the analysis team must:

  • Define the reliability/unreliability for each item, at a given operating time.
  • Identify the portion of the item’s unreliability that can be attributed to each potential failure mode.
  • Rate the probability of loss (or severity) that will result from each failure mode that may occur.
  • Calculate the criticality for each potential failure mode by obtaining the product of the three factors:

Mode Criticality = Item Unreliability x Mode Ratio of Unreliability x Probability of Loss

  • Calculate the criticality for each item by obtaining the sum of the criticalities for each failure mode that has been identified for the item.

Item Criticality = SUM of Mode Criticalities
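
The two formulas above can be worked through numerically. The Python sketch below is only an illustration under stated assumptions: the function name `mode_criticality`, the item unreliability of 0.02 and the per-mode ratios and loss probabilities are all hypothetical values, not figures from MIL-STD-1629A.

```python
def mode_criticality(item_unreliability: float,
                     mode_ratio: float,
                     probability_of_loss: float) -> float:
    """Mode Criticality = Item Unreliability x Mode Ratio x Probability of Loss."""
    return item_unreliability * mode_ratio * probability_of_loss


# Hypothetical item: unreliability at the chosen operating time.
item_unreliability = 0.02

# Hypothetical failure modes: (mode ratio of unreliability, probability of loss).
# The mode ratios of one item add up to 1.0 here.
modes = {
    "Cracking": (0.25, 1.0),
    "Corrosion": (0.75, 0.1),
}

mode_crit = {
    name: mode_criticality(item_unreliability, ratio, loss)
    for name, (ratio, loss) in modes.items()
}

# Item Criticality = SUM of Mode Criticalities
item_crit = sum(mode_crit.values())

for name, value in mode_crit.items():
    print(f"{name}: {value:.4f}")
print(f"Item criticality: {item_crit:.4f}")
```

In this example, cracking accounts for a smaller share of the item's unreliability than corrosion but has the higher criticality, because its assumed probability of loss is ten times larger.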

To use the qualitative criticality analysis method to evaluate risk and prioritize corrective actions, the analysis team must:

  • Rate the severity of the potential effects of failure.
  • Rate the likelihood of occurrence for each potential failure mode.
  • Compare failure modes via a Criticality Matrix, which identifies severity on the horizontal axis and occurrence on the vertical axis.

These risk assessment methods are commonly associated with Failure Modes, Effects and Criticality Analysis (FMECA).
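
The Criticality Matrix used in the qualitative method can be pictured as a simple grid. The Python sketch below assumes numeric 1 to 4 scales for both severity and occurrence purely for illustration; the text does not prescribe a scale, and the failure mode ratings are hypothetical.

```python
from collections import defaultdict

# Hypothetical failure modes: (name, severity, occurrence),
# both rated on an assumed 1-4 scale.
modes = [
    ("Cracking", 4, 2),
    ("Corrosion", 2, 3),
    ("Electrical short", 4, 1),
    ("Deformation", 2, 3),
]

# Map each (occurrence, severity) cell to the modes that fall in it.
matrix = defaultdict(list)
for name, severity, occurrence in modes:
    matrix[(occurrence, severity)].append(name)

# Print with occurrence on the vertical axis (highest row first) and
# severity on the horizontal axis (increasing left to right).
for occurrence in range(4, 0, -1):
    cells = [",".join(matrix.get((occurrence, sev), [])) or "-"
             for sev in range(1, 5)]
    print(f"occ {occurrence}: " + " | ".join(cells))
```

Failure modes that land toward the high-severity, high-occurrence corner of the grid would be the first candidates for corrective action.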

Applications and Benefits
The Failure Modes, Effects and Criticality Analysis (FMEA / FMECA) procedure is a tool that has been adapted in many different ways for many different purposes. It can contribute to improved designs for products and processes, resulting in higher reliability, better quality, increased safety, enhanced customer satisfaction and reduced costs. The tool can also be used to establish and optimize maintenance plans for repairable systems and/or contribute to control plans and other quality assurance procedures. It provides a knowledge base of failure mode and corrective action information that can be used as a resource in future troubleshooting efforts and as a training tool for new engineers. In addition, an FMEA or FMECA is often required to comply with safety and quality requirements, such as ISO 9001, QS 9000, ISO/TS 16949, Six Sigma, FDA Good Manufacturing Practices (GMPs), Process Safety Management Act (PSM), etc.

Source : http://www.weibull.com/basics/fmea.htm
