NASA/SP-2011-3422
Version 1.0
November 2011

NASA Risk Management Handbook

National Aeronautics and Space Administration
NASA Headquarters
Washington, D.C. 20546

To request print or electronic copies or provide comments, contact the Office of Safety and Mission Assurance. Electronic copies are also available from NASA Center for AeroSpace Information, 7115 Standard Drive, Hanover, MD 21076-1320, at http://ntrs.nasa.gov

NASA STI Program ... in Profile

Since its founding, NASA has been dedicated to the advancement of aeronautics and space science. The NASA scientific and technical information (STI) program plays a key part in helping NASA maintain this important role.

The NASA STI program operates under the auspices of the Agency Chief Information Officer. It collects, organizes, provides for archiving, and disseminates NASA's STI. The NASA STI program provides access to the NASA Aeronautics and Space Database and its public interface, the NASA Technical Report Server, thus providing one of the largest collections of aeronautical and space science STI in the world. Results are published in both non-NASA channels and by NASA in the NASA STI Report Series, which includes the following report types:

TECHNICAL PUBLICATION. Reports of completed research or a major significant phase of research that present the results of NASA Programs and include extensive data or theoretical analysis. Includes compilations of significant scientific and technical data and information deemed to be of continuing reference value. NASA counterpart of peer-reviewed formal professional papers but has less stringent limitations on manuscript length and extent of graphic presentations.

TECHNICAL MEMORANDUM. Scientific and technical findings that are preliminary or of specialized interest, e.g., quick release reports, working papers, and bibliographies that contain minimal annotation. Does not contain extensive analysis.

CONTRACTOR REPORT. Scientific and technical findings by NASA-sponsored contractors and grantees.

CONFERENCE PUBLICATION. Collected papers from scientific and technical conferences, symposia, seminars, or other meetings sponsored or co-sponsored by NASA.

SPECIAL PUBLICATION. Scientific, technical, or historical information from NASA programs, projects, and missions, often concerned with subjects having substantial public interest.

TECHNICAL TRANSLATION. English-language translations of foreign scientific and technical material pertinent to NASA's mission.

Specialized services also include creating custom thesauri, building customized databases, and organizing and publishing research results.

For more information about the NASA STI program, see the following:
Access the NASA STI program home page at http://www.sti.nasa.gov
E-mail your question via the Internet to help@sti.nasa.gov
Fax your question to the NASA STI Help Desk at 443-757-5803
Phone the NASA STI Help Desk at 443-757-5802
Write to: NASA STI Help Desk, NASA Center for AeroSpace Information, 7115 Standard Drive, Hanover, MD 21076-1320

ACKNOWLEDGMENTS

The project manager and the authors express their gratitude to NASA Office of Safety and Mission Assurance (OSMA) management (Mr. Bryan O'Connor, former Chief of OSMA; Mr. Terrence Wilcutt, Chief of OSMA; Mr. Wilson Harkins, Deputy Chief of OSMA; and Mr. Thomas Whitmeyer, Director of Mission Support Division) for their support and encouragement in developing this document. The development effort leading to this document was conducted in stages, and was supported by the individuals listed alphabetically below, who each brought unique experience and insights to the development. Affiliations are as of the time of contribution to the development effort.

AUTHORS:
Dr. Homayoon Dezfuli (Project Manager), NASA System Safety Technical Fellow, NASA Headquarters
Dr. Allan Benjamin, Information Systems Laboratories
Mr. Christopher Everett, Information Systems Laboratories
Mr. Gaspare Maggio, Information Systems Laboratories
Dr. Michael Stamatelatos, Director of Safety and Assurance Requirements Division, NASA Headquarters
Dr. Robert Youngblood, Idaho National Laboratory

CONTRIBUTING AUTHORS:
Dr. Sergio Guarro, The Aerospace Corporation
Dr. Peter Rutledge, Quality Assurance & Risk Management Services
Mr. James Sherrard, Information Systems Laboratories
Dr. Curtis Smith, Idaho National Laboratory
Dr. Rodney Williams, Information Systems Laboratories

REVIEWERS:
This development benefited from review comments provided on the initial draft by many individuals. The authors wish to specifically thank the following individuals:
Dr. Robert Abelson, NASA Jet Propulsion Laboratory
Dr. Timothy Barth, NASA Kennedy Space Center
Mr. John Chiorini, Center for Systems Management
Mr. Chester Everline, Jet Propulsion Laboratory
Mr. Louis Fussell, Futron Corporation
Dr. Frank Groen, NASA Headquarters
Mr. David Lengyel, NASA Headquarters
Dr. Robert Mulvihill, Quality Assurance & Risk Management Services
Ms. Sylvia Plants, Science Applications International Corporation
Mr. William Powell, NASA Marshall Space Flight Center
Mr. David Pye, Perot Systems
Dr. James Rose, Jet Propulsion Laboratory
Dr. Fayssal Safie, NASA Marshall Space Flight Center
Dr. Nathan Siu, U. S. Nuclear Regulatory Commission
Dr. Clayton Smith, Applied Physics Laboratory
Ms. Sharon Thomas, NASA Johnson Space Center
Ms. Ellen Stigberg, NASA Headquarters
Dr. William Vesely, NASA Headquarters
Mr. Tracy Wrigley, Bastion Technologies, Inc.
Dr. Thomas Zang, NASA Langley Research Center

TABLE OF CONTENTS
List of Figures
List of Tables
Preface

1 INTRODUCTION
  1.1 Purpose
  1.2 Scope and Depth
  1.3 Background
  1.4 Applicability of Risk Management
    1.4.1 When is RIDM Invoked?
    1.4.2 When is CRM Applied?
  1.5 Overview of the RIDM Process
    1.5.1 Part 1, Identification of Alternatives
    1.5.2 Part 2, Risk Analysis of Alternatives
    1.5.3 Part 3, Risk-Informed Alternative Selection
    1.5.4 Avoiding Decision Traps
  1.6 Overview of the CRM Process
    1.6.1 Step 1, Identify
    1.6.2 Step 2, Analyze
    1.6.3 Step 3, Plan
    1.6.4 Step 4, Track
    1.6.5 Step 5, Control
    1.6.6 Communicate and Document

2 RIDM PROCESS INTERFACES
  2.1 Negotiating Objectives across Organizational Unit Boundaries
  2.2 Preparing a Preliminary Risk Management Plan
  2.3 Coordination of RIDM and CRM
    2.3.1 Initializing the CRM Risks Using the Risk Analysis of the Selected Alternative
    2.3.2 Rebaselining of Performance Requirements
  2.4 Maintaining the RIDM Process

3 THE RIDM PROCESS
  3.1 Part 1 - Identification of Alternatives
    3.1.1 Step 1 - Understand Stakeholder Expectations and Derive Performance Measures
      3.1.1.1 Understand Stakeholder Expectations
      3.1.1.2 Derive Performance Measures
    3.1.2 Step 2 - Compile Feasible Alternatives
      3.1.2.1 Compiling an Initial Set of Alternatives
      3.1.2.2 Structuring Possible Alternatives (e.g., Trade Trees)
  3.2 Part 2 - Risk Analysis of Alternatives
    3.2.1 Step 3 - Set the Framework and Choose the Analysis Methodologies
      3.2.1.1 Structuring the Analysis Process
      3.2.1.2 Configuration Control
      3.2.1.3 Implementing Various Levels of Model Rigor in Selecting Risk Analysis Methods
      3.2.1.4 Implementing a Graded Approach in Quantifying Individual Scenarios
      3.2.1.5 Use of Existing Analyses
    3.2.2 Step 4 - Conduct the Risk Analysis and Document the Results
      3.2.2.1 Probabilistic Modeling of Performance
      3.2.2.2 Use of Qualitative Information in RIDM
      3.2.2.3 Risk Analysis Support of Robust Decision Making
      3.2.2.4 Sequential Analysis and Downselection
      3.2.2.5 Model Uncertainty and Sensitivity Studies
      3.2.2.6 Analysis Outputs
      3.2.2.7 Assessing the Credibility of the Risk Analysis Results
      3.2.2.8 The Technical Basis for Deliberation
  3.3 Part 3 - Risk-Informed Alternative Selection
    3.3.1 Step 5 - Develop Risk-Normalized Performance Commitments
      3.3.1.1 Establishing Risk Tolerances on the Performance Measures
      3.3.1.2 Ordering the Performance Measures
    3.3.2 Step 6 - Deliberate, Select an Alternative, and Document the Decision Rationale
      3.3.2.1 Convening a Deliberation Forum
      3.3.2.2 Identify Contending Alternatives
      3.3.2.3 Additional Uncertainty Considerations
      3.3.2.4 Other Considerations
      3.3.2.5 Deliberation Is Iterative
      3.3.2.6 Communicating the Contending Alternatives to the Decision Maker
      3.3.2.7 Alternative Selection Is Iterative
      3.3.2.8 Selecting a Decision Alternative
      3.3.2.9 Documenting the Decision Rationale

4 THE CRM PROCESS
  4.1 Initializing the CRM Process
    4.1.1 Development of Risk Management Plan
    4.1.2 Inputs to CRM
      4.1.2.1 Inputs from the RIDM Process
      4.1.2.2 Inputs from Systems Engineering
    4.1.3 Risk Tolerance Targets at Projected Milestones
      4.1.3.1 Alternate Performance Margin Targets at Projected Milestones
    4.1.4 Developing Initial Risk Taxonomies
  4.2 The CRM Identify Step
    4.2.1 The Structure of an Individual Risk
      4.2.1.1 The Risk Statement
      4.2.1.2 Validating an Individual Risk
      4.2.1.3 Taxonomic Categorization of Individual Risks
      4.2.1.4 The Narrative Description
    4.2.2 Sources of Risk Identification
    4.2.3 Risk Advocacy and Ownership
  4.3 Analyze Step
    4.3.1 Introduction to Graded Analysis and the Use of Risk Scenario Diagrams in CRM
    4.3.2 Quick Look Analyze Step
      4.3.2.1 Likelihood and Severity Ranking
      4.3.2.2 Uncertainty Ranking
      4.3.2.3 Timeframe Ranking
      4.3.2.4 Near-Term (Tactical) Criticality Ranking
      4.3.2.5 Long-Term (Strategic) Criticality Ranking
      4.3.2.6 Relationship Between Criticality Rankings and the Risk Matrix
    4.3.3 Graded Approach Analyze Step
      4.3.3.1 Developing Risk Scenario Diagrams (RSDs)
      4.3.3.2 Updating the Performance Risk Models and Calculating Performance Risk
      4.3.3.3 Determining the Risk Drivers
  4.4 The CRM Plan Step
    4.4.1 Generating Risk Response Alternatives
      4.4.1.1 Generating Risk Response Options
      4.4.1.2 Combining Risk Response Options to produce a set of Candidate Risk Response Alternatives
    4.4.2 Risk Analysis of Risk Response Alternatives
      4.4.2.1 Integrating the Candidate Risk Response Alternatives into the Risk Analysis
      4.4.2.2 Conducting the Risk Analysis and Documenting the Results
    4.4.3 Deliberation and Selection of a Risk Response
  4.5 The CRM Track Step
  4.6 The CRM Control Step
  4.7 Communicate and Document
    4.7.1 Communication within CRM
    4.7.2 Documentation within CRM
  4.8 Applicability of Project-Centered CRM Processes to Other Risk Domains
    4.8.1 Institutional Risks
    4.8.2 Enterprise Risks
    4.8.3 Agency-Wide Strategic Risks

5 REFERENCES

APPENDIX A: Acronyms and Abbreviations
APPENDIX B: Definitions
APPENDIX C: Content Guide for the Technical Basis for Deliberation
APPENDIX D: Content Guide for the Risk-Informed Selection Report
APPENDIX E: Selected NASA Examples of RIDM Process Elements
APPENDIX F: Practical Aspects of the Risk Management Plan
APPENDIX G: Hypothetical Individual Risks Used for the Planetary Science Example in the CRM Development

LIST OF FIGURES

Figure 1. Systems Engineering Engine
Figure 2. Risk Management as the Interaction of Risk-Informed Decision Making and Continuous Risk Management
Figure 3. Flowdown of Performance Requirements (Illustrative)
Figure 4. The RIDM Process
Figure 5. Functional Roles and Information Flow in RIDM (Notional)
Figure 6. Uncertainty of Forecasted Outcomes Due to Uncertainty of Analyzed Conditions
Figure 7. The CRM Process
Figure 8. Integration of Individual Risks to Produce Performance Risks
Figure 9. Coordination of RIDM and CRM within the NASA Hierarchy (Illustrative)
Figure 10. RIDM Input to CRM Initialization
Figure 11. Rebaselining of Performance Requirements
Figure 12. Scope of Potentially Affected Organizations Given Rebaselining
Figure 13. RIDM Process Steps
Figure 14. RIDM Process Flowchart: Part 1, Identification of Alternatives
Figure 15. Notional Objectives Hierarchy
Figure 16. Fundamental vs. Means Objectives
Figure 17. Types of Performance Measures
Figure 18. The Relationship between Performance Objectives and Performance Measures
Figure 19. Example Launch Vehicle Trade Tree from ESAS
Figure 20. RIDM Process Part 2, Risk Analysis of Alternatives
Figure 21. Risk Analysis Framework (Alternative Specific)
Figure 22. Analysis Methodology Guidance Chart
Figure 23. Risk Analysis Using a Monte Carlo Sampling Procedure
Figure 24. Uncertain Performance Parameters Leading to Performance Measure Histograms
Figure 25. Robustness and Uncertainty
Figure 26. Downselection of Alternatives
Figure 27. Conceptualization of the Formulation of Modeling Uncertainty
Figure 28. Notional Depiction of Decision Sensitivity to Input Parameters
Figure 29. Analysis Level Matrix
Figure 30. Notional Imposed Constraints Risk Matrix
Figure 31. Notional Band Aid Chart for Performance Measure X
Figure 32. Comparison of Uncertainty Distributions
Figure 33. RIDM Process Part 3, Risk-Informed Alternative Selection
Figure 34. Establishing Performance Commitments
Figure 35. Performance Commitments and Risk Tolerances for Three Alternatives
Figure 36. An Example Uncertainty Consideration: The Potential for High Performance
Figure 37. Notional Performance Commitment Chart
Figure 38. Notional Risk List for Alternative X
Figure 39. The CRM Process
Figure 40. CRM Process Flow Diagram
Figure 41. Decreasing Uncertainty and Risk over Time
Figure 42. Notional Risk Burn-Down Schedules for Several Performance Requirements
Figure 43. Notional Margin Burn-Down (Risk Relaxation) Schedules for Several Performance Margins
Figure 44. The CRM Identify Step
Figure 45. Generating and Validating an Individual Risk
Figure 46. Example Condition/Departure Taxonomy
Figure 47. Example Asset Taxonomy
Figure 48. Risk Statement Structure and Taxonomies
Figure 49. Quick-Look Analyze Step
Figure 50. Graded Approach Analyze Step
Figure 51. Simple Risk Scenario Diagram
Figure 52. Moderately Detailed Risk Scenario Diagram
Figure 53. Example Use of Criticality Analysis to Justify the Placement of Individual Risks on a Risk Matrix
Figure 54. Format for an RSD Showing the Effects of Each Pathway on the Organizational Unit's Performance Requirements
Figure 55. Expanded RSD for the Individual Risk Associated with Planet X Atmospheric Uncertainties from the Perspective of the RCS Organizational Unit
Figure 56. Interfacing of Risk Scenario Diagrams for Different Organizational Units
Figure 57. The CRM Plan Step (Tactical Response)
Figure 58. The CRM Plan Step (Strategic Response)
Figure 59. CRM Plan Tasks
Figure 60. Relationship between Risk Response Options and Risk Response Alternatives
Figure 61. Departure Prevention and Consequence Reduction
Figure 62. Notional Risk Response Matrix
Figure 63. Notional Performance Risk Chart
Figure 64. Notional Band-Aid Chart for Performance Measure X
Figure 65. Performance Risks and Risk Tolerances for the Contending Risk Response Alternatives
Figure 66. The CRM Track Step
Figure 67. Performance Risk Tracking Chart
Figure 68. The CRM Control Step
Figure E-1. Crew Transport to and from ISS DRM
Figure E-2. Lunar Sortie Crew with Cargo DRM
Figure E-3. ESAS FOMs
Figure E-4. Possible Range of ESAS Launch Trade Study
Figure E-5. Launch Order Analysis Trade Space
Figure E-6. Robotic Servicing Decision Tree
Figure E-7. Option Tree Analysis
Figure E-8. ESAS Entry, Descent, and Landing Event Tree
Figure E-9. CLV LEO Launch Systems LOM
Figure E-10. CLV LEO Launch Systems LOC
Figure E-11. Launch Order Downselection and Rationale
Figure E-12. Launch Decision Relative to Ares I and Ares V Stack Costs
Figure E-13. Shuttle-Derived CLV FOM Assessment Summary
Figure E-14. Expected Value versus Life Cycle Cost

LIST OF TABLES

Table 1. A Constructed Scale for Stakeholder Support
Table 2. Performance Measures Examples for Planetary Spacecraft and Launch Vehicles
Table 3. Key Aspects of Credibility Assessment Levels
Table 4. Examples of Generic Factors for Uncertainty Ranking
Table 5. Condensing the Tactical Criticality Ranking
Table E-1. Launch Order Risk Analysis FOMs
Table E-2. Alternatives Brainstorming

Preface

In some form, risk management (RM) has always been an integral part of virtually every challenging human endeavor. A formal and, at that time, qualitative RM process known as Continuous Risk Management (CRM) was introduced to NASA in the latter half of the 1990s. More rigorous quantitative RM processes including Risk-Informed Decision Making (RIDM) and an enhanced version of CRM have only recently been developed for implementation as an integral part of systems engineering at NASA.
While there will probably always be vigorous debate over the details of what comprises the best approach to managing risk, few will disagree that effective risk management is critical to program and project success and affordability.

Since their introduction and until recently, NASA RM processes have been based on CRM, which stresses the management of risk during the Implementation phase of the NASA Program/Project Life Cycle. In December of 2008, NASA issued NPR 8000.4A [1], which introduced RIDM as a complementary process to CRM that is concerned with analysis of important and/or direction-setting decisions. In the past, RM was considered equivalent to CRM; now, RM is defined as comprising both CRM and RIDM.

In April 2010, NASA issued NASA/SP-2010-576, the NASA Risk-Informed Decision Making Handbook. This handbook introduced RIDM as the front-end of the RM process, described the details of how RIDM is conducted, and ended with a description of how the results of RIDM transition to and set the stage for CRM, the final portion of the RM process. The RIDM Handbook did not proceed to describe CRM, as the development of an enhanced version of CRM was still a work in progress in 2010. Now this handbook addresses the entirety of the NASA RM process, including both RIDM and CRM.

Beginning with and facilitated by RIDM, decisions made during the course of a program ultimately burn in the risk that must be managed during the life cycle of the program (primarily during the development portion of the life cycle) using CRM processes to ensure progress towards the program's goal. RIDM helps to ensure that decisions between alternatives are made with an awareness of the risks associated with each, thereby helping to prevent late design changes, which can be key drivers of risk, cost overruns, schedule delays, and cancellation. Most project cost-saving opportunities occur in the definition, planning, and early design phases of a project [2].
After being initialized by the results of RIDM, CRM is used to manage the aggregate risk that threatens the achievement of performance requirements. It does so based on a given set of performance requirements and decision maker risk tolerance levels, analyzing identified risk scenarios with possible mitigations and with follow-up monitoring and communications; by maintaining current and modifying as needed the RIDM risk models; by identifying new risks as they arise and including in the models those risks that were not considered discriminators in RIDM; by documenting individual risks in the form of risk statements with accompanying descriptive narratives for complete understanding; by analyzing departure events; by estimating aggregate risk and the criticality of individual risks; by developing risk scenarios leading to the analysis of pivotal events and the identification of risk drivers; by developing aggregate risk models for performance requirements; by identifying risk mitigation options and new risks that may arise from their implementation; by tracking and controlling the effectiveness of mitigations; and finally, by communicating and documenting all risk information necessary to an effective RM process. The RIDM process described in this document attempts to respond to some of the primary issues that have derailed programs in the past, namely: 1) the mismatch between stakeholder expectations and the true resources required to address the risks to achieve those expectations; 2) the miscomprehension of the risk that a decision maker is accepting when making commitments to stakeholders; and 3) the miscommunication in considering the respective risks associated with competing alternatives. The CRM process described herein is an enhanced version of NASA's traditional CRM paradigm.
While it maintains the traditional core elements of CRM as we have known them in the past, it builds upon the solid foundation of quantitative parameters and data made possible by the RIDM front-end to the RM process. This approach fundamentally changes the focus from qualitative assessments to quantitative analyses, from the management of individual risks to the management of aggregate risk, and from eliminating or reducing the impact of single unwanted events to the management of risk drivers. This quantification allows managers to discover the drivers of the total risk and find the interactions and dependencies among their causes, mitigations, and impacts across all parts of the program's organization. In addition, quantification supports the optimization of constrained resources, leading to greater affordability. Armed with a set of performance requirements and knowledge of the decision maker's risk tolerance, CRM is used to manage the individual risks that collectively contribute to the aggregate risk of not meeting program/project performance requirements and goals. This handbook is primarily written for systems engineers, risk managers, and risk analysts assigned to apply the requirements of NPR 8000.4A. However, program managers of NASA programs and projects can also get a sense of the value added of the process by reading the RIDM and CRM overview sections. These sections are designed to provide concise descriptions of RIDM and CRM and to highlight key areas of the processes. The RM methodology introduced by this handbook is part of a systems engineering process which emphasizes the proper use of risk analysis in its broadest sense to make risk-informed decisions that impact the mission execution domains of safety, technical, cost, and schedule. 
In future versions of this handbook, the RM principles discussed here will be updated in an evolutionary manner and expanded to address operational procedures, procurement, strategic planning, and institutional RM as experience is gained in the field. Additionally, technical appendices will be developed and added to provide tools and templates for implementation of the RM process. This handbook has been informed by many other guidance efforts underway at NASA, including the NASA Systems Engineering Handbook (NASA/SP-2007-6105 Rev. 1), the 2008 NASA Cost Estimating Handbook (NASA CEH 2008), and the NASA Standard for Models and Simulation (NASA-STD-7009), to name a few. How these documents relate to and interact with the RM Handbook is discussed in subsequent chapters. With this in mind, this handbook could be seen as a complement to those efforts in order to help ensure programmatic success and affordability. In fact, the RM methodology has been formulated to complement, but not duplicate, the guidance in those documents. Taken together, the overall guidance is meant to maximize program/project success and affordability by providing: 1) systematic and well thought out processes for conducting the discipline processes as well as integrating them into a formal risk analysis framework and communicating those results to a decision maker so that he or she can make the best informed decisions possible, and 2) a systematic and rigorous process that manages individual and aggregate risks in order to meet program/project performance requirements and goals within levels of risk considered tolerable to the involved decision maker. Although formal decision analysis methods are now highly developed for unitary decision-makers, it is still a significant challenge to apply these methods in a practical way within a complex organizational hierarchy having its own highly developed program management policies and practices.
This handbook is a step towards meeting that challenge for NASA but is certainly not the final step in realizing the proper balance between formalism and practicality. Therefore, efforts will continue to ensure that the methods in this document are properly integrated and updated as necessary, to provide value to the program and project management processes at NASA. While the RM process described in this handbook currently focuses mainly on risks to the achievement of numerical mission performance requirements associated with the safety, technical, cost, and schedule mission execution domains, future revisions of this document will address institutional, enterprise, and Agency-wide strategic risks. In the meantime, it should be noted that the techniques presented here may well be applicable to these latter types of risks, even now. Finally, it is important to point out that this handbook is not a prescription for how to do risk management. Rather, it is guidance on how RM can be done in an integrated framework that flows through a logical and carefully thought-out sequence of related activities that can and should always be tailored to the situation at hand.

Homayoon Dezfuli, Ph.D.
Project Manager, NASA Headquarters
November 2011

1 INTRODUCTION

1.1 Purpose

The purpose of this handbook is to provide guidance for implementing the Risk Management (RM) requirements of NASA Procedural Requirements (NPR) document NPR 8000.4A, Agency Risk Management Procedural Requirements [1], with a specific focus on programs and projects, and applying to each level of the NASA organizational hierarchy as requirements flow down. This handbook supports RM application within the NASA systems engineering process, and is a complement to the guidance contained in NASA/SP-2007-6105, NASA Systems Engineering Handbook [2].
Specifically, this handbook provides guidance that is applicable to the common technical processes of Technical Risk Management and Decision Analysis established by NPR 7123.1A, NASA Systems Engineering Process and Requirements [3]. These processes are part of the Systems Engineering Engine (Figure 1) that is used to drive the development of the system and associated work products to satisfy stakeholder expectations in all mission execution domains, including safety, technical, cost, and schedule. Like NPR 7123.1A, NPR 8000.4A is a discipline-oriented NPR that intersects with product-oriented NPRs such as NPR 7120.5D, NASA Space Flight Program and Project Management Requirements [4]; NPR 7120.7, NASA Information Technology and Institutional Infrastructure Program and Project Management Requirements [5]; and NPR 7120.8, NASA Research and Technology Program and Project Management Requirements [6]. In much the same way that the NASA Systems Engineering Handbook is intended to provide guidance on the implementation of NPR 7123.1A, this handbook is intended to provide guidance on the implementation of NPR 8000.4A.

1.2 Scope and Depth

This handbook provides guidance for conducting RM in the context of NASA program and project life cycles, which produce derived requirements in accordance with existing systems engineering practices that flow down through the NASA organizational hierarchy. The guidance in this handbook is not meant to be prescriptive. Instead, it is meant to be general enough, and contain a sufficient diversity of examples, to enable the reader to adapt the methods as needed to the particular risk management issues that he or she faces. The handbook highlights major issues to consider when managing programs and projects in the presence of potentially significant uncertainty, so that the user is better able to recognize and avoid pitfalls that might otherwise be experienced.

Figure 1. Systems Engineering Engine

Examples are provided throughout the handbook to illustrate the application of RM methods to specific issues of the type that are routinely encountered in NASA programs and projects. An example notional planetary mission is postulated and used throughout the document as a basis for illustrating the execution of the various process steps that constitute risk management in a NASA context (yellow boxes). In addition, key terms and concepts are defined throughout the document (blue boxes).

Where applicable, guidance is also given on the spectrum of techniques that are appropriate to use, given the spectrum of circumstances under which risks are managed, ranging from narrow-scope risk management at the hardware component level that must be accomplished using a minimum of time and resources, to broad-scope risk management involving multiple organizations upon which significant resources may be brought to bear. The fact that new techniques are discussed is not intended to automatically imply that a whole new set of analyses is needed. Rather, the risk analyses should take maximum advantage of existing activities, while also influencing them as needed in order to produce results that address objectives, at an appropriate level of rigor to support robust decision making. In all cases, the goal is to apply a level of effort to the task of risk management that provides assurance that objectives are met.

1.3 Background

NPR 8000.4A provides the requirements for risk management for the Agency, its institutions, and its programs and projects as required by NASA Policy Directive (NPD) 1000.5, Policy for NASA Acquisition [7]; NPD 7120.4C, Program/Project Management [8]; and NPD 8700.1, NASA Policy for Safety and Mission Success [9]. As discussed in NPR 8000.4A, risk is the potential for performance shortfalls, which may be realized in the future, with respect to achieving explicitly established and stated performance requirements.
The performance shortfalls may be related to institutional support for mission execution (see footnote 2) or related to any one or more of the following mission execution domains:

- Safety
- Technical
- Cost
- Schedule

Footnote 2: For the purposes of this version of the handbook, performance shortfalls related to institutional support for mission execution are subsumed under the affected mission execution domains of the program or project under consideration. More explicit consideration of institutional risks will be provided in future versions of this handbook.

In order to foster proactive risk management, NPR 8000.4A integrates two complementary processes, Risk-Informed Decision Making (RIDM) and Continuous Risk Management (CRM), into a single coherent framework. The RIDM process addresses the risk-informed selection of decision alternatives to assure effective approaches to achieving objectives, and the CRM process addresses implementation of the selected alternative to assure that requirements are met. These two aspects work together to assure effective risk management as NASA programs and projects are conceived, developed, and executed. Figure 2 illustrates the concept.

Figure 2. Risk Management as the Interaction of Risk-Informed Decision Making and Continuous Risk Management (diagram: RM as RIDM + CRM)

Within the NASA organizational hierarchy, high-level objectives, in the form of NASA Strategic Goals, flow down in the form of progressively more detailed performance requirements, whose satisfaction assures that the objectives are met. Each organizational unit within NASA negotiates with the unit(s) at the next lower level in the organizational hierarchy a set of objectives, deliverables, performance measures, performance requirements, resources, and schedules that defines the tasks to be performed by the unit(s).
Once established, the lower level organizational unit employs CRM to manage its own risks against these specifications, and, as appropriate, reports risks and elevates decisions for managing risks to the next higher level based on predetermined risk thresholds that have been negotiated between the two units. Figure 3 depicts this concept. Invoking the RIDM process in support of key decisions as requirements flow down through the organizational hierarchy assures that objectives remain tied to NASA Strategic Goals while also capturing why a particular path for satisfying those requirements was chosen. Managing risk using the CRM process assures that risk management decisions are informed by their impact on objectives at every level of the NASA hierarchy.

As applied to CRM, risk is characterized as a set of triplets:

- The scenario(s) leading to degraded performance with respect to one or more performance measures (e.g., scenarios leading to injury, fatality, destruction of key assets; scenarios leading to exceedance of mass limits; scenarios leading to cost overruns; scenarios leading to schedule slippage).
- The likelihood(s) (qualitative or quantitative) of those scenarios.
- The consequence(s) (qualitative or quantitative severity of the performance degradation) that would result if those scenarios were to occur.

Uncertainties are included in the evaluation of likelihoods and consequences.

1.4 Applicability of Risk Management

The RM approach presented in this handbook is applicable to processes conducted within a systems engineering framework, involving the definition of top-level objectives, the flowdown of top-level objectives in the form of derived performance requirements, decision making about the best way to meet requirements, and implementation of decisions in order to achieve the requirements and, consequently, the objectives.
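The triplet characterization of risk from Section 1.3, together with the idea of elevating risks against a negotiated threshold, can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `RiskTriplet`, the functions `expected_shortfall` and `needs_elevation`, the numbers, and the use of a probability-weighted sum as the aggregate are all hypothetical and are not prescribed by the handbook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskTriplet:
    """One (scenario, likelihood, consequence) triplet."""
    scenario: str
    likelihood: float   # assumed here to be a probability in [0, 1]
    consequence: float  # assumed here to be a cost overrun in $M


def expected_shortfall(triplets):
    """Probability-weighted sum of consequences (an illustrative aggregate only)."""
    return sum(t.likelihood * t.consequence for t in triplets)


def needs_elevation(triplets, threshold):
    """True when the aggregate exceeds the negotiated threshold (hypothetical rule)."""
    return expected_shortfall(triplets) > threshold


# Hypothetical cost-domain risks for a notional project.
risks = [
    RiskTriplet("Supplier delivers avionics late", likelihood=0.2, consequence=10.0),
    RiskTriplet("Qualification test failure forces redesign", likelihood=0.05, consequence=40.0),
]

print(expected_shortfall(risks))
print(needs_elevation(risks, threshold=3.0))
```

A real application would also carry the uncertainty in each likelihood and consequence, which this sketch omits for brevity.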
The RM approach is applied in situations where fundamental NASA values in the domains of safety and technical accomplishment have to be balanced against programmatic realities in the domains of schedule and cost. Since the process of implementing an RM approach in itself introduces cost to the project, it is essential that the approach be used in a cost-effective manner. To this end, the methods advocated in this handbook rely on a graded approach to analysis, to manage analysis costs. Because analysis cost is optimized using this approach, the savings achieved by resolving risks before they become problems invariably exceed the cost of implementing the approach.

Figure 3. Flowdown of Performance Requirements (Illustrative)

1.4.1 When is RIDM Invoked?

RIDM is invoked for key decisions such as architecture and design decisions, make-buy decisions, source selection in major procurements, and budget reallocation (allocation of reserves), which typically involve requirements-setting or rebaselining of requirements. RIDM is invoked in many different venues, based on the systems engineering and other management processes of the implementing organizational unit. These include boards and panels, authority to proceed milestones, safety review boards, risk reviews, engineering design and operations planning decision forums, configuration management processes, and commit-to-flight reviews, among others. RIDM is applicable throughout the project life cycle whenever trade studies are conducted. The processes for which decision analysis is typically appropriate, per the NASA Systems Engineering Handbook, are also those for which RIDM is typically appropriate. These decisions typically have one or more of the following characteristics:

- High Stakes - High stakes are involved in the decision, such as significant costs, significant potential safety impacts, or the importance of meeting the objectives.
- Complexity - The actual ramifications of alternatives are difficult to understand without detailed analysis.
- Uncertainty - Uncertainty in key inputs creates substantial uncertainty in the outcome of the decision alternatives and points to risks that may need to be managed.
- Multiple Attributes - Greater numbers of attributes cause a greater need for formal analysis.
- Diversity of Stakeholders - Extra attention is warranted to clarify objectives and formulate performance measures when the set of stakeholders reflects a diversity of values, preferences, and perspectives.

Satisfaction of all of these conditions is not a requirement for conducting RIDM. The point is, rather, that the need for RIDM increases as a function of the above conditions.

1.4.2 When is CRM Applied?

CRM is applied towards the achievement of defined performance requirements. In particular, CRM is applied following the invocation of RIDM for key decisions involving requirements-setting or rebaselining of requirements. CRM processes are applicable at any level of the NASA organizational hierarchy where such requirements are defined, and the CRM processes at each such level are focused on achieving the requirements defined at that level. Within the context of the NASA program/project life cycle, CRM is applicable during implementation, once performance requirements have been defined. In addition, CRM processes are applicable to formulation activities, such as technology development, involving the achievement of specific objectives within defined cost and schedule constraints. As implied by its name, CRM entails the continuous management of risks to the performance requirements throughout all phases of implementation, from design and manufacture to operations and eventual closeout, to assure that performance expectations are maintained, and that operational experience is assessed for indications of underappreciated risk.
In the event that one or more performance requirements cannot be met with the risk response options that are available to the project, the CRM process can provide both motivation and justification for seeking waivers from having to meet those requirements. That would be the case when the CRM process is able to show that a given requirement is either unnecessary or counterproductive to the success of the mission.

1.5 Overview of the RIDM Process [10]

As specified in NPR 8000.4A, the RIDM process itself consists of the three parts shown in Figure 4. This section provides an overview of the process and an introduction to the concepts and terminology established for its implementation. A detailed exposition of the steps associated with each part of the process can be found in Section 3, The RIDM Process.

Figure 4. The RIDM Process (Part 1, Identification of Alternatives: identify decision alternatives (recognizing opportunities) in the context of objectives. Part 2, Risk Analysis of Alternatives: risk analysis (integrated perspective) and development of the technical basis for deliberation. Part 3, Risk-Informed Alternative Selection: deliberate and select an alternative and associated performance commitments informed by (not solely based on) risk analysis.)

Throughout the RIDM process, interactions take place among the stakeholders, the risk analysts, the subject matter experts (SMEs), the Technical Authorities, and the decision-maker to ensure that objectives, values, and knowledge are properly integrated and communicated into the deliberations that inform the decision. Figure 5 notionally illustrates the functional roles and internal interfaces of RIDM. As shown in the figure, it is imperative that the analysts conducting the risk analysis of alternatives incorporate the objectives of the various stakeholders into their analyses. These analyses are performed by, or with the support of, SMEs in the domains spanned by the objectives.
The completed risk analyses are deliberated, along with other considerations, and the decision-maker selects a decision alternative for implementation (with the concurrence of the relevant Technical Authorities). The risk associated with the selected decision alternative becomes the central focus of CRM activities, which work to mitigate it during implementation, thus avoiding performance shortfalls in the outcome.

Figure 5. Functional Roles and Information Flow in RIDM (Notional) (diagram elements: Technical Authorities for Engineering, Safety & Mission Assurance, and Health & Medical, providing risk concurrence; the decision maker, whose decision includes risk acceptance per NPD 1000.0A; stakeholders, internal (e.g., Mission Directorates, Center Directors, Mission Support Offices) and external, providing objectives; risk analysts, providing performance models and analysis results; subject matter experts in safety, technical, cost, and schedule; deliberation over contending alternatives and their pros and cons)

RIDM Functional Roles*

Stakeholders - A stakeholder is an individual or organization that is materially affected by the outcome of a decision or deliverable; e.g., Center Directors (CDs), Mission Support Offices (MSOs).

Risk Analysts - A risk analyst is an individual or organization that applies probabilistic methods to the quantification of performance with respect to the mission execution domains of safety, technical, cost, and schedule.

Subject Matter Experts - A subject matter expert is an individual or organization with expertise in one or more topics within the mission execution domains of safety, technical, cost, or schedule.

Technical Authorities - The individuals within the Technical Authority process who are funded independently of a program or project and who have formally delegated Technical Authority traceable to the Administrator. The three organizations who have Technical Authorities are Engineering, Safety and Mission Assurance, and Health and Medical. [11]

Decision-Maker - A decision-maker is an individual with responsibility for decision making within a particular organizational scope.

*Not to be interpreted as official job positions but as functional roles.

The RIDM process is portrayed in this handbook primarily as a linear sequence of steps, each of which is conducted by individuals in their roles as stakeholders, risk analysts, SMEs, and decision-makers. The linear step-wise approach is used for instructional purposes only. In reality, some portions of the processes may be conducted in parallel, and steps may be iterated upon multiple times before moving to subsequent steps. In particular, Part 2, Risk Analysis of Alternatives, is internally iterative as analyses are refined to meet decision needs in accordance with a graded approach, and Part 2 is iterative with Part 3, Risk-Informed Alternative Selection, as stakeholders and decision-makers iterate with the risk analysts in order to develop a sufficient technical basis for robust decision making. Additionally, decisions may be made via a series of downselects, each of which is made by a different decision-maker who has been given authority to act as proxy for the responsible decision authority.

Risk-informed decision making is distinguished from risk-based decision making in that RIDM is a fundamentally deliberative process that uses a diverse set of performance measures, along with other considerations, to inform decision making. The RIDM process acknowledges the role that human judgment plays in decisions, and that technical information cannot be the sole basis for decision making. This is not only because of inevitable gaps in the technical information, but also because decision making is an inherently subjective, values-based enterprise. In the face of complex decision making involving multiple competing objectives, the cumulative judgment provided by experienced personnel is an essential element for effectively integrating technical and nontechnical factors to produce sound decisions.
1.5.1 Part 1, Identification of Alternatives

In Part 1, Identification of Alternatives, objectives, which in general may be multifaceted and qualitative, are decomposed into their constituent-derived objectives, each of which reflects an individual issue that is significant to some or all of the stakeholders. At the lowest level of decomposition are performance objectives, each of which is associated with a performance measure that quantifies the degree to which the performance objective is addressed by a given decision alternative. In general, a performance measure has a direction of goodness that indicates the direction of increasingly beneficial performance measure values. A comprehensive set of performance measures is considered collectively for decision making, reflecting stakeholder interests and spanning the mission execution domains of:

- Safety (e.g., avoidance of injury, fatality, or destruction of key assets)
- Technical (e.g., thrust or output, amount of observational data acquired)
- Cost (e.g., execution within allocated cost)
- Schedule (e.g., meeting milestones)

Objectives whose performance measure values must remain within defined limits for every feasible decision alternative give rise to imposed constraints that reflect those limits. Objectives and imposed constraints form the basis around which decision alternatives are compiled, and performance measures are the means by which their ability to meet imposed constraints and satisfy objectives is quantified.

1.5.2 Part 2, Risk Analysis of Alternatives

In Part 2, Risk Analysis of Alternatives, the performance measures of each alternative are quantified, taking into account any significant uncertainties that stand between the selection of an alternative and the accomplishment of the objectives.
Given the presence of uncertainty, the actual outcome of a particular decision alternative will be only one of a spectrum of forecasted outcomes, depending on the occurrence, nonoccurrence, or quality of occurrence of intervening events. Therefore, it is incumbent upon risk analysts to model each significant possible outcome, accounting for its probability of occurrence, in terms of the scenarios that produce it. This produces a distribution of outcomes for each alternative, as characterized by probability density functions (pdfs) over the performance measures (see Figure 6).

Figure 6. Uncertainty of Forecasted Outcomes Due to Uncertainty of Analyzed Conditions (diagram: uncertain conditions, namely funding environment, operating environment, technology development, limited data, and design, test, and production processes, feed the risk analysis of an alternative, covering safety risk, technical risk, cost risk, schedule risk, etc., which yields probabilistically determined outcomes for Performance Measure 1 through Performance Measure n; performance measures depicted for a single alternative)

RIDM is conducted using a graded approach, i.e., the depth of analysis needs to be commensurate with the stakes and complexity of the decision situations being addressed. Risk analysts conduct RIDM at a level sufficient to support robust selection of a preferred decision alternative. If the uncertainty on one or more performance measures is preventing the decision-maker from confidently assessing important differences between alternatives, then the risk analysis may be iterated in an effort to reduce uncertainty. The analysis stops when the technical case is made; if the level of uncertainty does not preclude a robust decision from being made then no further uncertainty reduction is warranted.
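The idea that uncertain conditions produce a distribution of outcomes for a performance measure, rather than a single value, can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration: the cost model in `simulate_cost`, its triangular and uniform parameters, the 25% probability of a technology-development slip, and the nearest-rank `percentile` helper are hypothetical and are not taken from the handbook.

```python
import math
import random


def simulate_cost(rng):
    """One forecasted cost outcome (hypothetical $M) for a single alternative."""
    cost = rng.triangular(90.0, 130.0, 100.0)  # low, high, mode (assumed values)
    if rng.random() < 0.25:                    # assumed chance of a technology slip
        cost += rng.uniform(10.0, 30.0)        # assumed added cost if the slip occurs
    return cost


def outcome_distribution(n_samples, seed=0):
    """Sorted samples approximating the distribution of the cost performance measure."""
    rng = random.Random(seed)
    return sorted(simulate_cost(rng) for _ in range(n_samples))


def percentile(sorted_samples, p):
    """Empirical nearest-rank percentile of pre-sorted samples, for p in (0, 100]."""
    rank = max(1, math.ceil(p / 100 * len(sorted_samples)))
    return sorted_samples[rank - 1]


samples = outcome_distribution(10_000)
print("median cost:", percentile(samples, 50))
print("90th percentile cost:", percentile(samples, 90))
```

Comparing such percentiles across alternatives is one simple way to see whether remaining uncertainty still obscures the differences between them, which is the condition under which the text says further iteration may be warranted.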
Performance Objectives, Performance Measures, and Imposed Constraints

In RIDM, top-level objectives, which may be multifaceted and qualitative, are decomposed into a set of performance objectives, each of which is implied by the top-level objectives, and which cumulatively encompass all the facets of the top-level objectives. Unlike top-level objectives, each performance objective relates to a single facet of the top-level objectives, and is quantifiable. These two properties of performance objectives enable quantitative comparison of decision alternatives in terms of capabilities that are meaningful to the RIDM participants. Examples of possible performance objectives are:

- Maintain Astronaut Health and Safety
- Minimize Cost
- Maximize Payload Capability
- Maximize Public Support

A performance measure is a metric used to quantify the extent to which a performance objective is fulfilled. In RIDM, a performance measure is associated with each performance objective, and it is through performance measure quantification that the capabilities of the proposed decision alternatives are assessed. Examples of possible performance measures, corresponding to the above performance objectives, are:

- Probability of Loss of Crew (P(LOC))
- Cost ($)
- Payload Capability (kg)
- Public Support (1 - 5)

Note that, in each case, the performance measure is the means by which the associated performance objective is assessed. For example, the ability of a proposed decision alternative to Maintain Astronaut Health and Safety (performance objective) may be measured in terms of its ability to minimize P(LOC) (performance measure).

Although performance objectives relate to single facets of the top-level objectives, this does not necessarily mean that the corresponding performance measure is directly measurable. For example, P(LOC) might be used to quantify Maintain Astronaut Health and Safety, but the quantification itself might entail an assessment of vehicle reliability and abort effectiveness in the context of the defined mission profile.

An imposed constraint is a limit on the allowable values of the performance measure with which it is associated. Imposed constraints reflect performance requirements that are negotiated between NASA organizational units which define the task to be performed. In order for a proposed decision alternative to be feasible it must comply with the imposed constraints. A hard limit on the minimum payload capability that is acceptable is an example of a possible imposed constraint.

The principal product of the risk analysis is the Technical Basis for Deliberation (TBfD), a document that catalogues the set of candidate alternatives, summarizes the analysis methodologies used to quantify the performance measures, and presents the results. The TBfD is the input that risk-informs the deliberations that support decision making. The presence of this information does not necessarily mean that a decision is risk-informed; rather, without such information, a decision is not risk-informed. Appendix C contains a template that provides guidance on TBfD content. It is expected that the TBfD will evolve as the risk analysis iterates.

Robustness

A robust decision is one that is based on sufficient technical evidence and characterization of uncertainties to determine that the selected alternative best reflects decision-maker preferences and values given the state of knowledge at the time of the decision, and is considered insensitive to credible modeling perturbations and realistically foreseeable new information.
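The relationship among performance measures, their direction of goodness, and imposed constraints can be sketched in code. The sketch below is a simplification under stated assumptions: it screens point estimates rather than the uncertainty distributions the handbook calls for, and the class `PerformanceMeasure`, the function `is_feasible`, the limits, and the two alternatives are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PerformanceMeasure:
    """A performance measure with an optional imposed constraint."""
    name: str
    unit: str
    higher_is_better: bool               # direction of goodness
    lower_limit: Optional[float] = None  # imposed constraint: minimum allowed value
    upper_limit: Optional[float] = None  # imposed constraint: maximum allowed value

    def within_limits(self, value):
        """True when the value complies with the imposed constraint, if any."""
        if self.lower_limit is not None and value < self.lower_limit:
            return False
        if self.upper_limit is not None and value > self.upper_limit:
            return False
        return True

    def prefers(self, a, b):
        """True when value a is strictly better than value b for this measure."""
        return a > b if self.higher_is_better else a < b


def is_feasible(values, measures):
    """An alternative is feasible only if every imposed constraint is met."""
    return all(m.within_limits(values[m.name]) for m in measures)


# Hypothetical measures and limits, chosen only for illustration.
measures = [
    PerformanceMeasure("P(LOC)", "probability", higher_is_better=False, upper_limit=0.01),
    PerformanceMeasure("Payload Capability", "kg", higher_is_better=True, lower_limit=20000.0),
    PerformanceMeasure("Cost", "$M", higher_is_better=False),
]

alternative_a = {"P(LOC)": 0.004, "Payload Capability": 23000.0, "Cost": 850.0}
alternative_b = {"P(LOC)": 0.003, "Payload Capability": 18500.0, "Cost": 700.0}

print(is_feasible(alternative_a, measures))
print(is_feasible(alternative_b, measures))
```

In this made-up case the second alternative is cheaper and has a lower P(LOC), yet it is screened out because it violates the hard minimum on payload capability, which is how an imposed constraint differs from an objective that is merely traded off.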
1.5.3 Part 3, Risk-Informed Alternative Selection

In Part 3, Risk-Informed Alternative Selection, deliberation takes place among the stakeholders and the decision-maker, and the decision-maker either culls the set of alternatives and asks for further scrutiny of the remaining alternatives OR selects an alternative for implementation OR asks for new alternatives. To facilitate deliberation, a set of performance commitments is associated with each alternative. Performance commitments identify the performance that an alternative is capable of, at a given probability of exceedance, or risk tolerance. By establishing a risk tolerance for each performance measure independent of the alternative, comparisons of performance am