On January 28, 1986, nearly 30 years ago, the space shuttle Challenger broke apart 73 seconds into its flight, leading to the tragic deaths of its seven crew members. As the spacecraft disintegrated over the Atlantic Ocean, the paradigm of risk management shifted from reactive to proactive. Taxonomies, frameworks, methodologies and tools developed rapidly to meet this need to manage risk proactively. And while the evolution of risk management over the intervening decades has made us more confident in answering the reactive question, “Are we riskier today than we were yesterday?” we face the stark realization that we are not truly able to answer an even more important question: “Will we be riskier tomorrow than we are today?”
Realizing the collective vision of forward-looking dashboards that provide confidence in assessments of the risks that lie ahead is the work of the current generation. That makes today an exciting time for risk management. Great progress has been made, but as we reflect today, we know much more can and must be done.
At this point, we thought we would pause and look back over the past 30 years at how risk management has evolved and at some of the lessons we can draw from that history.
Historically, risk management has focused on managing financial and hazard risks through hedging and insurance, along with such operational risks as environmental and health and safety. Focused solely on protecting enterprise value, this traditional risk management model was highly fragmented, with a strong emphasis on functional excellence. This fragmentation, deeply rooted in a command-and-control structure, and its potential consequences were a major underpinning of the Challenger catastrophe.
Formed to investigate the disaster, the Rogers Commission determined that the Challenger accident was caused by a failure in the O-rings sealing a joint between two lower segments of the right solid rocket booster. The objective of the O-rings was to prevent leakage of hot gases during the propellant burn of the rocket motor. When this leakage occurred during the Challenger launch, it caused the structural failure that led to the shuttle’s destruction and the crew’s demise.
The report also pointed to the failure of both NASA and the aerospace company contracted to manufacture the O-rings, first to recognize the joint as a problem, then to redesign it once the problem was recognized and finally, rather than confronting the danger, to treat the problem as an acceptable flight risk. More important, there were flaws in the pre-launch decision-making process that left the launch decision makers unaware of the recent history of problems concerning the O-rings and the joint. They also were unaware of the contractor’s initial written recommendation advising against launching at temperatures below 53 degrees Fahrenheit, and of the continuing opposition of the contractor’s engineers after its management reversed the company’s position. The rest is history.
Seventeen years later, the space shuttle Columbia disintegrated during re-entry into the Earth’s atmosphere, resulting in another tragic loss of all seven crew members. The Columbia disaster resulted from damage sustained when a small piece of foam insulation broke off the external propellant tank under the aerodynamic forces of launch. This was not a new problem. The space shuttle had a long history of shedding external tank foam. Even though foam loss was considered a dangerous problem early in the space shuttle program, it ultimately was regarded as an ongoing maintenance issue. In fact, the Columbia Accident Investigation Board noted photographic evidence of foam shedding on 65 of the 79 missions prior to the fatal flight.
Both of these catastrophes illustrate the phenomenon of accepting events that are not supposed to happen, or the “normalization of deviance.” Prior to the Challenger accident, for example, the O-ring seals had shown erosion and blow-by on earlier flights. By design, the O-rings were never intended to erode; erosion and blow-by were warnings that something was wrong. Yet if a reasonable launch schedule was to be maintained, engineering often could not function quickly enough to keep up with the expectations of the originally conservative certification criteria designed to guarantee a safe vehicle.
In these situations, subtly, and often with apparently logical arguments, the criteria were altered so that flights could still be certified in time to authorize launches of the spacecraft. These flights, therefore, flew in relatively unsafe conditions with a chance of failure that, while very low, existed nonetheless. This is exactly what happened with the foam-shedding that ultimately spelled the end of Columbia, as it, too, evolved into an “in-family” event or a non-safety-of-flight issue that was believed not to pose a threat to the crew or the spacecraft. Accordingly, it, too, was deemed an accepted risk and, therefore, a maintenance issue.
The Last 30 Years – Some Valuable Lessons
Several points come to mind as we look back on these space shuttle disasters. First, cost and schedule priorities create enormous pressure on organizations and can lead to decision-making behavior that trumps safety concerns. Second, while rationalization may be an expected human activity, it can be dangerous when applied to already risky environments. Third, taking comfort in low probabilities doesn’t mean an event won’t occur, a truism that suggests a need to ask the question, “Are we prepared to respond if the event were to occur?” Fourth, contrarian views that are encouraged and considered can save an organization from making a costly mistake. Finally, and perhaps most important, decisions regarding risk and risk taking can profoundly affect an organization’s reputation.
Over the past 30 years, we can point to similar “high impact, low likelihood” events that “stopped the show” for the proud companies experiencing them. The Exxon Valdez crisis in 1989, the derivatives losses of $1 billion or more suffered by several companies in the 1990s, the spectacular failure of Long-Term Capital Management in 1998, the terrorist attacks of September 11, the fallout of Hurricane Katrina in 2005, the Deepwater Horizon drilling rig explosion, the global financial crisis and the tragic Japanese tsunami are just some examples of unexpected happenings over the last tumultuous 30 years. This period has been marked by unprecedented change in fundamental business and societal models (e.g., the end of the Cold War, the explosion of derivative-based financial instruments, globalization, the Enron era, the advent of social media and the disruptive effect of new technologies, to name a few) that has raised the bar on the importance of being informed about, and responsive to, changes in an organization’s risk profile.
The last 15 years in particular have been besieged by high-profile business scandals and financial failures, sparking unprecedented regulation and providing some valuable lessons for risk management. These lessons address 10 common failures of risk management, as outlined below:
- Beware of poor risk governance and “tone of the organization,” leading to the lack of transparency, openness and commitment to continuous improvement that are so essential for risk management to function effectively.
- Watch out for reckless risk taking due to the absence of limits, checks and balances, independent monitoring and reporting and skin-in-the-game compensation structures; ironically, reckless risk taking is often perpetrated by the “smartest people in the room.”
- An inability to implement enterprise risk management effectively within strategy setting and across the enterprise exposes the organization to the vagaries of silo thinking.
- Ineffective risk assessments often do one or more of the following: fail to extend the time horizon far enough; narrow the focus to operational and compliance risks; place insufficient emphasis on understanding what management and the Board don’t know; rely excessively on probability assessments; fail to consider the velocity to impact, persistence of impact and response readiness for “high impact, low likelihood” risks; and fall short of improving preparedness for the unexpected crisis.
- Not integrating risk management with strategy setting and performance management makes it almost impossible to establish relevance in the C-suite and position the organization as an early mover to capitalize on market opportunities and address emerging risks.
- Falling prey to a “herd mentality,” or committing to “dance until the music stops” rather than acting early on emerging opportunities or risks before they become common knowledge, compromises an organization’s ability to heed the warning signs posted by the risk management function.
- Misunderstanding the “If you can’t measure it, you can’t manage it!” mindset gives managers an excuse to do nothing at all with respect to understanding and addressing difficult-to-measure risks. An inability to measure a risk will not make it go away and, if the financial crisis taught us anything, it’s that what we don’t know is more important than what we do know.
- Accepting a lack of transparency in high-risk areas (e.g., lack of information for decision-making) causes management to lose touch with reality, leaving decision makers with little insight as to the emergence or source of risk and/or what is really happening or potentially can happen.
- Management’s ignoring the dysfunctional behavior and “blind spots” created by the organization’s culture is a sure sign that trouble lies ahead.
- Not involving the Board quickly on the things that really matter is bad governance when significant risks are involved.
The above lessons reflect common risk management failures and, in effect, offer a diagnostic approach for the Board and executive management for assessing the health and viability of their organization’s risk management, particularly if enhanced with key indicators and suggested steps for avoiding them.
Over the last 10 years, COSO, ISO and others have published useful and comprehensive risk management and internal control frameworks, and executives have begun to recognize the importance of integrating risk management with core management processes. However, as evidenced by the global financial crisis, risk management capabilities in general are still relatively immature, and the application of enterprise risk management has been uneven at best. Contributing factors include the absence of such vital pillars as a fully engaged Board, a bought-in CEO, an open and transparent risk culture, a compensation structure that balances short- and long-term interests and, most importantly, the organizational will and discipline to act in a contrarian manner at the crucial moment, when the warning signs signal that an opportunity or a threat is at hand. Today, the concept of enterprise risk management remains an enigma to many senior executives and directors. The good news is that this state of affairs presents an opportunity for clarity.
Next month, we will look out over the next 30 years and report what we see in our crystal ball.
1 Report of the Presidential Commission on the Space Shuttle Challenger Accident, 1986, Volume 1, Chapter 4, page 72.
2 Ibid., Volume 1, Chapter 6, page 148.
3 Ibid., Volume 1, Chapter 5, page 82.
4 Columbia Accident Investigation Board Report, 2003, Chapter 6, pages 121-122.
5 As defined by the Columbia Accident Investigation Board, an “in-family” event is a reportable problem that was previously experienced, analyzed and understood.
6 Ibid., pages 122 and 130.