Thursday, September 25, 2008

Another Technology Tailchase Against Human Error



In an article by the Associated Press today (“GAO: Risk of runway collision still high” by Joan Lowy), “the GAO reported that the rate of close calls on airport runways is up over last year and the risk of a collision is high, according to Gerald Dillingham, the Government Accountability Office's top expert on aviation safety.”

In testimony to the House transportation committee's aviation subcommittee, Mr. Dillingham said that even though the Federal Aviation Administration "has given a higher priority to runway safety," the rate of serious incidents (measured by number of incidents per 1 million takeoffs and landings) has increased about 10%. Both the number and rate of all types of runway incursions are also up, Dillingham said.

The FAA, which is responsible for oversight of airline safety programs, has proposed three technology solutions to this human error problem:

• Installation of electronic mapping equipment in the cockpits of 80 airliners, belonging to four airlines, that will provide the position of the aircraft while on the ground
• Installation of runway status lights over the next three years at 21 airports to signal pilots when a runway is safe to enter or cross, and in the long term,
• Plans for a satellite-based map system on all commercial airliners that will show pilots the location of their aircraft in the air and on the ground, as well as the positions of other planes.

Yikes - another plan to spend tens of millions of taxpayer dollars in an attempt to find technological solutions for people problems. Don’t get me wrong, I’m for safety widgets and gadgets where they make sense, but they must always be complemented to the maximum extent with the human side of the error chain.

Everyone seems to agree, but we still burn incense at the altar of the techno gods instead of confronting the challenge head on (pun intended).

Later in the same article referenced above, “GAO expert Dillingham and FAA Chief Operating Officer Hank Krakowski agreed that mistakes by pilots and controllers rather than technology problems were key factors in many incursions.”

Really? If so, why are we not getting serious – really serious – about error control training and personal responsibility? The problem here lies in the mistaken belief that the major airline training programs are already doing all they can to educate and train pilots against human error.

There is much more that can and should be done. Outdated programs such as Crew Resource Management (CRM) and Threat and Error Management (TEM) are required for airline pilots on a recurring basis, but these classroom group hug sessions have changed very little since the late 1980s when they were first offered, and pale in comparison to the rigorous error control training currently required in some sectors of the US military. Compared to what the Marines have done with their Global War on Error initiative, traditional airline human factors programs have become polished antiques.

It used to be that the military turned to the airlines to find industry best practices. Perhaps now it is time for the regulator (FAA) and oversight agencies (GAO and NTSB) to politely suggest (or demand?) that the Part 121 air carriers look at the highly sophisticated error control programs currently available to confront this serious problem.

NTSB Chairman Mark Rosenker told The Associated Press he applauds the steps the FAA has taken to reduce runway accidents but worries they may not be enough to head off a disaster. He went on to say, “we have been very close in recent years to seeing a terrible collision . . . very fortunate that the airmanship and the seconds in which pilots have had to react have averted potentially catastrophic results.”

So if human error is the problem, and fast-thinking pilots are the best solution, why are we – once again – putting all our eggs in the technology basket when the answer is much simpler, far less expensive and far more effective?

Monday, September 15, 2008

Hurricane Survivors and Train Wrecks


This past week has seen tragedy strike Americans on two coasts: one the result of a natural disaster, Hurricane Ike, and the other man-made, the head-on collision between a Los Angeles Metrolink commuter train and a freight train near Simi Valley, California. Interestingly, the grim reaper’s toll is nearly the same for both. As we sort through the rubble in a search for lessons, a few items stand out. In one particularly interesting story, one brave but foolish soul found his 15 minutes of fame by being the only person in Surfside Beach, Texas who defied authorities and refused to evacuate before the oncoming hurricane. Sixty-seven-year-old Ray Wilkinson, a former Marine and retired carpenter, refused the mandatory evacuation order from police and drank the night away as the winds and rain howled. Authorities found him Saturday morning after the storm – safe and stone drunk. When interviewed after the event, he stated, “I consider myself to be stupid.” No argument there.

Further west, there was little or no warning for the victims of the worst train crash in the US since 1993. Early word from the investigation points strongly to human error as the primary cause, with two possible threads emerging. The first is that recorded audio tapes from the Metrolink train seem to indicate that the required crew coordination calls between the engineer and conductor did not take place on the two signals prior to the crash. If this is indeed what occurred, it is a simple case of non-compliance with regulations and procedures. The second is that two teenagers reported receiving a text message from the engineer just moments before the crash, possibly indicating the engineer was not paying full attention to the task at hand. Both are errors of personal origin and both can be trained (no pun intended) out of existence. But it is unlikely that we will do so. I fully expect we will once again hide behind the oft-heard refrain that "to err is human." In this case - as in many similar cases where the outcome of an avoidable mistake results in an immeasurable toll on innocent victims - to err is inhuman.

Already there are cries for technology to save the day and prevent this type of error from ever occurring again. If the past is prologue, whoever markets the “Positive Train Control” system, which supposedly stops a train if a signal is disobeyed by the humans, is about to get a new contract. But technology is never the complete answer, and I seriously doubt it will be here. Don’t get me wrong, I am all for gadgets that make things safer, but my experience tells me that if you make something foolproof today – tomorrow someone will give birth to a more sophisticated fool. And while this is a good way to keep the wheels of the "safety-industrial complex" turning, there is a simpler and far more effective measure we can and should be taking. More on that in a moment.

So my point in bringing these two tragedies together on the same page is simply this: I’m not sure that Mr. Wilkinson, currently Surfside Beach’s most famous resident, isn’t the key to understanding the human dynamic in both of these events. While I don’t advocate his Jack Daniels approach to problem solving, he at least recognizes his own human limitations and has found a way to deal with them. As the NTSB and government officials work out the details of the Metrolink disaster, perhaps someone will realize our current collective limitations in understanding the nature of error and advocate getting serious about training to control these events.

The science is available and the means are here for such error control programs to become a part of every high-risk employee’s training. But that is not the typical American response to large scale disasters. Why make people learn when we can build a machine so we don’t have to think? If the past is a guide, the response to this tragedy will be to (1) blame the dead engineer, (2) sue the hell out of Metrolink, (3) put some new technology in place to prevent this specific event from occurring again at this specific location, and (4) claim the problem is fixed . . . until the next time.

But we can do better. Error will never be engineered out of existence with technology. Human error is a force of nature, and just like the hurricane it can be studied, its movements tracked and predicted. To be certain, there are places where technology can help, but at the end of the day error is an individual phenomenon that can be measured, understood and predicted. Predictable is preventable.

The forgotten key to error control is personal responsibility and accountability. As simple as it sounds, we can teach people to make fewer errors. This cannot be done in the traditional sense of training against someone’s last mistake, but rather through a systemic approach to comprehending how and why we get unintended consequences from our well-intended decisions and actions. It is no longer enough to train people to do things right – they must learn why they do things wrong, and these are two very different skill sets. If you want to know more about how this is being successfully accomplished today, please contact me at tony@convergentperformance.com.

Friday, September 12, 2008

In Praise of Plan C


Last week, NASA Administrator Michael Griffin was embarrassed by a leaked email about his frustration with the United States’ plan to retire the Space Shuttle fleet without a good backup plan for manning the International Space Station. If the current plans remain intact, the US will not have its own launch vehicle during the period of time between the current Shuttle fleet’s retirement in 2010 and the new Orion Crew Exploration Vehicle fleet launch – estimated in 2015. It seems during the great thaw in post-Cold War US-Russo relations, we put all our space eggs in a matryoshka (Russian nesting doll). Our Plan B was to purchase space on Russian Soyuz launches to keep up our end of the bargain for manning the International Space Station. When Russian tanks rolled into Georgia last month and President Dmitry Medvedev made statements about not fearing the return of the Cold War, that plan hit a small snag. Administrator Griffin’s email pointed to the now obvious fact that there was no Plan C. He went on to say, “In a rational world, we would have been allowed to pick a Shuttle retirement date to be consistent with Ares/Orion availability, we would have been asked to deploy Ares/Orion as early as possible and we would have been provided the necessary budget to make it so.”

Would have, could have, should have - 20/20 hindsight is of little practical value, assuming no nearby portal through the time/space continuum. But a Plan C – that is something we can use in the here and now.

A more personal example:

Back when my skin was green and baggy and my mind, body and soul were owned by the U.S. Air Force, we used to have a fancy name for Plan C – “tertiary.” The North American Encarta Dictionary defines tertiary as “third in degree, order, place, or importance” – Plan C. I am embarrassed to say that the one time in my life I really needed to know the meaning of this adjective, I was running behind on my way to a Saturday morning B-1B bomber training launch, doomed to playing catch up by a teenage cashier at the Seven Eleven who broke open his roll of nickels and scattered them on the floor while attempting to make change for $1.39 of rot gut coffee. As I rushed through Squadron Operations on my way to the jet that fall morning, I was perplexed by a strange note, neatly handwritten on the bottom of our mission flight schedule, that read cryptically “usafa flyby tertiary.” Quickly turning to the daily operations officer, I inquired, “What the hell does this mean?” “Nothing really,” he replied, “there is an Air Force football game we are supporting with a flyby in Colorado Springs, but both the primary and secondary aircraft have launched and are in the green – forget about it, the chances of you guys getting a call are next to nil.” Good enough for me. "Let’s roll," I said to my intrepid crew mates. You can guess the rest . . . or at least part of it.

Halfway through our low level route in western Kansas we got a call from a Flight Service Station on Guard frequency that asked that we abort the route and contact our Command Post. The little voice in my head started his evil and foreboding “I told you so” chuckle. In short order, we were told that we were to proceed with greatest haste to make an 1158 Mountain Time flyover in accordance with the “wing flyover plan” – a document we had never heard of and did not possess. (We were later to learn that there were only two copies of said document.) These were minor hurdles for high steppers like us. With real-time precision communications and precise execution, we swept the wings back and got clearance for a supersonic run to central Colorado, where if we could keep our speed up, we would arrive 40 seconds early – plenty of time to visually ID the stadium. To make a long story readable, we flew over the stadium in the wrong direction at the wrong altitude (not too high) and wrong airspeed (not too slow), and most importantly, wrong decibel level (not too quiet). Upon return to base, there was another note at the Ops desk that said simply “Major Kern, Call Wing Commander,” which led to a memorable and unpleasant conversation. To this day, I occasionally hear Falcon Fans remark of the day that “B-1 bomber did an insane flyby.” I never let on that I know the culprit. To hear the story told from ground level, cadets cheered, civilians screamed in fear, babies cried, and if I got the story correct from a third-hand source, a few windows broke and a horse may have died off base. This was not cool – it was unprofessional – and it was directly the result of not comprehending the need for an actionable Plan C.

In a world ruled increasingly by dynamic change – Plan B is highly likely to become Plan A without much advance warning – and the word “tertiary” has real significance. Plan accordingly.