Did you know that a faulty Soviet early warning system nearly caused World War III in 1983? Do you remember when 17,000 passengers were stranded at Los Angeles International Airport because of a software problem? And who could ever forget the infamous Millennium Bug?
Through the years, the IT world has been plagued with software and hardware disasters. Fortunately, we worked through them all, but it’s interesting to look back. ZDNet.com contributor Colin Barker compiled a list of the top 10 IT disasters of all time, which should prompt us all to ask ourselves: “Is our recovery plan ready for the next disaster?”
- 1983 | Faulty Soviet Early Warning System Nearly Causes WWIII
As a result of a software bug in the Soviet early warning system, the Soviets received a hard-to-believe message saying that the U.S. had launched five ballistic missiles toward the Soviet Union. However, a very clever duty officer had a “funny feeling” in his gut and reasoned that if the U.S. were really attacking, it would obviously launch more than just five missiles. This near-apocalyptic disaster was traced to a fault in software that was supposed to filter out false missile detections caused by satellites picking up sunlight reflections off cloud tops.
- 1990 | The AT&T Network Collapse
In 1990, 75 million phone calls across the U.S. went unanswered after a single switch at one of AT&T’s 114 switching centers suffered a minor mechanical problem that caused a total shutdown. The culprit turned out to be not hackers, as some claimed at the time, but an error in a single line of code that had been added during a recent, highly complex software upgrade. American Airlines alone estimated that this small error cost it 200,000 reservations.
- 1996 | The Explosion of the Ariane 5
In 1996, Europe’s newest unmanned satellite-launching rocket, the Ariane 5, blew up just seconds after taking off on its maiden flight from Kourou, French Guiana. According to a piece in the New York Times Magazine, the self-destruction was triggered by software trying to stuff “a 64-bit number into a 16-bit space.” The number was too big, and an overflow error resulted. When the guidance system shut down, it passed control to an identical, redundant unit, which was there to provide backup in case of just such a failure. But the second unit had failed in the identical manner because it was running the same software.
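The kind of narrowing conversion described above can be sketched in a few lines of Python. This is only an illustration of the general failure class, not the rocket's actual flight code: the function names and the sample value are hypothetical, and `ctypes.c_int16` is used here simply because it keeps the low 16 bits without any overflow check.

```python
import ctypes

# Range of a signed 16-bit integer.
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_unchecked(value: float) -> int:
    """Truncate a 64-bit float and keep only the low 16 bits.

    ctypes performs no overflow checking, so out-of-range values
    silently wrap around instead of raising an error.
    """
    return ctypes.c_int16(int(value)).value


def to_int16_checked(value: float) -> int:
    """Truncate a 64-bit float, refusing values that do not fit in 16 bits."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return truncated


if __name__ == "__main__":
    reading = 40000.0  # hypothetical sensor value, larger than INT16_MAX
    print(to_int16_unchecked(reading))  # wrapped, meaningless result
    try:
        to_int16_checked(reading)
    except OverflowError as err:
        # A caller has to decide what to do here; an unhandled error in
        # both a primary and an identical backup unit fails both at once.
        print("conversion rejected:", err)
```

Either variant is only half of the answer: a range check helps only if the caller has a sensible plan for the out-of-range case, which is the part that identical redundant units cannot provide for each other.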
- 1998 | Mars Climate Orbiter Metric Problem
Two spacecraft, the Mars Climate Orbiter and the Mars Polar Lander, were part of a space program that was supposed to study the Martian weather, climate and the water and carbon dioxide content of the atmosphere. But a navigation error caused the orbiter to fly too low in the atmosphere, and it was destroyed. The cause was a unit mismatch: a subcontractor on the NASA program had used imperial units, as is common in the U.S., rather than the NASA-specified metric units.
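One common defense against this class of mistake is to make the unit part of the type, so that a bare number can never cross an interface unlabeled. The sketch below assumes, purely for illustration, that the quantity being exchanged is an impulse measured in pound-force seconds on one side and newton-seconds on the other; the `Impulse` class and its method names are hypothetical.

```python
from dataclasses import dataclass

# One pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse stored internally in newton-seconds (SI)."""

    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(value)

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # Convert at the boundary so everything downstream sees SI only.
        return cls(value * LBF_S_TO_N_S)


def total_impulse(burns: list[Impulse]) -> Impulse:
    """Sum impulses; every term is already in newton-seconds."""
    return Impulse(sum(b.newton_seconds for b in burns))


if __name__ == "__main__":
    burns = [
        Impulse.from_newton_seconds(10.0),
        Impulse.from_pound_force_seconds(10.0),  # hypothetical imperial input
    ]
    print(total_impulse(burns).newton_seconds)
```

Because callers must pick a named constructor, passing an imperial figure where a metric one is expected becomes a visible choice in the code rather than a silent factor-of-4.45 error.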
- 1999 | Siemens and the Passport System
Because the British Passport Agency had brought in a new Siemens computer system without sufficiently testing it or training its staff, nearly half a million British citizens missed their vacations because their new passports couldn’t be issued on time.
- 1999/2000 | The Two-Digit Year-2000 Problem
In 1999, everyone was terrified of the Millennium Bug, fearing that all the predictions of IT doom and ruin would come true. But when clocks struck midnight in time zones around the world, what followed was not panic or crashing computer systems, but nothing more than New Year celebrations. Disaster was averted, but the original decision to use two digits for the year in date fields cost billions of dollars worldwide to correct before the year 2000 was ever rung in.
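To see why two-digit years were a problem, consider a minimal sketch of date arithmetic across the century boundary. The function names are hypothetical, and the "windowing" fix shown (treating years below a pivot as 20xx) is just one of the remediation techniques that were used; the pivot value of 50 is an assumption chosen for the example.

```python
def years_between_two_digit(start_yy: int, end_yy: int) -> int:
    """Naive difference between two-digit years, as legacy code often did it."""
    return end_yy - start_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Expand a two-digit year using a fixed window.

    Years at or above the pivot are read as 19xx, years below it as 20xx.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 1900 + yy if yy >= pivot else 2000 + yy


def years_between_windowed(start_yy: int, end_yy: int) -> int:
    """Difference between two-digit years after expanding both to four digits."""
    return expand_year(end_yy) - expand_year(start_yy)


if __name__ == "__main__":
    # An account opened in '99 and checked in '00.
    print(years_between_two_digit(99, 0))  # -99
    print(years_between_windowed(99, 0))   # 1
```

Windowing only postpones the ambiguity to the pivot year, which is why storing four-digit years is the more durable fix.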
- 2004 | EDS and the Child Support Agency
Business services giant EDS caused this spectacular disaster, which assisted in the destruction of the Child Support Agency (CSA) and cost the British taxpayer over a billion pounds. EDS’s CS2 computer system somehow managed to overpay 1.9 million people and underpay around 700,000, partly because the Department for Work and Pensions (DWP) decided to reform the CSA at the same time as bringing in CS2.
- 2006 | Airbus A380 Suffers From Incompatible Software Issues
The Airbus issue of 2006 highlighted what happens when one software program doesn’t talk to another. The problem arose because the German system used an out-of-date version of CATIA and the French system used the latest version. So when Airbus brought the two halves of the aircraft together, the different software meant that the wiring in one did not match the wiring in the other. The cables could not meet up without being changed. The problem was eventually fixed, but only at enormous cost, setting the project back more than a year.
- 2006 | When the Laptops Exploded
It all began when a laptop manufactured by Dell burst into flames at a trade show in Japan. There had been rumors of laptops catching fire, but the difference here was that the Dell laptop managed to do it in the full glare of publicity, and video captured it in all its horrible glory. An investigation began, and the problem was traced to an issue with the battery/power supply, which had overheated and caught fire. As a result, Dell had to recall and replace 4.1 million laptop batteries.
- 2007 | LA Airport Flights Grounded
Some 17,000 passengers were stranded at Los Angeles International Airport because of a systems software problem at the United States Customs and Border Protection (USCBP) agency. The culprit was a network card that, instead of shutting down as it should have done, persisted in sending incorrect data out across the network. The data then cascaded until it hit the entire USCBP network and brought it to a standstill. Nobody could be authorized to leave or enter the U.S. through the airport for eight hours.