The Command Post
Iraq
March 25, 2003
US Col apologises for Tornado Downing

From the ABC (Australian Broadcasting Corporation)

An American colonel in charge of the Patriot battery that accidentally shot down a British Tornado jet, killing the two crew, has apologised for the blunder.

Colonel Tim Glaeser says a software glitch allowed the Patriot radar to read the Tornado as a missile.

He made the apology during a visit to the Tornado base at Ali Al Salem in Kuwait.

The RAF commander has also given an assurance that the mistake cannot happen again.

I'm a military software engineer, and have written an Op-Ed piece.

Posted By Alan E Brain at March 25, 2003 02:56 AM
Comments

"live and learn"?!?! Easy for you to say and the software programmer who is still alive. Not so easy for the two RAF pilots to say, huh?

Where was the Quality Control?!?!? Where was the testing and retesting?

As a hardware engineer (you know, a REAL engineer with a BSEE) I can tell you that no piece of military hardware goes out the door without extensive testing of a variety of different data patterns to ensure that this very fucking thing doesn't happen.

How about you software "engineers" start taking a few cues from us REAL engineers and adopt quality control protocols similar to those for hardware?

Posted by: notmyrealname at March 25, 2003 03:04 AM

"OK, live and learn..."

Are you suggesting that the software was going through a very real (and fatal!) beta-test stage?

Posted by: notmyrealnameeither at March 25, 2003 03:23 AM

now this is funny, engineers ragging on software programmers....
hell, between the 2 of 'em we're lucky the shuttle flew so many times before it exploded into atoms...lol

Posted by: aby normal at March 25, 2003 03:25 AM

what I'm really saying is, in war SHIT happens...and you live with it...

Posted by: aby normal at March 25, 2003 03:30 AM

"As a hardware engineer (you know, a REAL engineer with a BSEE) I can tell you that no piece of military hardware goes out the door without extensive testing of a variety of different data patterns to ensure that this very fucking thing doesn't happen."
If the requirements are wrong, it doesn't matter how well it's manufactured. No test (except for trivial systems) can cover 100% of all possible inputs. Formal correctness proofs aren't feasible for complex systems. Remember, the System Test of both Hardware and Software didn't pick this up. And finally...
"How about you software "engineers" start taking a few cues from us REAL engineers and adopt quality control protocols similar to those for hardware?"
We did that a long time ago (remember the amount of software in those JDAMs etc). E-mail me if you're interested in some of the techniques, you might learn something. Or more valuably to me, I might learn something.


Posted by: Alan E Brain at March 25, 2003 03:34 AM

Re Canbybears comment about the Shuttle: Military stuff is not the only thing I do.
Have a look here. Possibly the most complex micro satellite ever built - major experiments like a GPS receiver that could well be vital in making an autonomous camera to check future Shuttle missions. Some re-configurable hardware that's essential for the ESA's new Mars mission. A comms experiment that could save people's lives in the Outback. And other experiments too, less glamorous but just as important.
Now that's not directly safety-critical: but it was our first satellite in 25 years, and both national prestige and our whole space programme were riding on it. It just had to work, flawlessly, first time. Which it did.
I headed the spaceflight avionics software development for it. You'll find some of my comments on space systems here.
"In space, nobody can press CTRL-ALT-DEL"
So...care to bet your whole career, your personal reputation, your nation's reputation, and sometimes your life and that of others on your work?

Posted by: Alan E Brain at March 25, 2003 04:32 AM

Regarding "Live and Learn" - when the fuss is over, the people who screwed up can go away and have their nervous breakdowns. But not just yet, they've got to keep on going and fix the bloody thing so it doesn't happen again, document the problem and fix the whole development system to prevent similar errors. People's lives depend on it. Then they can go shoot themselves to stop the pain and guilt, not before.

Posted by: Alan E Brain at March 25, 2003 04:35 AM

I don't mean to belittle any deaths, but
regarding hardware vs. software: the
FACT is, software is much more complex
than hardware. For example, an
operating system, a compiler, or even
one of today's games contains more code
paths and states than a CPU, by
significant factors.

Furthermore, the two studies of EE and
CS are inherently quite different, on
theoretical terms. CS is more
theoretical while EE is more empirical.
CS by itself is not science per se and
not engineering either--it's pure
discrete math (i.e. axioms, corollaries,
theorems, e.g. prove 1+1=2, prove
properties of non-deterministic finite
automata, etc.)

And while it's theoretically possible to
prove everything done in software,
there are two hindrances: 1) obviously,
QA by proof is next to impossible in
practice; 2) the proof, or any testing
for that matter, is only as good
as the rules it's based on, i.e. the
requirements, which goes back to what
Alan was saying.

So you're left with just empirical
testing, which reflects the
impossibility of truly comprehensive
testing due to the complexity being much
more than would be in hardware--which
is of course, why such things are done
in software in the first place because
of hw's limitations.
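
A rough back-of-the-envelope sketch of the exhaustive-testing point above, in Python. The figures are illustrative assumptions, not taken from any real system: a routine with two 32-bit inputs and no internal state, and a test rig that runs one billion cases per second. The function names are hypothetical.

```python
# Illustrative only: how long would it take to try every possible input?
# All numbers below are assumptions chosen for the example.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def input_combinations(bits_per_input, num_inputs):
    """Number of distinct input combinations for a stateless routine."""
    return 2 ** (bits_per_input * num_inputs)


def years_to_test_exhaustively(bits_per_input, num_inputs, tests_per_second):
    """Years needed to run one test per input combination at the given rate."""
    total = input_combinations(bits_per_input, num_inputs)
    return total / tests_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Two 32-bit inputs, one billion tests per second (assumed figures).
    years = years_to_test_exhaustively(32, 2, 1e9)
    print(f"{input_combinations(32, 2)} combinations, about {years:.0f} years")
```

With these assumed numbers the answer is several hundred years, and that ignores internal state, timing and input history, each of which multiplies the count further.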

And I can attest to that fact, being a
test engineer at the OS level (writing
code to test someone else's kernel code).
People don't realize that test programs
can often be 10 times larger than the
code under test, or even much more; and
that's being only 50-60% comprehensive!

Posted by: Jimmy Vetayases at March 25, 2003 05:02 AM

Please divert to the op-ed section from here, where you'll find my replies.

Posted by: Alan E Brain at March 25, 2003 06:11 AM