Flying Scotsman
Well Known Member
Just for context purposes, here's where we were on the other thread over in the Glass Cockpit area:
Originally Posted by rleffler
I choose not to have a software bug or hardware failure take out all the systems. Independent systems will still function in a known state.
This gets into an area I work with quite a bit, safety-critical software. I'll spare you all the definitions of what constitutes a "hazard", etc., or even what we mean by "safety-critical" (versus "mission-critical", and so on).
This probably doesn't mean much for comparatively simple systems like EFISes, but considerable study has been done on the idea of "different software done by different teams". What was found is that the teams made the same sorts of design "errors", or more properly, failed to deal with unsafe conditions *in the same way*.
Here is noted expert on software safety Nancy Leveson from MIT:
To cope with software design errors, "diversity" has been suggested in the form of independent groups writing multiple versions of software with majority voting on the outputs (like modular redundancy in hardware). This approach is based on the assumption that such versions will fail in a statistically independent manner, but this assumption has been shown to be false in practice and to be ineffective in both carefully controlled experiments and mathematical analysis [14,15,16]. Common-cause (but usually different) logic errors tend to lead to incorrect results when the various software versions attempt to handle the same unusual or difficult-to-handle inputs. The lack of independence in the multiple versions should not be surprising as human designers do not make random mistakes; software engineers are not just monkeys typing on typewriters. As a result, versions of the same software (derived from the same requirements) developed by different people or groups are very likely to have common failure modes: in this case, common design errors.
In addition, such redundancy schemes usually involve adding to system complexity, which can result in failures itself. A NASA study of an experimental aircraft with two versions of the control system found that all the software problems occurring during flight testing resulted from errors in the redundancy management system (which was necessarily much more complex than the original control software). The control software versions worked perfectly [17].
Software Challenges in Achieving Space Safety by Nancy Leveson. Journal of the British Interplanetary Society, Vol. 62, 2009
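To make the voting scheme Leveson critiques concrete, here's a toy sketch (all names are illustrative, not from any real flight software): three "independently developed" versions of the same function are run and a majority voter picks the output. If two versions share a common design error on the same hard input, the voter happily masks the one correct version.

```python
# Toy illustration of N-version programming with majority voting.
# Function names and the example bug are hypothetical.
from collections import Counter

def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, or None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

# Three versions written from the same requirements.
# Versions A and B made the same wrong assumption about the edge case x == 0,
# so on that "difficult-to-handle input" the shared error outvotes correct C.
def version_a(x): return 1 if x >= 0 else 0   # wrong at x == 0
def version_b(x): return 1 if x >= 0 else 0   # same wrong assumption
def version_c(x): return 1 if x > 0 else 0    # correct

outputs = [f(0) for f in (version_a, version_b, version_c)]
print(majority_vote(outputs))  # the common design error wins the vote: 1
```

The point of the toy: the voter only helps if failures are independent, and that is exactly the assumption the cited experiments found to be false.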
Do not assume that, if your "hazard" is some fault external to the software, *both* systems will handle the condition in a safe manner; they are likely to mishandle it in the same way.
Now, a problem with an OS, or a circuitry problem, or something like that, yes...but then, that alone wouldn't require independent vendors, would it?
Further, safety is an *emergent property* of the system as a whole, and cannot be evaluated on a component-by-component basis. And that system includes your vehicle, you, the ground components, etc.
__________________
Steve
Santa Clarita, CA
PP-ASEL, ASES, Instrument Airplane
Empennage and wings done (except fiberglass)
Lyc YIO-360 (mounted) + Hartzell 74" CS prop
Starting cowling; avionics arriving!
----------------------------
Next post:
----------------------------
Steve, you bring up an issue I've been looking at for some time and still don't have the correct words to verbalize it succinctly.
Here is what I'm thinking:
Which is safer, airplane A or airplane B?
Airplane A:
Primary and backup flight instruments, electrical system, etc.
Either two EFIS units or an EFIS unit with steam gauge backups and multiple electrical power sources, including dual batteries and generators. This includes the required wiring designed to limit back feeding/battery draining situations and the necessary switches to control it.
Airplane B:
One set of flight instruments, either steam gauges or an EFIS. Simple electrical system with no "E-Buss" or backup battery.
When Airplane A has a problem, how much time does the pilot spend resolving the conflict before getting on with the business of flying, compared to the pilot of Airplane B?
Could the complexity of Airplane A, when something goes Tango Uniform, actually make for a less safe aircraft than the simple Airplane B? When something goes wrong with Airplane B, the pilot has to get on with flying, whereas the pilot of Airplane A may spend more time trying to debug the situation, which could be fatal.
Thoughts?