23
$\begingroup$

The following answers mention the use of parity bits in the Apollo guidance computer:

  • this answer to Bits per core for the different versions of the Apollo guidance computer core rope memory?
  • this answer to How did the Apollo computers evaluate transcendental functions like sine, arctangent, log?

This leads me to wonder:

  1. How did the Apollo guidance computer handle parity bit errors?
  2. Were these ever encountered during actual missions?
$\endgroup$
3

2 Answers

21
$\begingroup$

1. How did the Apollo guidance computer handle parity bit errors?

According to Apollo 15 Hardware by Delco Electronics,

Parity Alarm

Occurs if any accessed word in fixed or erasable memory whose address is $10_8$ or greater contains an even number of "ones." All locations of $10_8$ or greater are stored in fixed or erasable memory with odd parity.

Here $10_8$ means 10 in octal, which is 8 in decimal.
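The odd-parity rule quoted above can be illustrated with a minimal sketch. It assumes a 16-bit word made of 15 data bits plus 1 parity bit (as a comment below describes). The function names and the position of the parity bit within the word are my own illustrative choices, not taken from the Delco document.

```python
# Illustrative sketch only: names and bit layout are assumptions,
# not the actual AGC hardware design.

PARITY_CHECKED_FROM = 0o10  # addresses at or above octal 10 (decimal 8) are checked


def count_ones(word: int) -> int:
    """Return the number of 1 bits in a non-negative integer."""
    return bin(word).count("1")


def with_odd_parity(data: int) -> int:
    """Attach a parity bit to 15 data bits so the 16-bit word has an odd number of ones.

    Placing the parity bit in the lowest position is an arbitrary choice for this sketch.
    """
    if not 0 <= data < (1 << 15):
        raise ValueError("data must fit in 15 bits")
    parity_bit = 0 if count_ones(data) % 2 == 1 else 1
    return (data << 1) | parity_bit


def parity_alarm(address: int, word: int) -> bool:
    """Return True if reading `word` at `address` should raise a parity alarm."""
    if address < PARITY_CHECKED_FROM:
        return False  # locations below octal 10 are exempt from the check
    return count_ones(word) % 2 == 0  # an even number of ones means a parity failure


stored = with_odd_parity(0b101)      # two data ones, so the parity bit is set
corrupted = stored ^ 0b10            # flip a single bit
print(parity_alarm(0o10, stored))    # no alarm for an intact word
print(parity_alarm(0o10, corrupted)) # alarm after a single-bit flip
```

A single parity bit like this detects any odd number of flipped bits but cannot say which bit changed, which fits the quoted behaviour of raising an alarm rather than correcting the word.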

This condition triggers an automatic hardware restart:

A RESTART (hardware) and subsequent AGC/LGC Warning is generated for the following alarms:

  • Oscillator Failure
  • Transfer Control (TC) Trap
  • Parity Alarm
  • Nightwatchman Fail
  • Interrupt (RUPT) Lock
  • Voltage Fail

The RESTART inhibits access to memory temporarily, freezes the computer, stores in process information and then transfers control to address 4000. This address has the information address for the next instruction after a RESTART that the software programmer has provided.
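As a rough mental model of the quoted sequence (save in-process information, raise the warning, jump to a fixed address), here is a toy sketch. The class, its field names, and reading "4000" as an octal address are assumptions made for illustration. It does not model the real hardware timing or the memory inhibit.

```python
from dataclasses import dataclass, field

# Alarm names as listed in the Delco quote above.
RESTART_ALARMS = frozenset({
    "Oscillator Failure",
    "Transfer Control (TC) Trap",
    "Parity Alarm",
    "Nightwatchman Fail",
    "Interrupt (RUPT) Lock",
    "Voltage Fail",
})

RESTART_ADDRESS = 0o4000  # assumption: the quoted "4000" is an octal address


@dataclass
class ToyComputer:
    """Hypothetical stand-in for the computer state; not the real AGC."""
    program_counter: int = 0
    warning_lit: bool = False
    saved_state: dict = field(default_factory=dict)

    def raise_alarm(self, alarm: str) -> bool:
        """Perform a toy restart if `alarm` is one of the restart-causing alarms."""
        if alarm not in RESTART_ALARMS:
            return False
        # Store in-process information before control is transferred.
        self.saved_state = {"interrupted_at": self.program_counter, "cause": alarm}
        self.warning_lit = True                 # AGC/LGC Warning
        self.program_counter = RESTART_ADDRESS  # software takes over from here
        return True


computer = ToyComputer(program_counter=0o2345)
computer.raise_alarm("Parity Alarm")
print(oct(computer.program_counter), computer.saved_state)
```

What happens after the jump is up to the software at the restart address, which is where the programmer-supplied recovery logic mentioned in the quote comes in.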


2. Were these ever encountered during actual missions?

According to the Apollo Program Summary Report, the most severe anomaly in the entire GN&C system was a transient voltage which gave an erroneous indication to the computer that the inertial attitude reference had been lost. It also states that an open gimbal rate feedback circuit caused unexpected oscillation of the redundant engine gimbal actuator assembly. However, of the computer itself, it unambiguously states:

The performance of the computer was flawless.

I would interpret that as no parity errors.


According to Recovery from Transient Failures of the Apollo Guidance Computer:

In a total of over 25 hours of space flight, the computer has yet to have a transient failure from which the restart feature could be called on to demonstrate its worth.

(credit to @aCVn) That report was published in August 1968, before any of the lunar landings.

$\endgroup$
4
  • 3
    $\begingroup$ ibiblio.org/apollo/hrst/archive/1033.pdf (section XVI, PDF page 10) says that in 1968, in more than 25 flight hours (section XIX, PDF page 11), "No restart has occured in flight.". That's pretty definite, but of course doesn't cover the lunar landing missions. $\endgroup$
    – user
    Commented May 6, 2019 at 13:30
  • 2
    $\begingroup$ The comment above looks pretty definitive. Since comments are temporary and can be deleted at any time, would you consider moving that into your answer? $\endgroup$
    – uhoh
    Commented May 19, 2019 at 11:08
  • 2
    $\begingroup$ "The performance of the computer was flawless" - except for that little radar-related CPU overload problem, right? $\endgroup$ Commented Mar 17, 2020 at 23:11
  • 1
    $\begingroup$ @user253751: Technically, that was caused by spurious pulses from the signal processing equipment, not the computer itself. Nonetheless, it is desirable for the computer to safely handle malfunctions in other equipment. $\endgroup$
    – DrSheldon
    Commented Mar 17, 2020 at 23:26
14
$\begingroup$

What a fascinatingly obscure question :-) It took some digging, so perhaps someone who's actually seen an AGC might know better:

The parity bit was used to verify that data transferred correctly from memory to the registers. That is, the data in the memory was assumed to be correct, and any error was assumed to occur in the electronics that transfer data from the core memory to the registers.

If a parity error was detected, a parity alarm was raised. This was displayed on the DSKY and caused a restart of the programs currently running. Detecting such errors was considered important enough that the AGC had a dedicated parity circuit, one of many hardware failure-detection systems that protected the AGC. You can read a lot more about the restart system here, which also describes the parity handling.

I cannot find any reference to a parity alarm ever occurring, but I haven't looked too hard at the moment. I expect it would be in the mission communications logs if it did.

$\endgroup$
5
  • 4
    $\begingroup$ Very important. Nowadays you get a nice screen from the OS/BIOS and something along the lines of "parity error - system halted" in correctly implemented hardware (single error correction, double error detection ... ) $\endgroup$ Commented May 5, 2019 at 12:50
  • $\begingroup$ Right! obscure question of the month :) $\endgroup$
    – Fattie
    Commented May 5, 2019 at 14:23
  • 1
    $\begingroup$ Love the old terminology for what we now know as a watchdog circuit ... the "Night Watchman" :) $\endgroup$ Commented May 5, 2019 at 14:34
  • 7
    $\begingroup$ As someone who has seen (and is restoring) an AGC, I can comment on this. Memory did have parity (15 bits data + 1 bit parity). There was no assumption that the data in memory was correct. $\endgroup$ Commented May 5, 2019 at 16:55
  • 1
    $\begingroup$ @DavidTonhofer A modern system will more likely make a system log entry when a correctable RAM ECC error is encountered. (Certainly Linux does it that way.) The assumption here is probably that if you care enough to even know that such an error occurred, you care enough to have some kind of log monitoring in place, and since it's correctable, there's no reason to halt or reboot the system. $\endgroup$
    – user
    Commented May 6, 2019 at 12:34
