
I took a look at the Preliminary Radiation Analysis of the Total Ionizing Dose for the Resource Prospector Mission paper and was quite surprised by how "low" the total radiation doses absorbed during a trip to the Moon are. For instance, it looks like a 90-day trip to the lunar surface plus 30 days of operation there would result in a total dose of about 40 cGy (40 rad) behind 1 cm of aluminium shielding, if I read Figure 9 of the paper correctly (see the picture below).

If total radiation doses are so low, why are CPUs used in spacecraft such as LRO or the Mars rovers designed to withstand hundreds of thousands of rads? At first glance this looks like huge overkill, given the data from the paper. From what I have read, radiation doses on Mars are similar, if not smaller.

I have the following hypotheses as to why the TID tolerance is so high:

  1. TID is not a big problem in itself, but TID tolerance is somehow correlated with immunity to Single Event Effects (SEEs). This hypothesis assumes that what we really want is to protect the spacecraft against deadly SEEs, and TID tolerance is just a handy proxy for estimating a CPU's tolerance of SEEs.
  2. It may be more cost-effective to use a more TID-tolerant chip than to coat the spacecraft with a thick layer of aluminium, due to mass and volume constraints.
  3. The space environment is very dynamic, and although the average daily radiation dose is low, an unexpected solar or cosmic event could temporarily increase the dose and cause a malfunction. A large TID tolerance gives a buffer for such events.
  4. Other

I would be grateful if you could clarify this for me and tell me which of my hypotheses is correct.

Total mission doses

  • 1 cm is a pretty thick plate of aluminum. Commented Jun 5 at 19:16
  • Okay, I've just read that the typical thickness used in satellites is 1.5 mm–3 mm. In that case the dose for the scenario mentioned above would be about 500–1000 rad, or about 1500–3000 rad/year. It still seems far from the 1 Mrad TID tolerance of the RAD750 CPU.
    – xfii
    Commented Jun 5 at 19:38
  • For a system that won't get any maintenance for its entire lifespan, where any breakdown is essentially permanent damage? I dunno, that seems pretty reasonable. Also, your point 3 definitely has some truth: anything in space has to be ready to deal with radiation spikes from things like solar CMEs, so being able to handle 1000x the average dose is potentially critical. You don't want to lose a billion-dollar project because the Sun did something weird this week. Commented Jun 5 at 21:03
  • Don't forget that there are many satellite and interplanetary applications for which the mission length is measured in decades, not days. The TID really adds up.
    – Dave Tweed
    Commented Jun 7 at 21:31
  • They must last much longer than 90 days if people like Matt Damon and Val Kilmer are to be able to repurpose them in the future, or if there's an unusually giant solar storm.
    – uhoh
    Commented Jun 13 at 23:08

2 Answers


Not unlike living organisms, electronics have some ability to continuously self-repair. For example, dynamic random-access memory stores data as charge in billions of tiny capacitors. These charges decay, and a self-refresh process typically sweeps through the memory continually, reading and rewriting rows of data before the charges decay too much.
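As a very rough illustration of why refresh works, here is a toy Python model of a single DRAM cell; the leakage time constant and sensing threshold are made-up numbers, not real device parameters. Without refresh the stored '1' eventually reads back as '0', while a periodic read-and-rewrite keeps it alive:

```python
# A very rough model of DRAM self-refresh, assuming an exponential charge leak
# and a refresh that re-reads the cell and rewrites it to full charge before it
# drifts past the sensing threshold. All numbers are illustrative assumptions.

import math

LEAK_TAU_MS = 200.0      # assumed leakage time constant
THRESHOLD = 0.5          # sense amplifier reads > 0.5 as '1'
REFRESH_MS = 64.0        # typical-ish DRAM refresh interval

def leak(charge, dt_ms):
    return charge * math.exp(-dt_ms / LEAK_TAU_MS)

def read_bit(charge):
    return 1 if charge > THRESHOLD else 0

# A stored '1' starts at full charge. Without refresh it is eventually lost.
charge = 1.0
for _ in range(10):
    charge = leak(charge, REFRESH_MS)
print("no refresh after 640 ms:", read_bit(charge))      # 0 -- data lost

# With refresh, the cell is topped back up every interval while it still
# reads correctly, so the stored bit survives indefinitely.
charge = 1.0
for _ in range(10):
    charge = leak(charge, REFRESH_MS)
    charge = float(read_bit(charge))                     # rewrite what was read
print("with refresh after 640 ms:", read_bit(charge))    # 1 -- data kept
```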

Error correction codes are also used in some applications. Many can detect two-bit errors and correct single-bit errors in a given chunk of data. If this technique is used, and a bad bit is detected and corrected before a second bit in the same chunk is corrupted, then the bad bits can be repaired.
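For a concrete feel for the detect-two/correct-one behaviour, here is a toy SECDED example using an extended Hamming(8,4) code. Real ECC memory does the equivalent in hardware over much wider words, so this is purely illustrative:

```python
# Toy SECDED (single-error-correct, double-error-detect) demo: 4 data bits,
# 3 Hamming parity bits, plus one overall parity bit.

def encode(data4):
    """data4: [d1, d2, d3, d4] -> 8-bit codeword."""
    d1, d2, d3, d4 = data4
    p1 = d1 ^ d2 ^ d4            # covers codeword positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4            # covers codeword positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4            # covers codeword positions 5, 6, 7
    word = [p1, p2, d1, p3, d2, d3, d4]    # positions 1..7
    overall = 0
    for b in word:
        overall ^= b
    return word + [overall]      # position 8 = overall parity

def decode(code8):
    """Return (status, corrected 4 data bits or None)."""
    w = code8[:7]
    overall_ok = (sum(code8) % 2 == 0)
    # Recompute the parity checks; the syndrome points at the bad position.
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome == 0 and overall_ok:
        status = "clean"
    elif not overall_ok:
        status = "corrected single-bit error"
        if syndrome:                         # error in positions 1..7
            w[syndrome - 1] ^= 1
        # syndrome == 0 means only the overall parity bit flipped
    else:
        return "uncorrectable double-bit error", None
    return status, [w[2], w[4], w[5], w[6]]

cw = encode([1, 0, 1, 1])
cw[5] ^= 1               # simulate one radiation-induced bit flip
print(decode(cw))        # ('corrected single-bit error', [1, 0, 1, 1])
cw[2] ^= 1               # a second flip in the same word
print(decode(cw))        # ('uncorrectable double-bit error', None)
```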

The same technique can even be applied by striping chunks of data across banks of parts, as in the Redundant Array of Independent Disks (RAID) arrays used in the servers of most data centers. In that case, if a bad hard drive is detected and swapped out quickly, before a second drive fails, then all of the data can be recovered from the remaining working drives.
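A minimal sketch of the parity idea behind that recovery, assuming a simple three-data-drives-plus-XOR-parity layout (conceptually RAID 4/5):

```python
# XOR parity across "drives": if any single drive is lost, its contents can be
# rebuilt by XOR-ing the survivors -- provided the rebuild finishes before a
# second drive fails.

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, b in enumerate(blk):
            out[i] ^= b
    return bytes(out)

drives = [b"DATA-ON-DISK-ONE", b"DATA-ON-DISK-TWO", b"DATA-ON-DISK-THR"]
parity = xor_blocks(drives)                 # written to a fourth drive

lost = 1                                    # pretend drive 1 died
survivors = [d for i, d in enumerate(drives) if i != lost] + [parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == drives[lost]
print(rebuilt)                              # b'DATA-ON-DISK-TWO'
```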

Built-in self-tests (automated versions of "running a diagnostic" in Star Trek) can also detect failures, and bad silicon can be taken out of service by the operating system. This can be a useful strategy if, for example, a bad core is detected in a multi-core processor. Or bad sectors of a storage device can be detected and flagged as "do not use".
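The "test and retire" idea can be sketched as follows; the block layout, fault model, and function names here are hypothetical, and real built-in self-test runs in hardware or firmware rather than application code:

```python
# Toy illustration of "test and retire": a simulated memory made of blocks, a
# simple write/read-back self-test, and a bad-block list the rest of the
# system consults before using a block.

BLOCKS, BLOCK_SIZE = 8, 64
memory = [bytearray(BLOCK_SIZE) for _ in range(BLOCKS)]
stuck_bits = {3: (10, 0x01)}   # pretend block 3 has bit 0 of byte 10 stuck low

def write_block(idx, data):
    memory[idx][:] = data
    if idx in stuck_bits:              # model the hardware fault
        offset, mask = stuck_bits[idx]
        memory[idx][offset] &= ~mask & 0xFF

def self_test():
    """Write patterns to every block, read them back, return failing blocks."""
    bad = set()
    for pattern in (b"\x55" * BLOCK_SIZE, b"\xAA" * BLOCK_SIZE):
        for idx in range(BLOCKS):
            write_block(idx, pattern)
            if bytes(memory[idx]) != pattern:
                bad.add(idx)
    return bad

bad_blocks = self_test()
usable = [i for i in range(BLOCKS) if i not in bad_blocks]
print("retired blocks:", sorted(bad_blocks))   # -> [3]
print("usable blocks:", usable)
```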

In some cases, such as the Voyager 1 flight data system anomaly that was finally worked around in April 2024, clever engineers can figure out what went wrong and how to get past the problem. In Voyager 1's case it took them about five months, but considering that the craft is some 24 billion kilometers away and was designed in the 1970s, that's pretty impressive. Fortunately, errors that require this level of intervention are less frequent.

There are lots of techniques along these lines that can help to protect electronics, but most of them can be overwhelmed if the errors occur too frequently or if the self-repair techniques don't repair quickly enough.

Total Ionizing Dose is a good proxy for how frequently errors are likely to occur and could help an electronics engineer figure out if their hardware's various self-repair techniques would be overwhelmed.

As for why chips used in space are radiation-hardened, the reason is simply that none of the self-repair technologies developed for the commercial market are particularly foolproof, since it is relatively easy to swap out parts and back up data down here on Earth. Space is a different environment with different rules. The term "radiation hardened" really just means creating electronics that are better adapted to these different rules.

In practice, a company can design a chip that makes liberal use of all the self-repair technologies we know of, and even invent some new ones. Engineers then need to do a lot of testing in a synthetic radiation environment to verify that the chip is robust, diagnose why it fails despite their best efforts to make it bulletproof, and try again. Repeating this process many times leads to a radiation-hardened chip, but it will be an expensive and time-consuming project, and there will be very few customers over which to amortize the Research, Development, Testing, and Evaluation (RDT&E) costs.

So, a radiation-hardened computer can end up being crazy expensive (for example, $338,000 for a RAD750 which was used in JWST) and probably will not be very state-of-the-art when it comes to performance and modern features (other than self-repair features). But, it may well be worth it to reduce the overall risk of mission failure.

I think that all three of the hypothetical reasons you listed are correct.

TID tolerance is correlated with a chip's overall error resilience (as explained above).

Extra mass for shielding can be incredibly expensive (around $1.2 million per kg for one-way trips to the Moon and Mars). And because of cosmic-ray interactions with shielding materials, shielding can cause a single incoming particle to generate a whole shower of secondary particles, where before there would have been only one interaction per cosmic ray. So the benefit of a limited amount of shielding is not clear-cut.

Particle shower from a 100 GeV proton
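To put the shielding-versus-hardening trade in perspective, here is a back-of-the-envelope calculation using the ~$1.2 million/kg figure above; the avionics-box dimensions are an assumption for illustration only:

```python
# Rough cost of adding one extra centimetre of aluminium around an assumed
# avionics box, using the delivered-mass price quoted above.

AL_DENSITY_G_CM3 = 2.70
COST_PER_KG_USD = 1.2e6            # ~$1.2M per kg delivered, as quoted above

box_cm = (30, 20, 20)              # hypothetical avionics box dimensions, cm
extra_thickness_cm = 1.0

l, w, h = box_cm
surface_cm2 = 2 * (l * w + l * h + w * h)
extra_volume_cm3 = surface_cm2 * extra_thickness_cm
extra_mass_kg = extra_volume_cm3 * AL_DENSITY_G_CM3 / 1000.0
print(f"extra shielding mass: {extra_mass_kg:.1f} kg")
print(f"delivery cost: ${extra_mass_kg * COST_PER_KG_USD / 1e6:.1f}M")
# -> ~8.6 kg and ~$10M, far more than the ~$338k rad-hard computer
#    it would be protecting.
```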

A requirement such as "Shall survive worst-case space weather" is very likely to make it onto a product requirements document, which means the selected part will almost certainly be overdesigned for typical space weather conditions.


So a more pragmatic answer to "why can the chips withstand so much more than the lunar mission needs" is that they are using chips with flight heritage. In particular, a lot of that flight heritage comes from satellites which, instead of a 90-day mission to the Moon, spend 15 years in GEO or, worse, MEO. The TIDs for those missions are crazy high compared to a three-month mission to the Moon.
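As a rough scaling check using only the dose rates quoted in the comments above (roughly 1500–3000 rad/year behind 1.5–3 mm of aluminium near the Moon), even that relatively benign environment accumulates tens of kilorads over a 15-year design life, and the trapped-radiation environments in GEO and MEO are harsher still:

```python
# Scaling check reusing only the dose-rate range from the comment thread above.
# Treat the result as a lower bound: GEO and especially MEO doses are higher.

DOSE_RATE_RAD_PER_YEAR = (1500, 3000)   # from the comments above
MISSION_YEARS = 15                      # typical GEO comsat design life

for rate in DOSE_RATE_RAD_PER_YEAR:
    total_krad = rate * MISSION_YEARS / 1000
    print(f"{rate} rad/yr x {MISSION_YEARS} yr = {total_krad:.1f} krad")
# -> 22.5 krad and 45.0 krad, before any margin for solar events or the
#    harsher trapped-particle environment.
```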

  • Exactly. The market for rad-hard chips is already super small (a typical smartphone does not need any), so it makes little sense to fragment it further by offering multiple versions of the same thing and then selling just a few dozen pieces of each at most.
    – TooTea
    Commented Jun 17 at 10:38
