
TL;DR: Can humans build a computer that lasts for a few hundred years under active maintenance with a limited spare part supply, is not too bulky or power hungry, and has performance similar to a high-end desktop PC in the near future?

I'm aware of similar questions on this site, like Preserving electronics for hundreds of years and Computers lasting for centuries. However, I couldn't come up with a reasonable design based on the information they provide. The former requires regular consumer electronics (which we don't need here). The latter offers some good ideas, but they are mostly for large bases on Earth.

Space faring robot functioning for 1G years is also a nice source of ideas, but it doesn't treat the machine's computing power as a priority, while we do.


Full version with background

Let's say people are sending a space probe to a nearby solar system (think Project Longshot, but with a longer trip). The trip will take roughly 300 years.

Since the communication delay will be multiple decades, the probe will have to handle any situation, during the trip or inside the target system, on its own. So we have to install a powerful computer on the probe to make it smart enough, and keep it running throughout the trip (or at least preserve it in a state from which it can be woken at any time).

Here I'm making numbers up. Let's assume the computer's total performance is comparable to that of a high-end desktop PC today (e.g. i7-12700K + RTX 3080, in any measurement you like (TFLOPS?); details can be adjusted), and that its tasks can be distributed/parallelized however you want. It can be custom-built using technology available in the near future (say, in the 2070s), and cost isn't a problem.

We also assume that any reasonable preservation environment is possible (proper heat management, vacuum, noble gas, constant temperature, radiation shielding, etc.). We can also have some spare parts supplied on the probe, with reasonable ways to use them. Robots that do simple maintenance can be available if needed (how to preserve them is not discussed here).

There are limitations on mass and power because this is an interstellar space probe. The computer and its shielding should weigh no more than a few tons so that the probe can fly fast enough. Since it relies on the probe's reactor, its power consumption should be less than 100 kW. The lower the mass and power consumption, the better.

Are there reasonable explanations/designs for the setting above? What would the computer look like, and what should be considered when designing it?

Rough ideas are perfectly okay! I'm not really designing an interstellar computer; I'm looking for a technical starting point from which to design other parts of the story.

Some limitations I'm aware of:

  • Diffusion and other degradation within semiconductor chips that make transistors unusable. Running the computer continuously for a few hundred years makes matters worse. Is this a big problem?

This is my first time asking a question here, so please tell me if I did anything wrong.


Edit: explained more clearly what kind of answers I'm hoping for.

  • Vacuum is not a preserving environment for computers; it makes heat management more difficult. – L.Dutch (Nov 3, 2022)
  • I'm an electrical engineer and I'd like to draw your attention to the following Meta post: Advice concerning questions asking HOW to implement a technological procedure or device. Why? Because if I could answer this question with any credibility, I'd be working under NDA for Elon Musk or Richard Branson rather than posting the answer here. You need to set your expectations for the kinds of answers you'll receive: they'll be general, not specific. Why are you asking? What worldbuilding problem are you trying to solve? – JBH (Nov 3, 2022)
  • @JBH Thanks for your advice! I'm just looking for a reasonable explanation for the setting and a rough technical idea of the computer (should it be a nicely protected, advanced, space-grade computer, or a massive array of 10000 8086s connected in parallel?) for other settings in the story. (Nov 3, 2022)
  • :-) Ignoring the reality that an 8086 would have a bit of trouble long-term in space for a lot of reasons, it might be more valuable to help us understand what you felt was lacking in the questions you linked. It's reasonable to explain what methods are used to protect space-borne electronics today and what engineers do to estimate their lifespan, but getting down to how, for example, we protect circuits from substrate diffusion or routing metal becoming brittle, or how we dope casings to keep them from boiling away in space, is a pretty low-level detail few would understand or appreciate. – JBH (Nov 3, 2022)
  • Another point: do you really need that much computation power? For navigation purposes, what you are proposing is overkill. We reached the Moon with a computer less powerful than the one in today's oven. Using a few-nanometre process for your processors is not something you want to do on a centuries-long journey. – Negdo (Nov 4, 2022)

8 Answers

Answer (score: 20)

Let's assume the computer needs to be roughly the same performance as a high-end desktop PC (e.g. i7-12700K + RTX 3080; can be adjusted). It can be custom-built using technology available in the near future (say in the 2070s),

This is a surprisingly tricky thing that depends on a lot of factors. A major problem with anything in space is radiation: there's no atmosphere to attenuate it, and science-fictional environments often add an awful lot of highly penetrating particle radiation, whether from nuclear reactors (or weapons) or from space dust hitting a relativistic spacecraft.

The 12700K uses a 10 nm process, which means it has very small features; you can fit a lot of them on a chip, and they work at convenient power and heat levels given the chip's speed and capabilities. The problem is that those little features are vulnerable in a high-radiation environment... a single neutron or HZE particle ploughing through the chip will likely damage many of them, and when enough have been damaged the chip will simply stop working.

Compare with a real-world radiation-hardened CPU, the RAD750. It uses a CPU design that's 20 years older and a process that produces features 15 times larger or more. It has a tenth of the clock speed, and only a single core. Those bigger, simpler components and a careful choice of materials mean the RAD750 is substantially more robust to background radiation... and this significantly improves its longevity.

I posit that your space probe will carry hardware that is simple and very overbuilt, and, because of the continuous risk of random, permanent damage from galactic cosmic rays (which are effectively unshieldable without an enormous amount of mass), highly redundant.

In part, this will involve large numbers of simpler devices rather than a small number of more capable ones. They'd also be more likely to be things like FPGAs (field-programmable gate arrays), which you can think of as a sort of reconfigurable microchip: not as capable as a purpose-built system, but much faster than software, and reconfigurable to do the jobs of many different kinds of microchip. That way you can have a large number of common devices which can be freely mixed and matched, and a standard replacement part that can be used for almost any purpose. Even if an individual radiation-hardened FPGA receives serious radiation damage, it can be reconfigured to route around the damaged areas and continue to be useful in a way that a normal CPU could not.
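Here's a toy Python sketch of that "pool of interchangeable modules" idea. Everything in it is invented for illustration (the `Module` class, the role names, the `load_bitstream` stand-in); real FPGA reconfiguration is far messier, as the comments below point out.

```python
# Sketch: a pool of identical reconfigurable modules, any of which can be
# loaded with any role's bitstream. When one fails, a healthy spare is
# simply reprogrammed to take over. All names here are hypothetical.

ROLES = ["navigation", "comms", "science", "power_mgmt"]

class Module:
    def __init__(self, module_id):
        self.module_id = module_id
        self.healthy = True
        self.role = None              # which bitstream is currently loaded

    def load_bitstream(self, role):
        self.role = role              # stand-in for real reconfiguration

def assign_roles(pool):
    """(Re)assign every role to some healthy module; the rest idle as spares."""
    healthy = [m for m in pool if m.healthy]
    if len(healthy) < len(ROLES):
        raise RuntimeError("not enough healthy modules left")
    for module in pool:
        module.role = None
    for role, module in zip(ROLES, healthy):
        module.load_bitstream(role)

pool = [Module(i) for i in range(8)]   # 4 roles + 4 interchangeable spares
assign_roles(pool)
pool[1].healthy = False                # radiation kills module 1...
assign_roles(pool)                     # ...and a spare quietly takes over
print({m.module_id: m.role for m in pool})
```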

So:

  • computer that lasts for a few hundred years

Potentially, yes.

  • under active maintenance with a limited spare part supply,

Yes. With clever system architecture this can be made easier: hardware that can "self-heal", or at least be repurposed before needing replacement, plus a common replacement module usable for a wide variety of tasks, makes maintenance easier and lets the system degrade gracefully rather than suddenly being broken forever.

  • not too bulky or power hungry,

"Not too bulky" is hard to quantify. On the other hand, being low power is almost a requirement for something that needs to last a long time in space, because high power means high heat and that wears things out.

  • and has the performance similar to a high-end desktop PC in the near future?

I don't think it will be directly comparable. The total amount of "compute power" is likely to be higher, but it isn't necessarily going to be in a form where you could run the same kind of application on both. Advances in software and technology are pretty unpredictable, but I'd expect the future computer to give you a lot more bang per buck than today's equivalent, even if it were otherwise similar in raw clock cycles or I/O bandwidth.

  • So basically the computer will look like a lot of simpler chips interconnected into a large array, and each of them can repurpose itself for more specific tasks. Am I right? (It reminds me of cells inside an organism.) – RyncoMaekawa (Nov 4, 2022)
  • @RyncoMaekawa As an embedded software engineer working with FPGAs, I don't really buy into the "repurposing" idea so much. Avoiding damaged sections (and keeping that avoidance from having an impact) is seriously non-trivial. But it's perfectly possible to split your processing over a bunch of redundant cores and your storage over a bunch of redundant RAM/flash chips, and there are voting schemes which make that work. This has been used in fly-by-wire controls for aircraft for 30 years or so, because whilst you can turn a spacecraft off and on again, that doesn't work well with planes. – Graham (Nov 4, 2022)
  • @Graham Eh, it is clearly technically possible, to the point where you could make an (expensive, limited) proof of concept using present-day technology, and the OP's timeline allows for 50 years of technological advancement. 50 years ago we were still using core memory in spacecraft, for example. (Nov 4, 2022)
  • @Graham See also Radiation Tolerant, FPGA-based SmallSat Computer System... "real-time partial reconfiguration to provide increased performance, power efficiency and radiation tolerance". (Nov 4, 2022)
  • @Michael The problem is a remarkably difficult one, but also one on which there's a lot of existing research. Shielding against HZE radiation is exceptionally challenging for a number of reasons, too many to fit into a comment ;-) (Nov 6, 2022)
Answer (score: 12)

In real life, humanity launched Voyager 1 in September 1977, and it is still performing reasonably well as of this writing (November 3, 2022: 45 years and still going). It is very feasible and realistic to assume that a civilization capable of interstellar travel could one-up the scientists of the 1970s by at least an order of magnitude.

A few things to consider:

  1. Have backups. The Voyager probes each have more than one onboard computer, so if one fails, another covers for it. This is not just space vessel design: nuclear power plants and airplanes also have multiple computers handling their main functions, in case one fails. (A quick sketch of the math follows this list.)

  2. Solid state is the way to go. The fewer moving parts your computer has (fans and discs, for example, because they spin), the less likely mechanical wear is to become an issue. You probably have a mostly solid-state computer (except for the speakers and vibration motor) within reach right now in the form of a smartphone, so this should be a no-brainer for the spaceship engineers.
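For item 1, here is the back-of-the-envelope math on how fast independent backups drive up survival odds, as a Python sketch with toy numbers. The big assumption, flagged in the comments below, is that failures are uncorrelated:

```python
# Probability that at least one of n independent backups survives the
# mission, if each unit survives with probability p. Assumes failures are
# uncorrelated, which common-mode aging (see comments) can easily break.

def p_at_least_one_survives(p_unit, n_units):
    return 1 - (1 - p_unit) ** n_units

for n in (1, 2, 3, 5, 10):
    print(n, round(p_at_least_one_survives(0.5, n), 4))
# Even 50%-reliable units reach ~96.9% with 5 copies and ~99.9% with 10.
```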

As for computing power, assume Moore's Law still holds, or that a breakthrough makes things even better. That's what most sci-fi writers implicitly do. In Old Man's War, people have fully aware AIs running on a small chip in their brains, and it's set in roughly the early 2100s.

  • Backups are key. So long as it's not outright impossible for a component to survive to some point in time, you can make the probability of at least one surviving as close to 100% as you like just by including enough of them. Even if a probe has a 99% chance of failure before completing its mission, sending 1000 of them yields a 99.995% chance that at least one succeeds. – NuclearHoagie (Nov 3, 2022)
  • I don't know that assuming Moore's law holds is valid, since there is now an active debate as to whether it holds even now, in 2022. – jdunlop (Nov 3, 2022)
  • @jdunlop For science fiction, I think it's more important that your readers assume computers will become far more powerful and compact over the years than whether they actually do. – Cadence (Nov 3, 2022)
  • @NuclearHoagie That assumes failures are uncorrelated events. However, aging failures tend to be common-mode: corrosion, for example, or materials degrading with time. If the plastic IC package disintegrates after 200 years, then the whole thing dies at that time, regardless of the number of backups. – user71659 (Nov 4, 2022)
  • This assumes that the failure rate from cosmic rays has stayed constant since Voyager. However, in order to adhere to Moore's law we have been forced to miniaturize, which makes silicon much more susceptible to cosmic rays. – Aron (Nov 6, 2022)
Answer (score: 11)

You will have a problem with circuit aging. This is a well-known problem in chip manufacturing, and there is no known solution.

Basically, a running computer has electricity flowing through its transistors, and due to a phenomenon called electromigration, the structure of the different materials in a transistor slowly degrades: atoms of one material are carried from one side to the other by the flow of electrons. This eventually slows down the switching speed of the transistor. It gets worse with higher voltages, higher frequencies, and higher temperatures. A good cooling system should severely limit the temperature, but that will not address the voltage and frequency wear. (Edit: I was incorrect about space; apparently satellites use highly thermally conductive material to move heat away from hot ICs into a larger panel, which radiates it away at infrared frequencies.)
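Electromigration lifetime is classically estimated with Black's equation, MTTF = A * J^-n * exp(Ea / (k*T)). A minimal Python sketch; the constants below are illustrative textbook magnitudes, not data for any real process:

```python
import math

K_EV = 8.617e-5   # Boltzmann constant in eV/K

def black_mttf(j, temp_k, a=1.0, n=2.0, ea_ev=0.9):
    """Black's equation: MTTF = A * J**-n * exp(Ea / (k*T)).
    j is current density in arbitrary units; a, n, and ea_ev are
    illustrative constants, not real process data."""
    return a * j ** (-n) * math.exp(ea_ev / (K_EV * temp_k))

# Relative lifetime gain from running cooler at the same current density:
print(black_mttf(1.0, 300) / black_mttf(1.0, 350))   # ~145x longer at 300 K
# Halving current density at a fixed temperature also helps (n=2 -> 4x):
print(black_mttf(0.5, 300) / black_mttf(1.0, 300))   # 4x
```

This is why the points about cooling and about reduced voltage/frequency both matter: temperature enters exponentially, current density polynomially.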

Basically, the chip will fail: if a circuit cannot complete its task within the cycle allotted to it, that can cause a crash.

This can be addressed with backup circuits: unused duplicate (or triplicate) circuits on the chip that can automatically kick in and replace a failing circuit. Or with whole backup chips: when one gets iffy, we disconnect it and switch to a new one.

You can also cut power to the CPU except for brief intervals of operation. In most mission modes, there is no reason to be running the CPU for more than a few seconds per minute, or even per hour, in interstellar space.

Just be aware that transistors do age and fail; many modern, fast consumer CPUs are simply no longer designed to run continuously for decades. Power management is not just about saving energy: even with unlimited energy, severely reducing power and frequency (and thus performance) when we don't need them also extends the lifetime of the device.
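Putting the last two paragraphs together, some rough duty-cycle arithmetic (numbers invented purely for scale):

```python
# If the CPU only needs power for a few seconds per minute during cruise,
# active wear accumulates over a small fraction of the mission.

active_s_per_min = 5
duty_cycle = active_s_per_min / 60            # ~8.3% of the time powered
mission_years = 300
active_years = mission_years * duty_cycle     # ~25 years of actual runtime
print(f"duty cycle {duty_cycle:.1%}, ~{active_years:.0f} years powered on")
```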

  • Comments are not for extended discussion; this conversation has been moved to chat. – L.Dutch (Nov 7, 2022)
Answer (score: 4)

Cluster many similar computers; avoid single points of failure

https://en.wikipedia.org/wiki/Single_point_of_failure

https://en.wikipedia.org/wiki/Computer_cluster

https://en.wikipedia.org/wiki/High-availability_cluster

With some effort in software design, it is possible to implement complicated shipboard systems like life support or navigation as distributed tasks: designated computer nodes receive commands and dispatch them to other computer nodes, which may or may not still be available. If a node is not available (anymore), the task is dispatched to one that still is.
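A minimal Python sketch of that dispatch idea. The `Node` class, the health model, and the task name are all hypothetical; a real system would add heartbeats, timeouts, and consensus over who gets to dispatch.

```python
# Dispatch a task to the first node that still responds; nodes that have
# died mid-voyage are skipped. Everything here is a toy stand-in.

class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True

    def run(self, task):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{task} done on {self.name}"

def dispatch(task, nodes):
    for node in nodes:
        try:
            return node.run(task)
        except ConnectionError:
            continue                  # node gone: try the next one
    raise RuntimeError("no nodes left in the cluster")

cluster = [Node(f"node{i}") for i in range(4)]
cluster[0].alive = False              # a node has died mid-voyage
print(dispatch("navigation_update", cluster))   # node1 picks it up
```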

Redundant functionality

For maximum reliability, your computer cluster consists of X individual computers, all capable of doing the task. So your system will live as long as its longest-lasting computer.

Keep lots of computers in store, to be merged into the cluster later on

You could keep some computers in reserve, to be switched on and merged into the cluster later; when to do this is decided by the existing cluster nodes. (Tom's remark below made me avoid using a timer for this, since that would introduce a point of failure.) The "fresh" computers subscribe to the cluster and take over critical tasks, like dispatch.

Mind sustainability and durability, use a fridge

Even when a computer is switched off, it can still deteriorate. Remarks by user71659: to preserve wire insulation and support components, consider using ceramics or glass instead of plastics; and for materials such as plastics, it helps to store reserve components in cryogenic conditions.

Research properly before deciding the size of the cluster

Test it thoroughly. Set up a cluster of, say, 1500 nodes. Run long, heavy tasks on it, and every day destroy a few of the computers, checking that the cluster keeps working in every foreseen circumstance. Use material analysis and statistics to predict failure/shut-off for individual nodes. Then determine the failure statistics out to 500 years and beyond, and with the data gathered, dimension your computer cluster. In fact, this research will be the main part of your project.
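This is exactly the kind of question a Monte Carlo model answers. A toy sketch with an invented failure model (exponential node lifetimes with a 100-year mean; real electronics won't fail this conveniently):

```python
import random

# Toy cluster-sizing Monte Carlo: draw each node's lifetime from an invented
# exponential distribution, then ask how often at least one node outlives
# the mission horizon.

def p_cluster_survives(n_nodes, horizon_years, mean_life_years, trials=5000):
    ok = 0
    for _ in range(trials):
        lifetimes = (random.expovariate(1 / mean_life_years)
                     for _ in range(n_nodes))
        if max(lifetimes) >= horizon_years:
            ok += 1
    return ok / trials

for n in (100, 500, 1500):
    print(n, p_cluster_survives(n, 500, 100))
# With these toy numbers: ~49% at 100 nodes, ~97% at 500, >99.9% at 1500.
```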

  • Instead of a timer (which produces false verdicts when real lifetime varies from prediction), use redundancy: two nodes active; as long as they agree, they're fine. When they disagree, wake a new node and retire whichever of the first two disagrees with the new node. Repeat every time a disagreement occurs, until you run out of replacement nodes. This should only go awry if both active nodes have their first failures at the exact same time, which is extremely unlikely. – Tom (Nov 4, 2022)
  • The problem is common-mode aging failures. If, for example, the plastic IC package falls apart 200 years after manufacture even when idle, then the reserve computers will all die shortly thereafter. – user71659 (Nov 4, 2022)
  • @user71659 Maybe some of these issues could be overcome by using cryogenics to preserve components? – Goodies (Nov 4, 2022)
  • @Tom Yes, that's a better solution; I will change my answer accordingly. – Goodies (Nov 4, 2022)
  • @Goodies Yes, I assume plastics age less at cryogenic temperatures. Another solution may be replacing plastics with glass and ceramics. Mil-spec electronics used to do that, but I think those solutions are rarer these days. – user71659 (Nov 5, 2022)
Answer (score: 2)

Look not to the future, but to the past.

Current computer tech optimizes for smaller, faster, lower-power computers. The primary mechanism to achieve this is making the silicon features smaller, which is terrible for long-term space reliability. Ideally, you want your probe made out of huge chunks of silicon, so that any stray particles can only do limited damage to your device.

For reference, consider the Curiosity rover (launched 2011): it needs to do far more complex tasks than an orbital probe, but it gets by with 256 MB of RAM and a 200 MHz clock speed. NASA's budget was clearly large enough to afford a more capable chip (even when the design process started), but the rover fundamentally did not need one.

As a general rule, you make things reliable by making them slow, large, and simple. High clock speeds and small features are great for running desktop applications, but they're a liability in space. This will not change with time: it is a fundamental fact that smaller features are more radiation-sensitive. The only real room for advancement is greater redundancy, e.g. ECC memory, but even the best error-correcting code will be fried by a well-placed neutron to the memory controller.

What you really want is essentially five or six identical Voyager-style computers crammed into one case, spending most of their lives in sleep mode. For operations, three cores should be active under some kind of voting system, with failing nodes swapped out for trustworthy ones.
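A Python sketch of that voting scheme. The core names and the bare majority rule are invented for illustration; real flight systems add lock-step execution, watchdogs, and self-test on top.

```python
from collections import Counter

# Triple modular redundancy: three active cores vote on each result, and a
# core that disagrees with the majority is retired in favour of a cold spare.

def vote(outputs):
    """Return the majority answer among the active cores."""
    answer, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no two cores agree")
    return answer

active = ["coreA", "coreB", "coreC"]
spares = ["coreD", "coreE", "coreF"]

outputs = {"coreA": 42, "coreB": 42, "coreC": 41}   # coreC misbehaves
result = vote(list(outputs.values()))               # majority says 42

for core in [c for c, out in outputs.items() if out != result]:
    active.remove(core)                             # retire the dissenter
    active.append(spares.pop(0))                    # wake a spare
print(result, active)
```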

It's likely hopeless to attempt to shield your computer from radiation. There's probably a trade-off point at which shielding becomes worthwhile, but if the shielding weighs more than just adding three redundant systems, you may as well add the redundancy.

Mind you, redundancy does not solve everything: there was a famous incident in which the failure of a single switch caused all four redundant flight computers of an Airbus A320 to fail. One way to mitigate this might be to have two dissimilar, independent computer systems on board, developed by two separate teams; with any luck, any holes in the logic of one team will be caught by the other.

Answer (score: 1)

Two centuries ago, nobody would have been able to conceptualize how a device like a smartphone could work using software. Similarly, technologies may come along later that we cannot think of now.

  • As an engineer, if I had to design such a spaceship, I'd look for very sensitive materials that can absorb various kinds of radiation. Rather than shielding the ship from radiation, I'd create layers that actively absorb radiation to power the ship. It would also need a mechanism to shed waste/excess radiation.
  • The computing circuitry need not be static in the way we currently design electronics. It should be designed to evolve and adapt, while retaining the ability to remember/honor the original purpose programmed into it.
  • Communication with Earth does not have to take multiple decades. If a wormhole or similar technology could be created, communication could happen instantly. This would also mean that a maintenance crew could easily reach the ship.
Answer (score: 1)

It is possible to imagine an FPGA that detects damage and heals itself by rerouting the damaged functionality over free reserve cells. It would stay operational as long as there are still free reserve cells to repair the damage. Parts that are not yet damaged would stay in use, and the same reserve cells can be used to replace various failed parts. Hence, in the long run, a system of the same complexity may be able to take more punishment than a merely redundant system, in which a whole unit fails if anything inside it breaks.
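A toy Python sketch of that bookkeeping, with hypothetical function and cell names. The genuinely hard part (actually rerouting the logic) is waved away in a single dictionary update:

```python
# Each logic function occupies one cell; when a cell is damaged, its
# function is remapped onto a free reserve cell. The system only dies once
# the reserve pool is exhausted.

function_to_cell = {"adder": 0, "uart": 1, "timer": 2}
reserve_cells = [3, 4, 5]

def handle_damage(cell):
    for function, mapped in list(function_to_cell.items()):
        if mapped == cell:                    # that cell held a function
            if not reserve_cells:
                raise RuntimeError("reserve cells exhausted")
            function_to_cell[function] = reserve_cells.pop(0)

handle_damage(1)                  # radiation hit: "uart" moves to cell 3
print(function_to_cell, reserve_cells)
```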

The system may still have two or more redundant brains, so that a surviving brain can work out how to reroute and fix the other programmatically.

Answer (score: 0)

The first digital computers were vacuum-tube units. A 'bit' was the vacuum-tube equivalent of a flip-flop, which is normally built with two transistors.

But they knew that vacuum tubes break, so the circuit was designed with 7 tubes, set up in such a way that ANY two could fail and the circuit would still work. And each tube had an indicator light (in duplicate).
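The arithmetic behind that choice, as a quick Python check (the per-tube survival probability is invented for illustration):

```python
from math import comb

# The 7-tube bit works as long as at least 5 of its tubes do. If each tube
# independently survives a service interval with probability p:

def p_bit_works(p, n=7, need=5):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(need, n + 1))

print(p_bit_works(0.90))   # 90% per tube -> ~97.4% for the whole bit
```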

So a bit was a box the size of a small shoebox. A bunch of them fit in a file-drawer-style cabinet, and indicator lights on the front showed if one had a problem.

Despite this, there was a whole group of guys with shopping carts full of tubes running around and replacing tubes on the fly.

So you build in redundancy. Everything is in multiplicate, with multiple interconnections.
