18
$\begingroup$

In a sci-fi project, bioengineering is now a fashionable and profitable industry. A couple of bioengineers decide to play silly buggers with the exobiologists downstairs by cobbling together an artificial life form and claiming it is a naturally-evolved alien animal.

They do this, not by extracting the DNA from some luckless beasts and splicing it together, but by using nanotech to assemble amino acids into artificial DNA, which is then used as a starter with which to grow the “alien” in a synthetic womb.

However, the exobiologists are not dumb (they know what the bioengineers are like after a few beers), and so are less than convinced that this creature is an alien. They decide to check its genetics to see if they’re artificial.

But how can they do this? How can you tell an artificially-constructed sequence of DNA from a natural one?

$\endgroup$
13
  • 21
    $\begingroup$ Look for the "Registered Trade Mark" symbol on the genes. $\endgroup$ Commented Feb 11, 2023 at 14:34
  • 1
    $\begingroup$ DNA produces proteins, and only proteins. It does not produce 'life forms'. Assembling DNA together only produces a pile of different proteins. Whether or not these 'proteins' actually do anything is speculative, at best. $\endgroup$ Commented Feb 11, 2023 at 14:42
  • 1
    $\begingroup$ @JustinThymetheSecond: In fact, DNA doesn't even produce proteins. Our biochemistry has mechanisms to create proteins, and some of these mechanisms involve using DNA as a sort of storage layer for descriptions of the proteins to create; but if you just put a bunch of DNA in a test tube you'll never get proteins out of them. $\endgroup$
    – ruakh
    Commented Feb 12, 2023 at 0:54
  • 2
    $\begingroup$ @JustinThymetheSecond: To clarify: I agree that DNA isn't enough to give you life forms. My point is just that it's not enough to give you proteins, either. The only reason that there's a mapping from DNA to proteins is that our biochemistry uses DNA to store information about the sequences of proteins to build. $\endgroup$
    – ruakh
    Commented Feb 12, 2023 at 1:48
  • 1
    $\begingroup$ The exobiologists turn to the demonologists in the basement, to check if the Alien has a soul. The demonologists love to mess with the guys in bioengineering, so they join forces with the chemists from floor 6 to pull a prank on the bioengineers, which turns into another disaster, but nobody notices because all the while the Alien starts multiplying and eating management. $\endgroup$ Commented Feb 13, 2023 at 9:10

14 Answers 14

35
$\begingroup$

A couple of bioengineers decide to play silly buggers with the exobiologists downstairs by cobbling together an artificial life form and claiming it is a naturally-evolved alien animal

A DNA-based alien lifeform's DNA could theoretically be impossible to classify as "natural" or "artificial", because DNA is DNA (the likelihood of an alien lifeform being based on the same DNA and DNA code as humans is small, but that's a moot point once the alien critter is delivered).

So, a DNA sequence that would fool the exobiologists could exist. Trivially, if the bioengineers had happened by chance upon the same DNA sequence as a natural alien critter, they would have the same sequence, which by construction cannot be distinguished from the identical sequence occurring in nature.

The question therefore is: what mistake did the bioengineers make?

There are several classes of possible mistakes.

The simplest to explain is the presence of "markers" of the DNA assembling technique used (for example, both CRISPR-Cas9 and CRISPR-Cpf1 techniques rely on the presence of short sequences called protospacer adjacent motifs, and leave recognizable 'telltales'). If the DNA assembler used by the bioengineers is advanced enough that it can generate any DNA sequence whatsoever, these telltales will be missing.

Then, the DNA code is inefficient (or redundant, if you prefer), and presents "synonyms". Different DNA sequences yield the same meaning, and are equivalent, but they are not equally efficient, easy to synthesize or work with. So it is possible that the bioengineers' tool automatically employed codon optimization, and/or employed it differently from what a real living organism would have done. This kind of telltale is subtler and might escape a bioengineer.
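To make the "synonyms" point concrete, here is a minimal Python sketch. The codon table is only a small excerpt of the standard genetic code, and the two sequences and the names `translate`, `natural_looking` and `optimized` are made up for illustration; they are not taken from any real organism or tool.

```python
# Excerpt of the standard genetic code (DNA codons); only what this sketch needs.
CODON_TABLE = {
    "ATG": "M",
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "AAA": "K", "AAG": "K",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TAA": "*", "TAG": "*", "TGA": "*",  # stop codons
}

def translate(dna):
    """Translate a coding sequence into one-letter amino acids, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical sequences: same protein, different codon choices.
natural_looking = "ATGTTAAAGGGTCTTTAA"
optimized = "ATGCTGAAAGGCCTGTAA"

print(translate(natural_looking), translate(optimized))
```

Both sequences give the same protein even though the DNA differs at several positions, and that difference in codon choice is exactly what an analyst could pick up on.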

Naturally evolved DNA, also, is a horrible, jumbled mess. Unless something occurs that actively selects against a given junk sequence remaining as a leftover in the DNA of an organism, that junk sequence will remain there, world without end. This "junk" is actually in some way functional (it acts as a scaffolding, of sorts), and is generated from existing genes from past generations, so it's not random and can be traced to specific sections of the DNA. Let us say that a DNA sequence spells the words "WHAT HATH GOD WROUGHT", you're likely to actually find "aaaqWHATxqaatathqHATHxsyndqGODxododwjehovroodwroqWROUGHTxjoe". Ancient versions of similar genes will be interspersed between the "active" genes introduced by 'letters' q and x.

Having no significant junk DNA, random junk DNA, or unrelated junk DNA would then be a very strong cause for suspicion. Engineered, "made" DNA is likely to be too efficient, too well laid out, or haphazard in the wrong way.

Finally, even if everyone is a bioengineer, still creating a working alien organism completely from scratch (and having it come alive! - which means thousands of the newly created genes actually working together successfully) would be too enormous an enterprise. It would be more difficult than, say, writing a word processor from scratch starting from the level of the assembly language.

The overwhelming chances are that the bioengineers would cobble together some library gene soup based on actual genes tweaked and redesigned by other people and found to be working. In the programming metaphor, they would employ a solid if basic language such as C, a working system library such as glibc, a proven compiler. And probably recycle some existing word processor fragment. The result, if properly inspected, would reveal traces of all this.

Once the exobiologists find that the respiratory regulation gene AL3X7-P5 of their alien critter was published ten years earlier in an answer on Gene Overflow, the game would quickly be up.

$\endgroup$
1
  • 1
    $\begingroup$ Good answer. Like how synthetic diamonds have less/different flaws than real ones. $\endgroup$
    – Kingsley
    Commented Feb 14, 2023 at 4:59
25
$\begingroup$

First of all - I love the premise of this question that it's a bunch of scientists pranking each other and that beers were involved.

The answer given by Blue Skin I think is technically correct....

That is the individual Molecule chains would be indistinguishable...

However, as you say, the Exobiologists aren't dumb - whilst the individual DNA Molecules wouldn't give the game away, I think a pretty convincing argument can be made that they would be able to make a reasonable determination that the DNA sequences weren't naturally occurring.

Here's my reasoning:

In Nature, there are similarities between closely related creatures (think for example the commonality of the Great Apes), there's also legacy Genetic markers from creatures that once shared a common ancestor, but ended up in different, divergent geographic locations.

In their quest to make this new animal, the bioengineers would surely end up using genetic sequences that make no sense in the wider context of the animal's Genome.

For example, we want to make a flying fish - so we take some Fish DNA and splice it with some DNA to give wings (I'm just spitballing an example here) - however, the Wings that we use are from a bird that has no common ancestor to the fish we are using.

These sorts of inconsistencies would alert the Exobiologists that something wasn't quite as it would seem.

Other things might be the lack of vestigial appendages/organs/features - Evolution is pretty damn stupid - Richard Dawkins made a comment on this back in the good old days of internet Atheism vs Internet Intelligent Design - that certain designs on the Human body, if you were designing from scratch, are really stupid. Those same designs, however, if assumed to be evolved from a Fish, make perfect sense.

TL;DR - The actual DNA molecules would be indistinguishable, but the Genome as a whole I think would give the game away.

$\endgroup$
4
  • $\begingroup$ +1 To support this answer, I can look at this like an EE. I can look at natural lightning, and I can look at the discharge from a Van de Graaff generator. Look only at the electrical discharges (assuming the same power levels), and you can't tell the difference. But look at the quantity of individual discharges, where they're grounding, what's sourcing them, etc. In other words, look at the context of the lightning, and it's trivial to discern natural from artificial. I must assume that any scientist who could stitch together raw DNA/RNA to create life could do the same without effort. $\endgroup$
    – JBH
    Commented Feb 11, 2023 at 0:45
  • 1
    $\begingroup$ DNA won't help because you don't have any other aliens to compare to. $\endgroup$
    – John
    Commented Feb 11, 2023 at 2:05
  • 9
    $\begingroup$ Chromosomes have a ton of redundant bits in us because of millions of years of evolutionary dead-ends and do-overs, there's no reason to think alien lifeforms would fare any better - as well as the stupid design features. It would take a stupidly huge amount of work to produce something that looked like it had genuinely evolved. And "clean" chromosomes would be a giveaway. $\endgroup$ Commented Feb 11, 2023 at 16:09
  • 1
    $\begingroup$ Since birds are dinosaurs that evolved from amphibians that evolved from fish, by definition the fish and birds must have a common ancestor. The wings evolved from arms that evolved from walking limbs that evolved from fins, so again, not a lot of difference there. But the main thing would be the feathers, since bird wings require them to function, and feathers evolved long after the common ancestor between fish and birds. $\endgroup$ Commented Feb 12, 2023 at 21:46
8
$\begingroup$

CAGT

Just these exact four nucleotides, arranged at a backbone of deoxyribose linked by phosphodiester bonds, are the fingerprint of terrestrial life.
Exobiologists, if they had already seen true alien life, could never be fooled by something like this. Genetic material from independently evolved life would mostly be incompatible, either at the genetic or at the protein level - why would alien life use exactly the same amino acids as ours?
Under these circumstances, the bioengineers would need to start from scratch and fast forward through millennia of evolution, to have a chance to fool exobiologists for more than the first routine tests.

$\endgroup$
1
  • $\begingroup$ You mean GATTAC, the genetic scissor point? $\endgroup$
    – Trish
    Commented Feb 13, 2023 at 13:23
5
$\begingroup$

Artificial genes are indistinguishable from natural ones

That is, if they are made correctly. They would have to include both the coding and non-coding parts (in terrestrial organisms there are far more non-coding parts), and we are very far from the point where we could achieve that, because it is very hard to determine what non-coding parts actually do, and whether they are important or not. Coding parts are easy: they encode the amino acid sequences of proteins (but what those proteins then do is another can of worms). But non-coding parts are problematic. Some parts play a role in gene regulation, some are legacy code, some parts protect genes from mutations, some parts introduce mutations in genes... The whole thing is extremely complex, and that is not counting the issues with proteins. If you want to design an "alien" organism you can't really use known proteins. Sure, some can be similar, but the more similarities there are, the more obvious the ruse will be. And designing novel proteins is harder than designing genes...

It is unlikely that exobiologists would be fooled by the attempt

Even if the bioengineers succeed in creating a viable organism de novo, that still wouldn't fool an exobiologist (or any biologist, biochemist, or anyone working in medicine). Even the fact that the organism uses DNA would be a dead giveaway. It is extremely unlikely that extraterrestrial life would use the same building blocks for its biochemistry. Similar? Probably. Same? No way in hell. Every organism on Earth shares those building blocks, and I'm counting not only the same kinds of amino acids and nucleotides, but also the same metabolic pathways. Using those would definitely show that the organism is either terrestrial in origin or engineered. But creating those building blocks de novo would most likely be beyond even strong AI, unless it is a really long process. So not something done as a prank.

$\endgroup$
3
$\begingroup$

I don’t think so.

DNA is Adenine, Thymine, Cytosine, and Guanine. If your DNA uses all of this it’s DNA. As far as I can tell, the DNA would be indistinguishable.

However, what you’re saying is that they use the amino acids to make DNA. That would definitely fail. You use nucleotides to make DNA.

So (I’m guilty of it too), I don’t think there’s a way to tell your “constructed” DNA from normal DNA.

$\endgroup$
2
  • $\begingroup$ Also, constructed DNA (if you are randomly putting the pairs in order) may not code for anything productive. $\endgroup$ Commented Feb 10, 2023 at 23:08
  • 2
    $\begingroup$ Well, bytes is bytes, but I can tell a program written by a new programmer from one written by a seasoned pro. There's a lot more to go on than just "is the DNA made out of DNA". $\endgroup$ Commented Feb 12, 2023 at 0:03
2
$\begingroup$

A gene is just some discrete unit of genetic code, not necessarily a very large or complex one, that interacts in some way with the organism's cellular machinery. In Earth life, it might translate to an RNA string that encodes a sequence of amino acids to produce a protein, or it might serve some more obscure role like modifying the activity of nearby genes. There's no way to look at a specific gene and say whether it was natural or artificial, particularly if it's supposedly run through entirely alien cellular machinery to do its work.

However, the genome as a whole and the biochemical machinery that goes around it are another matter. Unless they have the ability to simulate billions of years of evolution, the genome, biochemistry, and even overall construction of an artificial organism will be far more logically organized than anything natural. Everything will be far more tidy and independently controllable, rather than haphazardly intertwined with completely different functions that happen to have some convenient chemistry or signaling going on.

They could go out of their way to obfuscate things and make the result look more chaotically natural, but this will be incredibly more difficult and time- and resource-intensive than just making an artificial lifeform, if they have the capability to do a convincing job of it at all.

$\endgroup$
2
$\begingroup$

DNA produces proteins. Some new DNA, artificially constructed, might or might not produce a protein as opposed to just 'gunk'. That's how evolution works. Some mutations do something, others result in 'garbage'. So, if you produce a new DNA 'gene', you maybe get a new protein, or you might get scrap.

But all you have is a pile of new proteins. In order to actually build anything, these different proteins need to be compatible with other proteins, produced by other DNA sequences, and actually do something useful when combined.

And here is the thing. Natural DNA sequences have been produced entirely by some random process. That is how evolution works. Contrived DNA sequences are produced by design.

A sidebar diversion. Back in the days when statistical quality control first came in, the Japanese used sampling techniques to determine if they would or would not accept components. The product would have to meet stringent statistical analysis criteria for quality. The sample had to fit within strict bounds of statistical analysis. Chinese manufacturers, knowing this, tested every single component, and rejected every component that was out of spec. They guaranteed to the Japanese customer that every single component would match spec, not just a specified 'percent'. The Japanese customer, testing every single component of the batch, confirmed this - every single one met specs. Yet the lot was rejected.

You see, it is impossible to mass manufacture components without SOME being out of spec. The Japanese looked at the statistical curve, and saw an almost flat top and steep cliff ends, with absolutely no outliers. This meant that, when the curve was extrapolated to fit a normal distribution curve, the entire production run (not just the ones selected) was very, very sloppy. The production line was producing more 'out of spec' components than 'in spec', so the overall quality control of the entire production run was horrible.

The Japanese assumed that if the entire production had such poor quality control overall, then there would be horrible quality control on the aspects of the component NOT specifically tested. So even though every component met the specified specs, overall they were of poor quality. Good quality control would mean that all of the production would fit a normal distribution curve, meaning tight control over the entire manufacturing process. In other words, if there were not at least SOME 'out of spec' in the sample, it was not a good sample representative of overall quality.

So, apply this principle to ALL of the different DNA sequences necessary to produce sufficient variety of proteins to produce a meaningful 'life form'. On a statistical sampling, one would expect to find a normal distribution curve. Just selecting one particular gene would give a false overall picture. One would have to examine the entire mass and variety of proteins that are produced by the DNA sequences of the entire 'chromosome' to get the entire statistical analysis distribution curve. By looking at all of the distribution curves, and applying statistical analysis, it should become obvious that the artificially produced life form, in its entirety (not just specific gene sequences) was, well, too perfect.

TL;DR

The genetic sequence of the life form, overall, under statistical analysis, looking at ALL of the necessary genes and DNA sequences, would look 'designed', not 'random'. It would be 'too perfect'.

$\endgroup$
2
$\begingroup$

Bioengineers are not physicists

We know how to date wood and other biological material using carbon-14 dating. This isotope has a half-life of about 5,700 years; it's not primordial but is continuously created in the atmosphere. Plants get their CO2 from the atmosphere and are measurably radioactive as a result. Oil isn't; all the carbon-14 in oil has long since decayed.

Exobiologists know this - they need to correct for the carbon-14 concentrations on other planets. But bioengineers? They buy their gene supplies from a chemical company, and they don't care if the carbon came from oil. So the total lack of C-14 in the DNA was an instant sign that the genes were synthesized.
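A rough sketch of the arithmetic behind this, assuming simple exponential decay and the commonly quoted carbon-14 half-life of roughly 5,730 years. The function name and the sample ages are my own illustrative choices, not anything from a real dating protocol.

```python
# Assumption: commonly quoted C-14 half-life, in years.
C14_HALF_LIFE_YEARS = 5730.0

def c14_fraction_remaining(age_years):
    """Fraction of the original carbon-14 left after age_years of decay."""
    return 0.5 ** (age_years / C14_HALF_LIFE_YEARS)

# Illustrative ages: fresh biomass, old wood, and a petroleum-like feedstock.
for age in (0, 5730, 50_000, 1_000_000):
    print(f"{age:>9} years: {c14_fraction_remaining(age):.3e}")
```

After a million years essentially nothing is left, and oil is far older than that, so DNA built from petrochemical feedstock would read as "radiocarbon dead" while anything that recently ate or photosynthesized would not.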

$\endgroup$
2
$\begingroup$

There is LUCA, the Last Universal Common Ancestor of all life forms on Earth. To fool the exobiologists, the genetic engineers need to engineer a life form that is so different that it cannot conceivably descend from LUCA. This means they have to find an independent viable chemistry to sustain an organism, and that recipe needs to be a zero-day exploit: if it is already known, the exobiologists will find it in the literature.

It also helps if the amino acid selection and the genetic code are different enough from terrestrial life to support an alien origin.

So it is a very hard task, after all.

EDIT: You may also be interested in the concept of a Minimal genome. The task of the pranksters is in essence to find an alternative minimal genome.

$\endgroup$
1
$\begingroup$

To make a body, they will need genes that are too easy to identify.

The one giveaway may be that the genes are too similar to Earth life. To make an animal that works, even just to grow or just have functional cells, will mean they need to use almost nothing but known functional genes from Earth life, and mostly from closely related species at that. That means they will be fairly easy to identify, even if there is a lot that is weird about it.

$\endgroup$
1
$\begingroup$

Quite simply, if some functionality has been added that requires multiple genes, and the genes themselves aren't all useful in isolation, then that's a giveaway.

Evolution can only promote the existence of useful genes. It cannot as readily promote the existence of multiple genes that are useful in combination but are not useful in isolation.

Geneticists, on the other hand, are able to 'look ahead', to add a gene in the anticipation that its product will be useful in combination with other genes that they will add in the future as they design their life form.

So, interdependent genes that are useful in combination but are not each useful by themselves are a dead giveaway that some person has been tinkering.

$\endgroup$
1
$\begingroup$

First, recommend reading the article on Artificial Gene Synthesis

From there, IF the organism only used the common nucleobases (i.e. adenine (A), cytosine (C), guanine (G), thymine (T) for DNA, and uracil (U) for RNA), then the investigators could look for either:

  • Current limitations in the state of the art (e.g. current methods have a high frequency of sequence errors, which tends to get worse with greater complexity)
  • Signatures of common methods in the state of the art: (ex: most current methods are based on a combination of Oligonucleotide synthesis and annealing based connection of oligonucleotides. There's often common lengths and patterns that are produced because of these techniques.)

If the organism used nucleobasepairs outside of the common set, then the investigators could look for Unnatural Base Pairs resulting in Non-Standard Amino Acids. Further reading would also be the article on the Expanded Genetic Code.

Note: As other commenters have pointed out, this question also assumes the full construction of an artificial life-form, which represents an inordinately complex task. Changing a few features of a simple organism here and there is already extremely expensive and time consuming. (e.g. the bacterium Mycoplasma laboratorium is estimated to have cost US$40 million and 200 man-years to produce.)

$\endgroup$
1
$\begingroup$

Because the bioengineers WANT the exobiologists to be able to find out.

If we assume that both sets of people are equally good at their jobs and don't make mistakes, then maybe there's no actual way for a complete hoax to be found out. At that point, the exobiologists may just assume everything is a hoax unless they find it in the wild, and that's no fun for anyone. Instead, the bioengineers intentionally leave clues to see if the exobiologists can find them. Perhaps one of the exos realizes that if they run one of the "useless" DNA sequences through a cryptographic decoder with the right key (found by some hint in the creature's appearance that just "seemed a bit unusual, given this other thing") it decodes into "Gotcha!!". Or, perhaps the creature has a natural call and the bioengineers managed to make a short DNA "switch" that, if activated, will change the call to the Doctor Who theme song (I guess it depends on just how exact your world's science can get with DNA). The exobiologists may not catch this without extensive analysis of supposedly unused sections of DNA or something.

$\endgroup$
0
$\begingroup$

In the genetic code, the majority of amino acids have more than one codon. For instance, the sequences tta, ttg, ctc, ctt, cta and ctg all encode the same amino acid, leucine. There are multiple ways to encode the same protein sequence.

This may limit the expression of a gene moved between very different species if it uses codons that are uncommon in the recipient: the matching tRNAs that translate these codons are not available in sufficient amounts.

By doing statistical analysis of which codons are used for each amino acid, it may be possible to detect that the sequence is artificial. A natural sequence is likely to use various codons, but some more often than others. If no care is taken to hide the origin, an artificial sequence may use only the single codon that is "good enough", or the statistical distribution may otherwise differ from that of other genes in the genome.
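As a rough illustration of that kind of check, here is a small Python sketch that tallies how often each leucine codon is used in a coding sequence. The function names, the 0.95 threshold and the two example sequences are hypothetical; a real analysis would cover every amino acid and compare against the rest of the genome.

```python
from collections import Counter

# The six standard leucine codons (DNA alphabet).
LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")

def leucine_codon_usage(coding_dna):
    """Return the relative frequency of each leucine codon in an in-frame sequence."""
    dna = coding_dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    counts = Counter(c for c in codons if c in LEUCINE_CODONS)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in LEUCINE_CODONS}

def looks_single_codon(usage, threshold=0.95):
    """Flag a sequence where one codon accounts for nearly all leucines (arbitrary threshold)."""
    return max(usage.values()) >= threshold

# Hypothetical sequences: one with mixed codon choices, one using only CTG.
mixed = "TTACTGCTTCTGTTGCTC"
flat = "CTGCTGCTGCTGATG"

print(looks_single_codon(leucine_codon_usage(mixed)))
print(looks_single_codon(leucine_codon_usage(flat)))
```

On sequences this short the result means nothing statistically; the point is only that the distribution is cheap to compute, so a lazy synthesis tool that always picks one codon would stand out quickly.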

$\endgroup$
3
  • $\begingroup$ Well, there are surely signatures for natural terrestrial sequences, but do we know what a natural extraterrestrial sequence looks like? $\endgroup$ Commented Feb 12, 2023 at 9:23
  • $\begingroup$ Something like using just one code for each amino acid would look artificial anyway. $\endgroup$
    – Nightrider
    Commented Feb 12, 2023 at 10:55
  • $\begingroup$ But surely someone smart enough to create an entire genome essentially from scratch would be capable of adding a little randomization of the base pair sequences. A little thermal decay RNG source would be sufficient. $\endgroup$
    – Corey
    Commented Feb 12, 2023 at 23:30
