49
\$\begingroup\$

I have been using Unity to create a 2D game that will be completely offline (which is the problem). The gameplay requires you to enter certain strings at certain levels, and Unity compiles to DLLs, which can easily be reverse engineered. Is there a way to protect those strings? (The game is offline, so I can't retrieve them from another source.)

The game relies heavily on those strings. Yes, I'm aware of obfuscation, but I want something more robust. I know the easy way out would be to do everything online from a data source, but I was wondering whether it's possible offline.

It can be decompiled like this: [image: decompiled code]

\$\endgroup\$
11
  • 10
    \$\begingroup\$ While I agree it is a battle you can never truly win, you can definitely make it more difficult with little effort. obfuscar.codeplex.com \$\endgroup\$
    – user99319
    Commented Nov 24, 2017 at 20:07
  • 46
    \$\begingroup\$ @Gabriele Huh? Decompiling non-obfuscated C# code is about as easy as it gets. You get pretty much perfectly readable code that way; compare that to what IDA generates for optimised C or C++ code. That said, with enough effort native code can be understood just as well, but it's orders of magnitude harder. No idea how well obfuscation works - if it makes the usual decompilers (DotPeek, ILSpy, something else?) barf on the IL code, that should introduce a barrier to keep the casual person away from it. \$\endgroup\$
    – Voo
    Commented Nov 24, 2017 at 22:04
  • 15
    \$\begingroup\$ @GabrieleVierti, pulling text out of a DLL is trivial: under Linux or Cygwin, you can do it just by pointing the strings program at it. \$\endgroup\$
    – Mark
    Commented Nov 25, 2017 at 2:16
  • 31
    \$\begingroup\$ What exactly are you trying to do here? This sounds like a rather severe case of the XY problem. meta.stackexchange.com/questions/66377/what-is-the-xy-problem Whatever you are trying to accomplish (from a gameplay perspective, not a technical perspective) is almost certainly not best served by trying to hide and encrypt these strings. \$\endgroup\$ Commented Nov 25, 2017 at 14:24
  • 36
    \$\begingroup\$ I wonder if hiding the strings actually serves a purpose: Once some players have solved them, they will very likely be spread in wikis or similar, thus anyone wanting to know them will easily be able to search for them. Yes, by obfuscating them, the player can't just open the dll in a text editor and look for strings, but most will consult google (or their search engine of choice) first... \$\endgroup\$
    – hoffmale
    Commented Nov 25, 2017 at 21:10

4 Answers

161
\$\begingroup\$

Do not store those strings, store the (cryptographic) hash of them.

A (cryptographic) hash function, like encryption, is a way to turn a string into "gibberish" (called a hash). Unlike encryption, you cannot get the original string back from this hash (unless you can brute-force it or the hash function is broken). Most (if not all) hash functions take a string of arbitrary length and return a string of constant length (which depends on the function).

How do you check that a string that the user has entered is the correct one? Since you cannot get the valid string from the hash, the only thing you can do is to hash the user's guess and compare it to the correct hash.

Warning (by Eric Lippert): DO NOT USE the built-in function GetHashCode for this. Its result can differ between .NET versions and platforms, so your code would work only on specific .NET framework versions and platforms.
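A minimal sketch of the hash-and-compare check described above, written in Python for brevity rather than C#. The answer text `open sesame`, the normalisation step (trim and lower-case), and the names `digest`, `STORED_DIGEST`, and `is_correct` are assumptions made for illustration, not part of the original answer. In a shipped game only the precomputed hex digest would appear in the code, never the plain answer.

```python
import hashlib
import hmac


def digest(text: str) -> str:
    """Return the SHA-256 hex digest of the normalised text."""
    normalised = text.strip().lower()  # assumed normalisation; adjust to taste
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Illustration only: a real build would hard-code the hex string produced
# offline, so that the plain text "open sesame" never ships with the game.
STORED_DIGEST = digest("open sesame")


def is_correct(guess: str) -> bool:
    """Hash the player's guess and compare it with the stored digest."""
    return hmac.compare_digest(digest(guess), STORED_DIGEST)


if __name__ == "__main__":
    print(is_correct("Open Sesame"))  # the right answer, different casing
    print(is_correct("open sesam"))   # a wrong guess
```

In .NET the same idea would use a standard implementation such as `System.Security.Cryptography.SHA256` rather than `GetHashCode`. Python's built-in `hash()` carries the same caveat as the warning above: for strings it is randomised per process by default, so it is unsuitable for stored values.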

\$\endgroup\$
15
  • 83
    \$\begingroup\$ This is indeed the best answer. But remember that you absolutely positively must not use the built-in hash algorithm on strings. It is designed to do one thing and one thing only, and that is to balance a hash table. You cannot store hashes of strings and use them as authenticators of a shared secret, because the .NET runtime authors reserve the right to change the string hashing algorithm at any time for any reason, and in fact they have done so in the past. Use a crypto-strength standard hash, or implement your own simple hash. \$\endgroup\$ Commented Nov 24, 2017 at 21:28
  • 4
    \$\begingroup\$ This. Exactly. If you use an algorithm like sha256 (there are many libraries that implement this so you don't have to write your own) then you'll have something that can't be broken easily and is very reliable. \$\endgroup\$ Commented Nov 24, 2017 at 21:33
  • 12
    \$\begingroup\$ @Michael Johnson using a salt will make it much harder to use rainbow tables, and using a password hash like bcrypt will make it much slower and therefore harder to break. But ultimately at the end of the day this seems like much too much effort to go to; the user has bought the game if they want to cheat that is their own choice \$\endgroup\$
    – Melkor
    Commented Nov 24, 2017 at 22:06
  • 2
    \$\begingroup\$ @MichealJohnson: Key stretching could make such brute force attacks a bit harder (say, by a factor of a billion or so). But the real problem with this answer is that, presumably, at some point the game needs to be able to tell the player which string they need to enter. Unless those strings are actually solutions to some kind of a puzzle that the player needs to solve, I guess. Or unless the strings are provided online, even though the game is offline, kind of like old-style license keys. \$\endgroup\$ Commented Nov 24, 2017 at 22:55
  • 5
    \$\begingroup\$ @Sentinel That means that the gameplay might be different for different players. We really need to know what kinds of strings we're talking about, if they're contained in books/signs/dialog within the game, if they're solutions to puzzles where the actual answer isn't given to the player directly, or if they're randomly-generated. And if they're words, sentences, or random characters. \$\endgroup\$ Commented Nov 26, 2017 at 13:55
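The salting and key-stretching suggestions in the comments above could look roughly like this. It is a sketch in Python using PBKDF2 from the standard library as a stand-in for bcrypt; the salt value, the iteration count, the answer `open sesame`, and the function names are all assumptions chosen for illustration.

```python
import hashlib
import hmac

# Hypothetical fixed salt shipped with the game; it defeats precomputed
# (rainbow) tables but is not a secret.
SALT = bytes.fromhex("8f3a1c5be2904d77")

# Assumed work factor: each guess costs this many hash iterations,
# which slows brute-force attempts by roughly the same factor.
ITERATIONS = 100_000


def stretch(text: str) -> bytes:
    """Derive a slow, salted 32-byte digest from the text."""
    return hashlib.pbkdf2_hmac("sha256", text.encode("utf-8"), SALT, ITERATIONS)


# Illustration only: a real build would embed the precomputed bytes.
STORED = stretch("open sesame")


def is_correct(guess: str) -> bool:
    """Stretch the guess the same way and compare in constant time."""
    return hmac.compare_digest(stretch(guess), STORED)
```

The trade-off is that every legitimate check also pays the iteration cost, so the work factor has to stay small enough not to stall the game on slow hardware.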
106
\$\begingroup\$

What you are trying to do is both futile and pointless.

It is futile because there is no way to properly hide information which is on the user's machine. Anyone dedicated enough will find it. You can make it harder, but you can never prevent it. If you encrypt it, then you need to store the encryption key and algorithm somewhere. No matter how many layers of encryption you add, the outermost layer will always need to be unencrypted in order for your game to run.

It is also pointless, because we are living in the internet age. When your game becomes just remotely popular, then those passcodes will be posted all over the web.

All you can do is trust the player to not ruin their own game experience by looking up information the game isn't supposed to tell them yet. The vast majority of players will not start reverse-engineering your game anyway. And if the few who have the necessary skills do this, then it's their own fault.

\$\endgroup\$
13
  • 38
    \$\begingroup\$ Reverse engineering is a game of its own! This is the only correct answer - one should not hide information in single player games nowadays, players will access it if they want to. \$\endgroup\$
    – Mephy
    Commented Nov 24, 2017 at 20:36
  • 27
    \$\begingroup\$ @TheBinaryGuy Safety from what? Your users are playing an offline game. What threat does exposing the code present? The player is eventually supposed to discover it anyway to be able to play the game! An important rule of security is that you must identify the threat from which you are protecting; this is referred to as a "threat model." \$\endgroup\$
    – jpmc26
    Commented Nov 25, 2017 at 9:21
  • 7
    \$\begingroup\$ @TheBinaryGuy In addition to the other points, I'd question the sensibility of using "fixed" passcodes - even without looking up the codes online/through reverse engineering, just playing the game a second time will already allow players to just skip straight through sections of the game (as they already know the codes). If you really want to "force" players to earn these codes you need to make them change on each playthrough (e.g.: have some form of randomization) \$\endgroup\$ Commented Nov 25, 2017 at 12:42
  • 6
    \$\begingroup\$ This answer has some good points but the second paragraph is incorrect. You don't have to encrypt the string. You can hash it. If a player can break the hash, then instead he will be robbing bank accounts, will be hired by FBI, or something like that. The other points are valid though. \$\endgroup\$
    – Pedro A
    Commented Nov 26, 2017 at 16:09
  • 5
    \$\begingroup\$ @Hamsterrific I guess the answer assumes you need the actual strings to be stored, e.g. to show them to the player at some point, which means hashing won't do. \$\endgroup\$ Commented Nov 26, 2017 at 20:48
5
\$\begingroup\$

If such a thing is really desired, then instead of hashing, you might consider building the strings from a numeric input value at runtime.

The reason is that, as @Philipp pointed out, it is somewhat pointless to try to hide codes in the executable if you can expect them to be posted on the internet anyway. Hashed or not, the same word found on the internet and entered into the game will give the same hash and will work either way.

Except... except if someone else's code doesn't work for you. Which you can trivially do -- not 100% tamper-proof but reasonably hard to work around for the average user. Anything as simple as the "Online Elven name generator" will do (it can be arbitrarily simple and really doesn't need much of a Markov text generator; pulling 4-5 syllables from a random list is good enough).

Just generate a somewhat user-specific or machine-specific number, it doesn't even have to be perfectly unique or very tamper-resistant. Something that is likely different for most people, and unlikely to change regularly, e.g. the computer's network name, the MAC address, or the GUID of the system disk drive, whatever (the GPU serial number might be a very bad idea since users are likely to upgrade GPUs). Add to that the numeric code the unlock code refers to, and feed that into your word generator. But be prepared to answer support queries when players use two computers or change their network card (which is unusual, but not impossible). It might be a good plan to only generate the random ID once, and store it with the game's settings. That way, at least it doesn't break existing installations on the same machine if something changes.

Or, you might just use the game's serial number which is unique and will work if the user changes hardware (ironically, however, this might promote pirating since shared unlock codes work for pirated serials but not for legitimate customers!).
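A rough sketch of the per-install code generator described above, following the "one syllable from a list of 16 for each 4 bits" scheme mentioned in the comments on this answer. It is written in Python for brevity; the syllable list, the `install_id:level` input format, and the function names are made up for illustration.

```python
import hashlib
import uuid

# Hypothetical list of exactly 16 syllables, one per 4-bit value.
SYLLABLES = [
    "ka", "lo", "mi", "ren", "tha", "vel", "dor", "shi",
    "an", "el", "ur", "fin", "gal", "ith", "nor", "sar",
]


def new_install_id() -> str:
    """Generate a random ID once; the game would store it with its settings."""
    return uuid.uuid4().hex


def unlock_code(install_id: str, level: int, syllables: int = 4) -> str:
    """Build the unlock word for a level from a per-install ID.

    install_id could be the stored random ID or the game's serial number.
    """
    data = f"{install_id}:{level}".encode("utf-8")
    digest = hashlib.sha256(data).digest()
    parts = []
    for i in range(syllables):
        byte = digest[i // 2]
        # Even positions use the high 4 bits of the byte, odd the low 4 bits.
        nibble = byte >> 4 if i % 2 == 0 else byte & 0x0F
        parts.append(SYLLABLES[nibble])
    return "".join(parts).capitalize()


if __name__ == "__main__":
    my_id = new_install_id()
    for level in (1, 2, 3):
        print(level, unlock_code(my_id, level))
```

Because the ID is an input, the same level yields a different word on most installations, so a code copied from a wiki would usually not work on someone else's machine.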

Note that preventing users from cheating is not necessarily a good thing. In an offline (i.e. non-competitive) game it's usually no problem if the user cheats and gets the codes from somewhere rather than from playing. He is only cheating himself. Who cares.
On the other hand, getting too much in their way if they really wish to cheat is a great opportunity for completely pissing off paying customers.

So... before you do something like that, think very thoroughly about whether you really want that, and what you want. Quite possibly, having human-readable strings (or strings trivially made "unreadable" with xor) is just good enough and indeed preferable.
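For reference, the "trivially unreadable with xor" option might look like the sketch below (Python, with a made-up single-byte key and the hypothetical answer `open sesame`). It only keeps the text out of a plain strings dump; anyone who reads the decompiled code can undo it.

```python
KEY = 0x5A  # hypothetical single-byte key, stored right next to the data

# "open sesame" XORed byte-by-byte with KEY, precomputed offline.
OBSCURED = bytes.fromhex("352a3f347a293f293b373f")


def xor_bytes(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def reveal() -> str:
    """Recover the readable string at runtime."""
    return xor_bytes(OBSCURED).decode("ascii")
```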

\$\endgroup\$
4
  • \$\begingroup\$ What process do you envision? If the program generates the hash value after being downloaded, then it has to have the unhashed string when downloaded. So will the server query the client for identifying information and then generate the hash server-side? \$\endgroup\$ Commented Nov 30, 2017 at 3:38
  • \$\begingroup\$ @Acccumulation: What server? Q says "completely offline". Which probably means a program on a DVD (or maybe a download) plus a serial number. So you have events 1,2,3,4 for which a passcode must exist. You calculate for example hash(serial + 1) to get a number corresponding to the first code. Then feed that into your word generator, which pulls, say, one syllable from a list of 16 for each 4 bits of input. There you go, individual "words" for every user. \$\endgroup\$
    – Damon
    Commented Dec 1, 2017 at 12:06
  • \$\begingroup\$ If it's a download, then it's being downloaded from a server. If the program calculates hash(serial+1) after it's downloaded, what's to stop the user from calculating hash(serial+1)? Once the program is downloaded, the user has access to everything the program does. \$\endgroup\$ Commented Dec 1, 2017 at 16:23
  • \$\begingroup\$ @Acccumulation: Nothing prevents the user from first decompiling the program and then calculating hash(serial + 1). So what? This is not a problem. Look, if someone invests one to two hours (probably 6-8 hours for a typical "user" without software developer background) just to cheat himself, well... let him. The thing is, this works for one person, but not for everyone, and it's not competitive... so, no problem. \$\endgroup\$
    – Damon
    Commented Dec 1, 2017 at 19:03
3
\$\begingroup\$

If you don't need to show the strings, then the hashing idea is probably the way to go. If, on the other hand, you do need to show them to the user, there are some other ways you could avoid them showing up in your DLL directly.

One way to deal with this besides encrypting or otherwise obfuscating the strings is to break them up. Maybe just have an alphabetically sorted dictionary of all the possible words from all the strings in the game. Then have an array somewhere that allows you to piece together the words into the string you need by indexing into the array of words. This way you don't have the complete strings anywhere in the game. There's no master key needed to decrypt the strings. And the data that tells their order could be all over your source, if, for example, each function that used a string just had an array of indexes local to the function. I'm not sure how practical that is in your specific case, but it's one way of doing it.

You might even have strings composed of a few words that end up in different orders. For example, you might need the following 2 strings:

  1. A bear walked through my house
  2. My house protected me from a bear

Your list of phrases would contain both "a bear" and "my house", but you'd have a large list of other phrases that could be put in between them, so figuring out which one would be just as hard as actually figuring out the puzzle in the game (or whatever). For example, the action phrases could be "walked through", "burnt down", "pushed over", "protected me from", "separated me from", "magically produced", etc.

You could work this into your game by making the indexes be based on something the player has collected or done. So there'd be no master list of indexes anywhere inside the game. They would be generated by the player playing the game.
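The phrase-list idea above can be sketched as follows, in Python for brevity. The list reuses the example phrases from this answer, sorted alphabetically; the `compose` function and the idea of passing in indexes derived from the player's progress are illustrative assumptions.

```python
# Alphabetically sorted phrase list; no complete sentence is stored anywhere.
PHRASES = [
    "a bear",              # 0
    "burnt down",          # 1
    "magically produced",  # 2
    "my house",            # 3
    "protected me from",   # 4
    "pushed over",         # 5
    "separated me from",   # 6
    "walked through",      # 7
]


def compose(indexes: list[int]) -> str:
    """Join the selected phrases and capitalise the first letter."""
    sentence = " ".join(PHRASES[i] for i in indexes)
    return sentence[0].upper() + sentence[1:]


if __name__ == "__main__":
    # In the game, these indexes would come from what the player has
    # collected or done, not from a constant in the code.
    print(compose([0, 7, 3]))  # A bear walked through my house
    print(compose([3, 4, 0]))  # My house protected me from a bear
```

If the indexes come only from player actions, the expected sentence still has to be checked somehow; storing a hash of the composed sentence, as suggested in the top answer, would avoid storing the index sequence itself.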

\$\endgroup\$
5
  • 4
    \$\begingroup\$ So instead of finding the function that references a specific string, you find the function that references an array with indices and then recreate the string. That doesn't seem more than an extra 30 seconds protection. What it does protect against is someone just using strings against the executable though. \$\endgroup\$
    – Voo
    Commented Nov 24, 2017 at 22:12
  • \$\begingroup\$ As I said, if the arrays that reference the strings are distributed to all the functions that need them, then it's a bit harder than just finding a single array in a single function. Not impossible, but covers a larger area without a lot of extra work. \$\endgroup\$ Commented Nov 25, 2017 at 0:09
  • \$\begingroup\$ Isn't this just a lesser form of encryption? \$\endgroup\$ Commented Nov 25, 2017 at 2:10
  • 2
    \$\begingroup\$ Wait, not even encryption. Just encoding. \$\endgroup\$ Commented Nov 25, 2017 at 2:10
  • 2
    \$\begingroup\$ I've updated the answer to be more clear. Basically, if the indexes are based on something the player has done, then they aren't actually stored in the game. Again, it might not be fool-proof, or perfect, but I'm putting it here as an option that people might like to explore. \$\endgroup\$ Commented Nov 25, 2017 at 4:11
