Since OpenAI publicly launched its ChatGPT application online in December, there has been talk of how bad actors can exploit artificial intelligence (AI) to create malicious code. But even good actors can pose risks to their software supply chain when using AI to generate code.
Researchers have already combined ChatGPT with Codex, another OpenAI product that converts natural language into code, to create a more sophisticated attack vector. Without writing a single line of code themselves, they built a full attack chain: a phishing email carrying a malicious Excel file armed with a macro that downloads a reverse shell, a payload favored by hackers.
Those same researchers also found ChatGPT-generated activity on the dark web. In one forum, a thread surfaced discussing how to create a Python-based stealer that searches for common file types on a victim's computer, copies them to a random folder inside the Temp folder, zips them, and uploads them to a hardcoded FTP server.
In another forum post, a user described using ChatGPT to create a Java snippet that downloads PuTTY (a popular SSH and telnet client) onto the target machine and then runs it covertly using PowerShell. The script was written to fetch PuTTY, but the researchers note that it could be modified to download and run any program, including those from common malware families.
AI can be a boon to developers under pressure to speed up their workflows. However, its output should be continuously scrutinized for potential risks. Here’s what you need to know about the state of security in generative AI and the software supply chain.
Buggy code is normal
AI can be used to create harmful code, but it can also be used in beneficial ways, as users of Codex and its sibling GitHub Copilot can attest. However, there are risks in relying on AI-generated code, especially in the hands of inexperienced developers.
AI-generated code should be treated as the work of a novice or junior programmer. This is especially true for code written by ChatGPT, which may omit error handling, security checks, encryption, authentication, authorization, and the like.
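To make that concrete, here is a hypothetical illustration (the function names, the upload directory, and the scenario are ours, not taken from the research above): a file-reading helper of the kind a chatbot might produce, next to a version with the checks a reviewer should expect.

```python
import os

UPLOAD_DIR = "/var/app/uploads"  # assumed storage root for this sketch

def read_upload_naive(filename):
    # Typical "junior" output: no input validation, no error handling.
    with open(os.path.join(UPLOAD_DIR, filename)) as f:
        return f.read()

def read_upload_hardened(filename):
    # Resolve the final path and confirm it stays inside UPLOAD_DIR,
    # blocking traversal inputs like "../../etc/passwd".
    root = os.path.realpath(UPLOAD_DIR)
    candidate = os.path.realpath(os.path.join(root, filename))
    if not candidate.startswith(root + os.sep):
        raise ValueError("path escapes the upload directory")
    with open(candidate, encoding="utf-8") as f:
        return f.read()
```

Called with `filename="../../etc/passwd"`, the naive version happily reads files outside the upload directory; the hardened one raises an error instead. Neither difference is visible unless someone actually reviews the generated code.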
The code can also contain bugs, as the Stack Overflow community discovered shortly after ChatGPT became available online.
Stack Overflow is an online community for developers. At the heart of the forum is trust: members rely on one another to submit information that reflects what they know to be accurate and that can be validated and verified by their peers.
“Currently, most GPT-generated contributions do not meet these standards and therefore do not contribute to a trusted environment,” Stack Overflow declared in a policy statement:
“This trust is broken when users copy and paste information into answers without validating that the answer provided by GPT is correct, ensuring that the sources used in the answer are properly cited (a service GPT does not provide), and verifying that the answer provided by GPT clearly and concisely answers the question asked.”
ChatGPT Ban on Stack Overflow
The Stack Overflow post announcing the temporary ban on ChatGPT output from its forums explained that ChatGPT's answers had a high error rate, yet typically looked accurate until closer inspection.
“Because such answers are so easy to produce, a large number of people are posting them,” Stack Overflow continued. “The volume of these answers (thousands), and the fact that they often require a detailed read by someone with at least some subject-matter expertise to determine whether they are actually bad, has effectively swamped our volunteer-based quality curation infrastructure.”
The Stack Overflow situation shows what happens when inexperienced developers get hold of an AI tool. Buggy code not only disrupts the development pipeline; it can expose that pipeline and its products to security risks during and after development.
Data poisoning poses a serious threat
Using AI to generate code can also give hackers another point of attack on the software supply chain. For example, researchers at Microsoft, the University of California, and the University of Virginia recently published a paper on poisoning the datasets used to train language models.
They explained that tools like GitHub Copilot are trained on massive code corpora mined from unvetted public sources, which makes them susceptible to data poisoning. Such attacks can train a model to suggest insecure code payloads in its recommendations. The simplest approach is to inject the payload code directly into the training data, but that code can be detected and removed by static analysis tools. The researchers, however, found ways to avoid such exposure.
One method, called COVERT, keeps the malicious code out of the executable portions of the training data that static detection tools actually analyze: the payload is hidden in comments or Python docstrings, which those tools typically ignore. However, because the attack still requires the malicious code to appear verbatim in the training data, signature-based techniques can discover it.
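As a sketch of why that is, consider a minimal signature scan over the raw text of training files (the signature list and corpus path below are illustrative assumptions, not from the paper). An AST-level tool that inspects only executable code would miss a payload parked in a docstring, but a plain-text match still finds it verbatim:

```python
import pathlib
import re

# Illustrative signatures for known-insecure patterns; a real blocklist
# would be far larger and actively curated.
SIGNATURES = [
    re.compile(r"ssl\._create_unverified_context"),
    re.compile(r"yaml\.load\((?![^)]*Loader)"),      # yaml.load without an explicit Loader
    re.compile(r"subprocess\.\w+\([^)]*shell=True"),
]

def scan_training_file(path: pathlib.Path) -> list[str]:
    """Return signatures found anywhere in the file, docstrings and comments included."""
    text = path.read_text(errors="ignore")
    return [sig.pattern for sig in SIGNATURES if sig.search(text)]

if __name__ == "__main__":
    for py_file in pathlib.Path("corpus").rglob("*.py"):  # assumed corpus location
        hits = scan_training_file(py_file)
        if hits:
            print(f"{py_file}: {hits}")
```

The catch, as the next technique shows, is that this kind of defense only works when the payload appears somewhere in the file as literal text.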
But another method, which the researchers call TrojanPuzzle, is even more evasive. The training data never contains the suspicious code at all, yet the model can still be induced to suggest the malicious payload in its code recommendations. This makes the technique robust against signature-based dataset-cleansing methods that identify and exclude suspicious sequences from training data.
Zero Trust is the key to modern software security
AI looks like a win for developers under pressure to build new features and release them faster, but AI output needs serious scrutiny from DevSecOps and application security teams. Without such scrutiny, organizations hand attackers more opportunities to compromise the software supply chain.
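One lightweight way to build that scrutiny into a pipeline is to gate merges on an automated scan of all code, AI-generated or not. The sketch below assumes a Python codebase and the open-source Bandit scanner; the `src` directory, severity threshold, and gate logic are our illustrative choices, not a prescribed setup.

```python
# ci_gate.py: fail the build when Bandit reports high-severity findings.
import json
import subprocess
import sys

def run_bandit(target: str = "src") -> dict:
    # Bandit exits non-zero when it finds issues, so don't raise on that.
    proc = subprocess.run(
        ["bandit", "-r", target, "-f", "json"],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)

def main() -> int:
    report = run_bandit()
    high = [r for r in report.get("results", [])
            if r.get("issue_severity") == "HIGH"]
    for issue in high:
        print(f"{issue['filename']}:{issue['line_number']}: {issue['issue_text']}")
    return 1 if high else 0

if __name__ == "__main__":
    sys.exit(main())
```

A human review step stays essential; a scanner gate simply ensures that the most obvious omissions, of exactly the kind AI assistants tend to make, never reach production unexamined.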
A broader problem with ChatGPT and other generative AI platforms is that they ingest whatever data they are fed, good or bad. Generative AI is expected to fuel the next wave of “fake news,” allowing bad actors to automatically spread misinformation and disinformation on social media.
In their current state, generative AI platforms such as ChatGPT can have the accuracy of their output, including generated code, subverted by malicious actors who compromise the source data itself. That fundamental weakness makes today's generative AI platforms as much a problem as a solution.
*** This is a Security Bloggers Network syndicated blog from the ReversingLabs Blog written by Matt Rose. Read the original post: https://www.reversinglabs.com/blog/generative-ai-like-chatgtp-unleashes-the-next-generation-of-software-supply-chain-attacks