CO #5 - Signed Prompts, PyTorch Supply Chain Attack, LLM-Powered Honeypot and more

Exploring the Intersection of AI and Cyber Security

Jan 23, 2024

Hey friends, Samy here! Welcome back to another edition of ContextOverflow.
This week, we're diving deep into the world of AI in cybersecurity - both the threats and the tools. It's been a very stressful week, but I’m keeping the ship steady and sailing forward, despite that, I'm currently in the middle of crafting three exciting posts for our upcoming issues. Here's a sneak peek:

The AI Effect: I'll be delving into how the rapid developments in AI are reshaping the cybersecurity job landscape. It's fascinating to see the shifts and turns in our field.
Adversarial Frontiers: A technical deep-dive into adversarial attacks against machine learning models and how to perform them. This one's going to be packed with insights for those who love the nitty-gritty details.
Home Lab Adventures: Ever thought about building a small home lab for testing Large Language Models (LLMs)? I'm putting together some practical tips and experiences to guide you through it.

Stay tuned for these in-depth explorations in our next editions! 🚀

Now, let's dive into this week's insights. 🛠️🔍

What's Inside?

Signed Prompt: A New Defense Against Prompt Injections?
Linus Torvalds on AI and Code Generation
'Damn Vulnerable LLM Agent'
A Critical Supply Chain Attack on PyTorch
Portswigger's New LLM Section
Galah: An LLM-Powered Web Honeypot

🛡️ 1. Signed Prompt: A New Defense Against Prompt Injections?

A new paper called “Signed-Prompt: A New Approach to Prevent Prompt Injection Attacks Against LLM-Integrated Applications” has come out that claims to prevent 100% of Prompt Injections, which is a pretty big claim.

This strategy involves substituting commands with random keywords, making unauthorized commands ineffective, here’s an example:

The author used the term “sign” for the substitution action which threw me off when I first saw the title. I was expecting a cryptographic signature of some sort.

Anyway, I’m seeing a few problems:
First, this is security through obscurity - it only works as long as the attacker doesn’t know what her payloads are being replaced with, the very moment she finds out that delete is mapped to toewox, it’s game over.
How can she find out? Well, the simplest answer would be brute force, a bit smarter answer would be … another prompt injection to extract these mappings.

Second, we know that sanitizing the input doesn’t work well in web apps (except for rare and tightly defined cases), and as I see it, this is just like that so I do not see a reason why this can’t be bypassed with techniques that are similar in spirit to the ones used to bypass input sanitization in web apps.
Well, we can not find out either because the author didn’t release the dataset (or the source code).
Bernhard has written an article about it which is worth a read (this is where I first saw this paper).

🤖 2. Linus Torvalds on AI and Code Generation

Linus Torvalds, the creator of Linux, shares his thoughts on AI-generated code.
I’m not going to summarize it here, you have to hear him say it himself, also it’s already pretty short (4 minutes from the marker).
Check out his talk here for some valuable insights.

🕵️ 3. 'Damn Vulnerable LLM Agent'

Remember Damn Vulnerable Web Application? Just like that but for LLMs!
An intriguing project for those interested in prompt injection attacks.
It's a hands-on learning tool to understand and experiment with these vulnerabilities.
I tested it myself, running it is straight forward too.
All you need is to follow the instructions and an OpenAI API Key.
Check out the Github repo and give it a try.

💣 4. A Critical Supply Chain Attack on PyTorch

A fascinating and detailed write-up of how two researchers managed to hack PyTorch, one of the most widely used packages in machine learning and got access to the library’s Github repo and AWS account.
Dive into the details here.

🌐 5. Portswigger's New LLM Section

The awesome Web Security Academy by Portswigger has added a new section on LLMs.
Don't just bookmark it; start exploring it today.

🪤 6. Galah: An LLM-Powered Web Honeypot

Meet Galah, an LLM-powered web honeypot that can mimic various applications and respond dynamically to HTTP requests. It's a project by Adel Karimi that showcases the capabilities of LLMs in cybersecurity.
Check it out here.

📢 Call to Action

That's a wrap for this issue!
If you found these insights valuable, please share this newsletter with your peers. And don't forget, next Monday, we'll be back with more exciting content.
Stay safe and stay informed!

Until next time,
Samy Ghannad 🚀

Context Overflow

Discussion about this post