CO #2 - Christmas Edition: Unwrapping Insights and Threats in the AI World
From Cheerful Bots to Serious Threats
π°π― This Week's Headlines:
π€π AI & Cybersecurity Deep Dive with Meisler & Bernadett-Shapiro
π¨ Prompt Injection: AI's Hidden Danger by Joseph Thacker
π When AI Goes Awry: The $1 Chevy Tahoe
π CAMLIS 2023: Uncovering Gems in InfoSec
π΅οΈββοΈ Data Poisoning: AI's Undercover Threat
π Extracting Training Data from ChatGPT
π‘οΈ NIST's AI Risk Management Framework: A First Look
π The Real Danger of Deepfakes: Beyond Satire
π£οΈ AI in Cybersecurity: Daniel Meisler & Gabe Bernadett-Shapiro's Deep Dive π€π
In this insightful conversation, Daniel Meisler and Gabe Bernadett-Shapiro delve into how AI is reshaping cybersecurity. They cover key topics like Accelerationism vs Deaccelerationism, AGI vs. ASI , automating security processes using AI, the role of AI in enhancing threat intelligence, and a proof of concept for auto-analyst, a tool that, as the name implies, automatically analyzes a collection of intelligence.
This discussion is a must-watch for anyone interested in the intersection of AI and security.
π Key Moments:
Auto-Analyst POC: A Cautionary Tale
π‘ Remember: Understand risks and compliance first!
π POC Source Code
π¨ Prompt Injection: A Hidden Danger in AI
Joseph Thacker discusses AI Security, misconceptions about AI security, and Prompt Injection attacks and mitigations in AI applications.
AI Application Security by Joseph Thacker
π The $1 Chevy Tahoe Deal
ChrisJBakke's humorous interaction with GM's AI bot, which agreed to sell a Chevy Tahoe for just $1, highlights the vulnerabilities and unpredictable nature of AI. This incident, while amusing, underscores the importance of safeguarding AI against manipulation.
π§ CAMLIS 2023: A Treasure Trove of InfoSec Knowledge
CAMLIS, a lesser-known but outstanding conference on applied machine learning in information security, featured a range of engaging talks. Some of my favourite talks cover diverse topics like using SQL for cybersecurity ML operations, detecting insider threats, binary code similarity search, and the risks of model leeching in LLMs.
It was 100% to attend and watch the talks over Zoom too.
I watched all the talks online this year, but Iβm planning to go in person for the next one!
All recordings: Conference Recordings
π Some of My Favourite Talks:
π΅οΈββοΈ Data Poisoning: Like DDOS, but for AI
Data Poisoning is a way to taint the training data. Basically, it's about slipping bad data into the training data, so the model learns the wrong things. It's hard to spot and even harder to fix. It can easily fly under the radar due to its distributed nature.
I came across a case where artists used this to keep their work from being used for training AI models which is an understandable rationale (however, in a perfect world I much rather the artists be able to opt in and get paid, or opt-out without resorting to these types of measures, but I digress); It got me thinking: This same trick could be used to skew any model to push biases or tipping decisions a certain way.
Assuming you somehow detect the attack (How?), fixing it is going to be very hard and very expensive. Why?
First off, AI models arenβt exactly open books β understanding their reasoning and how and why they came up with a specific response is already a tough nut to crack. The βexplainability problemβ makes it exponentially harder to pinpoint the root of any issue.
Then there's Machine Un-learning which involves the removal of these biases or corrupted data from AI models. This task is not just challenging but potentially prohibitively expensive.
Several resources explore this issue, including methods like "Nightshade," which targets generative models with a small number of poison samples.
π Stealing Knowledge
Researchers extracted a few megabytes of ChatGPT's training data for $200. While this might seem pricey, it's actually quite a feat and likely to become cheaper as others refine the method. They state:
Our attack circumvents the privacy safeguards by identifying a vulnerability in ChatGPT that causes it to escape its fine-tuning alignment procedure and fall back on its pre-training data.
For more, read the article: Extracting Training Data from ChatGPT
π‘οΈ NIST's Blueprint for AI Risk Management
NIST's AI Risk Management Framework is designed to handle risks in AI technologies. I've gone through it and, to be honest, it feels a bit general and light on practical advice. However, given the emerging nature of this field, it's a solid first effort.
AI Risk Management Framework - Home Page
AI Risk Management Framework - PDF
π Stop Worrying About Deepfakes? I donβt think so.
This article kind of brushes off deepfakes as just modern satire. I disagree. Deepfakes are more than jokes; they're serious threats and they pose real risks, they may have βfakeβ in their name, but they will affect some people in real ways.
Deepfakes can be used to blackmail people, even leading to tragic outcomes, there are cases of honor killings in some cultures because of fake, scandalous photos.
They can also fuel violence in already tense and polarized areas leading to massive loss of life - Imagine the 2017 Rohingya genocide in Myanmar that used social media as a propagation platform, but this time with "video evidenceβ.
Plus, Deepfakes are a handy tool for bad actors running elaborate scams. The article touches on a very important question: βWhere do you draw the line between a harmless fake and a dangerously deceptive one?β but leaves that question hanging.
Deepfakes are yet another reason why we need to figure out AI security and safety.
π£ Share & Anticipate!
If you found this edition of ContextOverflow helpful, please share it with your network! Stay tuned for our next edition, where we'll dive even deeper into the world of AI and cybersecurity. Your feedback and suggestions are always welcome.
See you next Monday! π
Samy Ghannad