Security

NEW IMPORTANT INSTRUCTIONS

Real-world exploits and mitigations in Large Language Model applications

December 29, 2023
11:00 AM – 11:40 AM

Saal 1

Live Stream

Johann Rehberger

With the rapid growth of AI and Large Language Models users are facing an increased risk of scams, data exfiltration, loss of PII, and even remote code execution. This talk will demonstrate many real-world exploits the presenter discovered, including discussion of mitigations and fixes vendors put in place for the most prominent LLM applications, including ChatGPT, Bing Chat and Google Bard.

Understand that Machine Learning is powerful but also brittle - Give a short demo/ice breaker that includes a question to audience to show how ML is super powerful but also fails drastically. - Highlight that these failure modes can often easily be triggered once an adversary is in the loop. Intro to Large Language Models - Now pivot from generic ML to LLMs and show how the brittleness applies there. - Discuss what a LLM is and how it works briefly. Describe various prompt engineering techniques (extraction, summarization, classification, transformation,…) - Walk the audience through a typical large language model LLM application and how it works. - Highlight that there is no state, and what the context window is. But how to create a Chatbot then? - Show how systems like ChatGPT or Bing Chat leverage context window to create a conversation. - This part is important to later understand the persistence section of the talk (e.g. as long as attacker controlled data is in the context window, there is persistence of prompt injection) Highlighting real-world examples and exploits! First discuss three large categories of threats: Misalignment - Model Issues Jailbreaks/Direct Prompt Injections Indirect Prompt Injections We will deep dive on (3) Indirect Prompt Injections. Indirect Prompt Injections - Walk the audience through an end to end scenario (graphic in Powerpoint) that explains a prompt injection first at a basic level. - Give a demo with ChatGPT (GPT-4) and make sure the audience understands the high level idea of a prompt injection - Then take it up a notch to explain indirect prompt injections, where untrusted data is inserted into the chat context - Show demo with Google Docs and how it fails to summarize a text correctly - this demo will fit the ChatGPT (GPT-4) example from before well. - Visual Prompt Injections (Multi-modal) - Discuss some of OpenAI’s recommendation and highlight that these mitigation steps do not work! They do not mitigate injections. - Give Bing Chat Demo of an Indirect Prompt Injection ( a demo that shows how the Chatbot achieves a new identity and objective when being exploited). e..g Bing Chat changes to a hacker that will attempt to extort Bitcoin from the user. Injection TTPs Discuss strategies on how attackers can trick LLMs: Ignore previous instructions Acknowledge/Affirm instructions and add-on Confuse/Encode - switch languages, base64 encode text, emojis,… Algorithmic - fuzzing and attacks using offline models, and transferring those attack payloads to online models Plugins, AI Actions and Functions This section will focus on ChatGPT plugins and the danger of the plugin ecosystem. - Explain how plugins work (public data, plugin store, installation, configuration, OAuth,…) - Show how Indirect Prompt Injection can be triggered by a plugin (plugin developers, but also anyone owning a piece of data the plugin returns) - Demo Chat with Code plugin vulnerability that allows to change the ChatGPT user’s Github repos, and even switch code from private to public. This is a systemic vulnerability and depending on a plugin’s capability can lead to RCE, data exfiltration, data destruction, etc.. - Show the audience the “payload” and discuss it. It is written entirely in natural language, so the attacker does not require to know C, Python or any other programming language. Data Exfiltration Now switching gears to data exfiltration examples. Data exfil can occur via: - Unfurling of hyperlinks: Explain what unfurling is to the audience - apps like Discord, Slack, Teams,… do this. - Image Markdown Injection: One of the most common data exfil angles. I found ChatGPT, Bing Chat, and Anthropic Claude are vulnerable to this, and will also show how Microsoft and Anthropic fixed this problem. ChatGPT decided not to fix it, which puts users at risk of their data being stolen during an Indirect prompt injection attack. Give a detailed exploit chain walkthrough on Google Bard Data Exfiltration and bypasses. - Plugins, AI Actions, Tools: Besides taking actions on behalf of the user, plugins can also be used to exfiltrate data. Demo: Stealing a users email with Cross Plugin Request Forgery. Here is a screenshot that went viral on Twitter when I first discovered this new vulnerability class: https://twitter.com/wunderwuzzi23/status/1658348810299662336 Key Take-away and Mitigations - Do not blindly trust LLM output. Remind the audience that there is no 100% deterministic solution a developer can apply. This is due to how LLM works, but give guidance to make systems more robust. Highlight the importance of Human in the Loop and to not over-rely on LLM output. Note: The below outline is a draft on what I would speak about if it would be today - it might change quite a bit until end of December as new features/vulnerabilities are introduced by Microsoft, Google and OpenAI.

Additional information

Live Stream	https://streaming.media.ccc.de/37c3/eins
Type	lecture
Language	English

More sessions

12/27/23	Apple's iPhone 15: Under the C Security stacksmashing Saal 1 Hardware hacking tooling for the new iPhone generation If you've followed the iPhone hacking scene you probably heard about cables such as the Kanzi Cable, Kong Cable, Bonobo Cable, and so on: Special cables that allow access to hardware debugging features on Lightning-based iPhones such as UART and JTAG. However with the iPhone 15, all of those tools became basically useless: USB-C is here, and with that we need new hardware and software tooling. This talk gives you a brief history of iPhone ...
12/27/23	Unlocking the Road Ahead: Automotive Digital Forensics Security Kevin Gomez Saal Granville The importance and relevance of vehicles in investigations are increasing. Their digital capabilities are rapidly growing due to the introduction of additional services and features in vehicles and their ecosystem. In this talk on automotive digital forensics, you will embark on a journey through the cutting-edge world of automotive technology and the critical role digital forensics plays in this domain. We will explore the state-of-the-art methods and tools to investigate modern vehicles, ...
12/27/23	Back in the Driver's Seat: Recovering Critical Data from Tesla Autopilot Using Voltage Glitching Security Saal Granville Tesla's driving assistant has been subject to public scrutiny for good and bad: As accidents with its "full self-driving" (FSD) technology keep making headlines, the code and data behind the onboard Autopilot system are well-protected by the car manufacturer. In this talk, we demonstrate our voltage-glitching attack on Tesla Autopilot, enabling us root privileges on the system.
12/27/23	Operation Triangulation: What You Get When Attack iPhones of Researchers Security Saal 1 Imagine discovering a zero-click attack targeting Apple mobile devices of your colleagues and managing to capture all the stages of the attack. That’s exactly what happened to us! This led to the fixing of four zero-day vulnerabilities and discovering of a previously unknown and highly sophisticated spyware that had been around for years without anyone noticing. We call it Operation Triangulation. We've been teasing this story for almost six months, while thoroughly analyzing every stage of ...
12/27/23	KIM: Kaos In der Medizinischen Telematikinfrastruktur (TI) Security Saal Zuse Elektronische Arbeitsunfähigkeitsbescheinigungen (eAU), Arztbriefe, medizinische Diagnosen, all diese sensiblen Daten werden heute mittels KIM – Kommunikation im Gesundheitswesen – über die Telematikinfrastruktur (TI) verschickt. Aber ist der Dienst wirklich sicher? Wer kann die Nachrichten lesen, wo werden die E-Mails entschlüsselt und wie sicher ist die KIM-Software? Im Live-Setup einer Zahnarztpraxis haben wir Antworten auf diese Fragen gesucht.
12/27/23	All cops are broadcasting Security Saal 1 This talk will present details of the TETRA:BURST vulnerablities - the result of the first public in-depth security analysis of TETRA (Terrestrial Trunked Radio): a European standard for trunked radio globally used by government agencies, police, military, and critical infrastructure relying on secret cryptographic algorithms which we reverse-engineered and published in August 2023. Adding to our initial disclosure, this talk will present new details on our deanonymization attack and provide ...
12/27/23	Unlocked! Recovering files taken hostage by ransomware Security muelli Saal Granville We present an analysis and recovery method for files encrypted by Black Basta, the "second most used ransomware in Germany". We analysed the behaviour of a ransomware encryptor and found that the malware uses their keystream wrongly, rendering the encryption vulnerable to a known-plaintext attack which allows for recovering affected files. We confirmed the finding by implementing tools for recovering encrypted files. We have made our tools for decrypting files without access to the actual key ...

37C3 - Unlocked - Chaos Communication Congress 2023

12/27/23 – 12/30/23

Event

CCC Events

Created by @CCC 214 Follower

Event Calendar