I recently gave a guest lecture on AI Agent Security in MIT's 6.566 class (a survey course on computer systems security). We covered:
- Foundations of LLMs, from next-token prediction to conversational chat and tool use
- Foundations of agents, including ReAct and CodeAct
- AI agent security
- Simon Willison's dual LLM pattern
- CaMeL's capability system
The GitHub repo has lecture notes plus code demos for all the concepts covered (in my opinion, code makes things really concrete, and writing the code helped me better understand Dual LLM and CaMeL). The full lecture is on YouTube: https://www.youtube.com/watch?v=w0oGeKxD5Fc.
Good one. One thing that's becoming clear is that agent security is less about jail break prompts and more about permission boundries, hidden context flow and un intended tol behavior.
I recently gave a guest lecture on AI Agent Security in MIT's 6.566 class (a survey course on computer systems security). We covered:
- Foundations of LLMs, from next-token prediction to conversational chat and tool use - Foundations of agents, including ReAct and CodeAct - AI agent security - Simon Willison's dual LLM pattern - CaMeL's capability system
The GitHub repo has lecture notes plus code demos for all the concepts covered (in my opinion, code makes things really concrete, and writing the code helped me better understand Dual LLM and CaMeL). The full lecture is on YouTube: https://www.youtube.com/watch?v=w0oGeKxD5Fc.
Good one. One thing that's becoming clear is that agent security is less about jail break prompts and more about permission boundries, hidden context flow and un intended tol behavior.
great