This course on AI safety focuses on fundamental safety challenges in modern AI systems, including those arising in deep neural networks, large language models, and AI agent systems. It examines key safety dimensions such as robustness, backdoor resistance, fairness, privacy, safety alignment, hallucination, interpretability, and agentic AI safety. It is part of the undergraduate program at the School of Computing and Information Systems at Singapore Management University.
For each safety dimension, the course introduces the underlying concepts and definitions, presents systematic methods for evaluating whether, and to what extent, an AI system satisfies the required safety properties, and explores techniques for improving these aspects of safety.
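As a small taste of what "evaluating a safety property" can look like, the hypothetical sketch below (not drawn from the course materials) certifies the robustness of a toy linear binary classifier against bounded input perturbations: for a linear model, a prediction provably cannot be flipped by any L-infinity perturbation of radius eps whenever the margin exceeds eps times the L1 norm of the weights.

```python
import numpy as np

# Hypothetical toy example: certifying robustness of a linear binary
# classifier f(x) = sign(w.x + b) against L-infinity perturbations.
# The worst-case drop in the margin over all perturbations of radius
# eps is eps * ||w||_1, so the prediction is provably stable iff
# y * (w.x + b) > eps * ||w||_1.

w = np.array([2.0, -1.0])  # assumed example weights
b = 0.5                    # assumed example bias

def is_robust(x, y, eps):
    """True if no L-inf perturbation of radius eps can flip label y."""
    margin = y * (w @ x + b)
    return margin > eps * np.abs(w).sum()

x = np.array([1.0, 0.5])      # margin here is 2.0; ||w||_1 is 3.0
print(is_robust(x, +1, 0.1))  # True: 2.0 > 0.3, prediction certified
print(is_robust(x, +1, 1.0))  # False: 2.0 < 3.0, certificate fails
```

Deep networks require far more sophisticated evaluation methods than this closed-form check, which is exactly the kind of gap the robustness weeks address.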
Week 1: Introduction
Week 2: AI Robustness
Week 3: Improving AI Robustness
Week 4: AI Backdoors
Week 5: Mitigating AI Backdoors
Week 6: AI Fairness
Week 7: Improving AI Fairness
Week 8: Recess
Week 9: AI Privacy
Week 10: Safety Alignment
Week 11: Hallucination
Week 12: Interpretability
Week 13: Agentic AI Safety
Week 14: Project Presentation
Each class includes five in-class exercises and programming examples that reinforce the concepts through hands-on practice.