This course on AI safety focuses on fundamental safety challenges in modern AI systems, including those arising in deep neural networks, large language models, and AI agent systems. It examines key safety dimensions such as robustness, backdoor resistance, fairness, privacy, safety alignment, hallucination, interpretability, and agentic AI safety. It is part of the undergraduate program at the School of Computing and Information Systems at Singapore Management University.
For each safety dimension, the course introduces the underlying concepts and definitions, presents systematic methods for evaluating whether, and to what extent, an AI system satisfies the required safety properties, and explores techniques for improving these aspects of safety.
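As a small taste of what "evaluating a safety property" can look like, the hypothetical sketch below (not drawn from the course materials) certifies the robustness of a toy linear binary classifier against bounded input perturbations: for a linear model, a prediction provably cannot be flipped by any L-infinity perturbation of radius eps whenever the margin exceeds eps times the L1 norm of the weights.

```python
import numpy as np

# Hypothetical toy example: certifying robustness of a linear binary
# classifier f(x) = sign(w.x + b) against L-infinity perturbations.
# The worst-case drop in the margin over all perturbations of radius
# eps is eps * ||w||_1, so the prediction is provably stable iff
# y * (w.x + b) > eps * ||w||_1.

w = np.array([2.0, -1.0])  # assumed example weights
b = 0.5                    # assumed example bias

def is_robust(x, y, eps):
    """True if no L-inf perturbation of radius eps can flip label y."""
    margin = y * (w @ x + b)
    return margin > eps * np.abs(w).sum()

x = np.array([1.0, 0.5])      # margin here is 2.0; ||w||_1 is 3.0
print(is_robust(x, +1, 0.1))  # True: 2.0 > 0.3, prediction certified
print(is_robust(x, +1, 1.0))  # False: 2.0 < 3.0, certificate fails
```

Deep networks require far more sophisticated evaluation methods than this closed-form check, which is exactly the kind of gap the robustness weeks address.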
Week 1: Introduction
Week 2: AI Robustness
Week 3: Improving AI Robustness
Week 4: AI Backdoors
Week 5: Mitigating AI Backdoors
Week 6: AI Fairness
Week 7: Improving AI Fairness
Week 8: Recess
Week 9: AI Privacy
Week 10: Safety Alignment
Week 11: Hallucination
Week 12: Interpretability
Week 13: Agentic AI Safety
Week 14: Project Presentation
Each class includes five in-class exercises and programming examples that reinforce the concepts through hands-on practice.