This course focuses on a range of safety issues in AI systems, such as robustness, backdoor-freeness, fairness, privacy, and interpretability. It covers systematic ways of evaluating whether a given AI system (typically a neural network or a large language model) satisfies different quality metrics, and how to improve the system’s robustness, backdoor-freeness, fairness, privacy, interpretability, and safety in general.
This course is part of the MITB program at SCIS, Singapore Management University.
Agenda
Week 1: Introduction to AI Safety
Week 2: AI Robustness
Week 3: AI Backdoor
Week 4: AI Fairness
Week 5: AI Privacy
Week 6: Safety Alignment
Week 7: Hallucination
Week 8: Interpretability
Week 9: Agentic AI Safety
Week 10: Project Presentation
This course comes with many in-class exercises and programming examples (links provided in the slides).