AI Ethics
The ethical use and development of AI is one of the core values UVic AI seeks to promote. Visit our ethics channel on Discord for links and discussion on ethics topics, and our announcements channel for upcoming AI Ethics events, including online meetings and in-person events on campus.
Reading Group
We have been hosting a reading group since the summer of 2023, focusing on AI safety research, including value alignment, interpretability, AI governance, and societal impact. You can view a list of our topics below, and browse our reading-group channel for our notes from each session.
Reading Session History:
- Session 42 - Mon, Jun 3: Discussion: consciousness in machines and people.
- Session 41 - Mon, May 27: Intro to Intelligence Explosions
- Session 40 - Mon, May 20: A Mathematical Framework for Transformer Circuits
- Session 39 - Mon, May 13: “When will we know AI is conscious?” by Exurbia
- Session 38 - Mon, Mar 25: AI Policy
- Session 37 - Wed, Mar 20: Intro to Agent Foundations
- Session 36 - Mon, Mar 4: Free topic.
- Session 35 - Wed, Feb 28: Explaining grokking through circuit efficiency
- Session 34 - Mon, Feb 26: Gentleness and the Artificial Other
- Session 33 - Wed, Feb 21: A Mathematical Framework for Transformer Circuits
- Session 32 - Mon, Feb 19: Reading and discussing the Alberta Plan for AI
- Session 31 - Wed, Feb 14: High-resolution image reconstruction with latent diffusion models from human brain activity
- Session 30 - Mon, Feb 12: Section 2 of Coherent Extrapolated Volition
- Session 29 - Wed, Feb 7: Goal Misgeneralisation: Why Correct Specifications Aren’t Enough For Correct Goals
- Session 28 - Mon, Feb 5: Open discussion on AI Ethics topics
- Session 27 - Wed, Jan 31: “Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets”
- Session 26 - Mon, Jan 29: Overview and discussion of AI Ethics & Alignment
- Session 25 - 2023, Dec 4th: Reading about the UK “AI Safety Summit 2023”
- Session 24 - Nov 27th: CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
- Session 23 - Nov 20th: Transformers and the precursors / prerequisites leading up to them, using this post as a jumping-off point.
- Session 22 - Nov 13th: A second week of discussing A Mechanistic Interpretability Analysis of Grokking
- Session 21 - Nov 6th: A Mechanistic Interpretability Analysis of Grokking
- Session 20 - Oct 30th: Curiosity: Exploration by Random Network Distillation (paper), with the Two Minute Papers summary video
- Session 19 - Oct 23rd: MuZero: MuZero blog article, MuZero paper, and DreamerV3 paper
- Session 18 - Oct 16th: Zoom In: An Introduction to Circuits (from Claim 2 until the end)
- Session 17 - Oct 9th: Zoom In: An Introduction to Circuits (until the end of Claim 1)
- Session 16 - Oct 2nd: 200 Concrete Open Problems in Mechanistic Interpretability: Introduction
- Session 15 - Sept 25th: Preventing an AI-related catastrophe (sections 4 - 6)
- Session 14 - Sept 18th: Preventing an AI-related catastrophe (sections 1 - 3)
- Session 13 - Sept 11th: An AI Pause Is Humanity’s Best Bet For Preventing Extinction
- Session 12 - Aug 8th: OpenAI’s Superalignment team/goal (This week we will discuss this topic generally, so choose whichever reading or podcast you wish in order to learn more about it)
- Session 11 - Aug 1st: (My understanding of) What Everyone in Technical Alignment is Doing and Why (round 2)
- Session 10 - July 25th: (My understanding of) What Everyone in Technical Alignment is Doing and Why
- Session 9 - July 18th: Self-selected skim reading from our list of suggested readings. This week we will each choose a few suggested readings from this document to skim-read. During our discussion we will each pitch some of the readings we surveyed and collectively decide which one sounds the most interesting to read in full the following week.
- Session 8 - July 11th: LOVE in a simbox is all you need
- Session 7 - July 4th: Self-selected AI-optimist reading(s). This week we will each choose one or more readings that argue against the “AI doomer” belief that AGI poses a large enough risk of existential catastrophe that we need to take AI safety very seriously.
- Session 6 - June 27th: How likely is deceptive alignment?
- Session 5 - June 20th: Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases
- Session 4 - June 13th: Core Views on AI Safety: When, Why, What, and How
- Session 3 - June 6th: Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability (round 2) and Alignment of Language Agents
- Session 2 - May 30th: Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
- Session 1 - 2023, May 23rd: Paul Christiano: Current work in AI alignment