Discussion Forum
The SAIRC Discussion Forum is a space for student AI scholars to share what they're thinking about, with no formal research paper required. Posts can be submitted anonymously and can take a wide range of formats:
- Tutorials or deep-dives on AI topics
- Fresh perspectives: reframings or new ways of looking at something
- Novel research results communicated in plain language
- Resources and opportunities in AI (summer programs, tools, datasets)
- Thought experiments and speculative ideas
- Notes or study guides from courses
Recent Posts
(When) Is Mechanistic Interpretability Identifiable?
I recently finished a paper, "Characterizing Mechanistic Uniqueness and Identifiability Through Circuit Analysis," alongside a group of three others and a mentor. This post discuss…
Would a Language Model Push You Off A Bridge? Pt. 2
Utilitarianism, to recap, is a consequentialist decision-making framework which states that the best actions produce the most 'pleasure' for the greatest number of people. Deontolo…
Would a Language Model Push You Off A Bridge?
In the context of this post, 'utilitarianism' is a consequentialist decision-making framework which operates under the idea that the best action produces the most pleasure for the …
Comparison of Convolutional & Feed-Forward Architectures on MNIST Digit Classification
I was unable to use Google Colaboratory for quite a while, so it took me much longer than necessary to make this post. However, I'm finally able to log back in! This is the experimen…
The Convolutional Neural Network
The last few posts I've written about AI consciousness and infinite suffering have been fairly dire, so I decided to switch things up and write about something more practical: the …
AI Consciousness: A Biological Perspective
Most policy debates about AI revolve around its potential upsides: whether AI as an augmented decision-maker can solve existential risks like climate change or pandemics. But a dif…
Autoregression & Next-Token Prediction
Every time a language model generates text, it's doing something surprisingly simple: predicting one token at a time, with each choice shaped by everything that came before. This p…