Hi, I’m Stephen Casper. I just graduated from Harvard College, and this fall, I’ll be a new Ph.D. student at MIT in Computer Science. Currently, I work with the Harvard Kreiman Lab, and I’m also a former intern with the Center for Human-Compatible AI. My main interests are machine learning and technical AI alignment. My research interests include interpretability, adversaries, robust reinforcement learning, and decision theory. I’m also an Effective Altruist trying to do the most good I can.

You’re welcome to email me. You can also find me on Google Scholar, GitHub, LinkedIn, or LessWrong. I also have a personal feedback form. Feel free to use it to send me anonymous, constructive feedback about how I can be a better person. For now, I’m not posting my resume/CV here, but please email me if you’d like to talk about projects or opportunities.

Also feel free to ask me about hissing cockroaches, a bracelet I made for a monkey, what I learned from getting my genome sequenced, a time I helped someone with a “mammoth” undertaking, or the jar I keep on my windowsill.


Casper, S., Boix, X., D’Amario, V., Guo, L., Schrimpf, M., Vinken, K., & Kreiman, G. (2021). Frivolous Units: Wider Networks Are Not Really That Wide. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol 35,)

Filan, D., Casper, S., Hod, S., Wild, C., Critch, A., & Russell, S. (2021). Clusterability in Neural Networks. arXiv preprint arXiv:2103.03386.

Casper, S. (2020). Achilles Heels for AGI/ASI via Decision Theoretic Adversaries. arXiv preprint arXiv:2010.05418.

Saleh, A., Deutsch, T., Casper, S., Belinkov, Y., & Shieber, S. (2020). Probing Neural Dialog Models for Conversational Understanding. arXiv preprint arXiv:2006.08331.


I’m working on a few projects involving modularity in neural networks, adversarial policies in reinforcement learning, and feature-level adversaries for image classifiers. Feel free to reach out.


