interpret ai
Our goal is to bring cutting-edge interpretability techniques to model capabilities research. Interpretability as a research direction was pioneered by Chris Olah and his collaborators around 2016, and since then it has lived mostly under the AI safety/alignment agenda, remaining fairly abstract and of little direct practical use. However, recent advances in understanding models' internal mechanisms have proven effective in all sorts of capabilities-related tasks. We aim to provide a proving ground that not only demonstrates that interpretability can be useful in frontier model development, but also helps us answer fundamental first-principles questions: How do we merge models most effectively? How do we extract information from a model? Can we put memory inside a language model?
And it's not just buzzwords - we do stuff! Go look at our Research and Product sections and check out our repo.
Work at interpret ai
We have no open positions at the moment; however, we are always looking for strong technical contributors. We are a team of practically oriented, academic-minded people who really think about how the world works. If that sounds like you, email us at interpretai.capabilities@gmail.com.
Team
We are a team of 4 contributors with academic and competitive math backgrounds. We have combined experience from companies such as Jane Street, Meta, Citadel, XTX Markets, and Millennium. We are affiliated with universities and labs including the University of Cambridge, EPFL, and the Harvard Kreiman Lab, and have won medals at competitions like the IMO and ICPC. However, we try not to take ourselves too seriously, remaining humble and curious, and just doing things that work.
FAQ
- Why interpretability to capabilities? Are you sure it's going to work?
- We were not sure ourselves until we saw it working: merging models is far more efficient with sparse autoencoders, the technique introduced in Anthropic's "Towards Monosemanticity" paper.
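For readers unfamiliar with the technique: a sparse autoencoder learns an overcomplete, sparse dictionary of features over a model's activations. The sketch below is purely illustrative (it is not our merging pipeline); all shapes, hyperparameters, and the synthetic "activations" are made up for the example, and training is plain full-batch gradient descent.

```python
import numpy as np

# Illustrative sparse-autoencoder sketch: learn an overcomplete, sparse
# feature dictionary over (synthetic, stand-in) model activations.
# Every shape and hyperparameter here is an assumption for the example.
rng = np.random.default_rng(0)

d_model, d_hidden, n = 16, 64, 512       # activation dim, dictionary size, batch
L1 = 1e-3                                # sparsity penalty strength
X = rng.normal(size=(n, d_model))        # stand-in for real activations

W_enc = rng.normal(scale=0.1, size=(d_model, d_hidden))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(scale=0.1, size=(d_hidden, d_model))

def forward(X):
    f = np.maximum(X @ W_enc + b_enc, 0.0)   # sparse feature activations (ReLU)
    return f, f @ W_dec                      # features, reconstruction

def loss(X):
    f, X_hat = forward(X)
    return np.mean((X - X_hat) ** 2) + L1 * np.mean(np.abs(f))

lr, loss_start = 0.1, loss(X)
for _ in range(200):                         # full-batch gradient descent
    f, X_hat = forward(X)
    err = X_hat - X
    grad_W_dec = f.T @ err * (2.0 / (n * d_model))
    df = err @ W_dec.T * (2.0 / (n * d_model))           # backprop through decoder
    df = np.where(f > 0, df + L1 / (n * d_hidden), 0.0)  # + L1 grad, ReLU gate
    W_enc -= lr * (X.T @ df)
    b_enc -= lr * df.sum(axis=0)
    W_dec -= lr * grad_W_dec
loss_end = loss(X)
```

The learned decoder rows act as interpretable feature directions; it is these feature dictionaries, rather than raw weights, that make operations like model merging more tractable.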
- What is your business plan? Who are your potential clients?
- We do not yet have a clear vision of product-market fit, and we believe we first need to scale our research to produce state-of-the-art results on large models; this is the primary reason we are seeking VC funding. We believe that scaling models further will require a deeper understanding of model internals, or at the very least will benefit enormously from it. And not only that: we can produce exciting results even now, even with smaller models!