Decode Research's mission is to improve understanding of AI models and to accelerate interpretability research.
We currently work on:
- SAELens - Open-source library for training and analyzing Sparse Autoencoders (SAEs).
- Neuronpedia - Interpretability platform for understanding, visualizing, testing, and searching AI internals.
- SAEDashboard - Generate dashboards for visualizing SAE features.
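As background on what these tools operate on, here is a minimal sketch of the computation a sparse autoencoder performs: activations from a model are encoded into a much wider, mostly-zero feature vector and then reconstructed. The dimensions, initialization, and loss coefficient below are illustrative placeholders, not SAELens defaults.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_sae = 8, 32  # toy sizes; real SAEs use far larger dimensions

# Randomly initialized weights; in practice these are learned during training.
W_enc = rng.normal(scale=0.1, size=(d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(scale=0.1, size=(d_sae, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode an activation vector into sparse features, then reconstruct it."""
    feats = np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)  # ReLU zeroes most features
    recon = feats @ W_dec + b_dec
    return feats, recon

x = rng.normal(size=d_model)  # stand-in for an LLM residual-stream activation
feats, recon = sae_forward(x)

# Training minimizes reconstruction error plus an L1 penalty encouraging sparsity.
l2 = float(np.sum((x - recon) ** 2))
l1 = float(np.sum(np.abs(feats)))
loss = l2 + 1e-3 * l1
```

The learned feature directions are what tools like SAEDashboard and Neuronpedia then visualize and label.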
Our approach is roughly:
- Interpretability Tooling: Create and maintain libraries like SAELens, making state-of-the-art interpretability techniques more accessible.
- Infrastructure and Platform: Neuronpedia serves as a central hub for hosting, testing, visualizing, and understanding SAEs, supporting LLM interpretability work by independent researchers and larger labs alike.
- Democratizing Access: By releasing open-source SAEs for popular models, we lower the barriers to entry for AI safety research and enable a broader community to contribute.
Technical staff:
- Joseph Bloom - Independent AI alignment researcher, former TransformerLens maintainer, MATS graduate, LASR mentor, with a background in data science.
- Johnny Lin - Ex-Apple engineer, product maker. Builds Neuronpedia.
- Curt Tigges - Mechanistic interpretability researcher, former data scientist and startup founder, bringing interpretability research experience, software development skills, and machine learning knowledge to Decode.