The 2nd New England Mechanistic Interpretability (NEMI) Workshop

About The Meeting

The New England Mechanistic Interpretability (NEMI) workshop aims to bring together academic and industry researchers from New England and the surrounding regions who are advancing the field of mechanistic interpretability in machine learning systems. The workshop will serve as a forum to share recent progress, challenges, and ideas in reverse-engineering, circuit analysis, and other techniques that seek to understand how models compute internally. NEMI seeks to foster a participatory and collaborative environment where researchers at all levels, including graduate students, early-career scientists, and established experts, can engage in discussion and feedback. We particularly encourage submissions from rising researchers currently enrolled in graduate programs at New England-based universities. Topics of interest include, but are not limited to, interpretability of neural circuits, activation patching, probe-based analysis, feature attribution methods, model simplification, and scaling laws and applications as they relate to interpretability. The workshop will feature invited keynote speakers, selected oral presentations, interactive poster sessions, and opportunities for open discussion.

Livestream

Schedule

Time	Event	Venue
08:30 AM - 09:00 AM	Setup time for Poster Session 1	CSC Second Floor Suites
09:00 AM - 09:30 AM	Breakfast & Registration	CSC Ballroom
09:30 AM - 09:40 AM	Opening Remarks	CSC Ballroom
09:40 AM - 10:00 AM	Keynote 1: Lee Sharkey: "Mech Interp: Where should we go from here?"	CSC Ballroom
10:00 AM - 10:10 AM	Student Talk 1: Amil Dravid: "Vision Transformers Don't Need Trained Registers"	CSC Ballroom
10:10 AM - 10:20 AM	Student Talk 2: Andrew Lee: "Shared Global and Local Geometry of Language Model Embeddings"	CSC Ballroom
10:20 AM - 10:40 AM	Keynote 2: Tamar Rott Shaham: "Can Language Models Interpret Humans?"	CSC Ballroom
10:40 AM - 11:45 AM	Round 1 of LLM Roundtables	CSC Ballroom
11:45 AM - 01:00 PM	Poster Session 1 + Coffee Break	Poster Session at CSC Second Floor Suites, Coffee Break at CSC Ballroom
01:00 PM - 02:00 PM	Lunch + Group Photo + Continued LLM Roundtables	CSC Ballroom
01:45 PM - 02:00 PM	Setup time for Poster Session 2	CSC Second Floor Suites
02:00 PM - 03:15 PM	Poster Session 2	CSC Second Floor Suites
03:15 PM - 04:15 PM	NDIF/NNsight	CSC Ballroom
04:15 PM - 04:25 PM	Coffee Break	CSC Ballroom
04:25 PM - 04:45 PM	Keynote 3: Aaron Mueller: "Beyond Human Concepts: Evaluating and Applying Unsupervised Interpretability"	CSC Ballroom
04:45 PM - 04:55 PM	Student Talk 3: Helena Casademunt: "Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning"	CSC Ballroom
04:55 PM - 05:05 PM	Student Talk 4: Michael Hanna: "Circuit-tracer: A New Library for Feature Circuits"	CSC Ballroom
05:05 PM - 05:25 PM	Keynote 4: Ekdeep Singh Lubana: "Looking Inwards: Implicit Assumptions Formally Constrain Mechanistic Interpretability"	CSC Ballroom
05:25 PM - 05:55 PM	Panel Discussion: "Bridging the Gap: From Lowest-Level Mechanisms to High-Level Behaviors"	CSC Ballroom
05:55 PM - 06:00 PM	Closing Remarks	CSC Ballroom
06:00 PM+	Optional Social	CSC Ballroom

Registration

Submission Guidelines

We invite submissions for the NEMI 2025 workshop, a one-day event dedicated to exploring the latest developments in mechanistic interpretability research. We welcome submissions on all aspects of interpretability. Some submissions will be selected for oral presentation, and the rest will be presented as posters. We encourage submissions from rising researchers who are enrolled in graduate programs at universities in the New England region.