We’re looking for a PhD student to help us extend WeightsLab to support LLM development, while also shipping a slim, high-quality LLM. You’ll contribute to both platform capabilities (e.g., transformer-level introspection, attention analysis) and model design.
This opportunity combines practical engineering with mechanistic interpretability: the goal is to understand the internals of small LLMs, intervene during training, and produce a model that’s not only efficient but also transparent and editable.
What You’ll Do
Extend Graybox to support transformer-based models and surface LLM-specific insights
Train and iteratively refine a sub-256M-parameter LLM
Apply mechanistic interpretability tools and concepts to guide architecture choices
Enable interactive model operations: neuron pruning, attention head control, layer freezing
Benchmark performance vs. transparency trade-offs
Deliver a documented, reusable model with accompanying evaluation and tooling
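To give a flavor of the interactive model operations mentioned above, here is a minimal PyTorch sketch of two of them, layer freezing and neuron pruning via weight masking. This is an illustrative example using generic PyTorch techniques, not the actual WeightsLab/Graybox API; `TinyBlock`, `freeze_layer`, and `prune_neurons` are hypothetical names invented for this sketch.

```python
import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    """Toy stand-in for one transformer block (attention + feed-forward).

    Not the real model; just enough structure to demonstrate the operations.
    """
    def __init__(self, d_model: int = 16, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Linear(d_model, d_model)

def freeze_layer(module: nn.Module) -> None:
    """Layer freezing: exclude a module's weights from gradient updates."""
    for p in module.parameters():
        p.requires_grad = False

def prune_neurons(linear: nn.Linear, neuron_idx: list[int]) -> None:
    """Neuron 'pruning' by zeroing the corresponding output rows in place."""
    with torch.no_grad():
        linear.weight[neuron_idx] = 0.0
        if linear.bias is not None:
            linear.bias[neuron_idx] = 0.0

block = TinyBlock()
freeze_layer(block.attn)          # attention weights no longer trainable
prune_neurons(block.ff, [0, 3])   # feed-forward neurons 0 and 3 silenced
```

Attention-head control can be built analogously, e.g. by masking per-head output slices, though doing it cleanly usually requires hooking into the attention computation rather than post-hoc weight edits.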