Mechanistic Interpretability
An independent reproduction of Anthropic's emotion vector research using the open-weight Llama 3.1 8B model. We confirm 10 of 11 verification criteria, and the causal steering correlation (r=0.955) is comparable to, and somewhat higher than, the original result (r=0.85).