Collections

Collections including paper arxiv:2409.04185
Single-Layer SAEs with Transformers
TopK SAEs trained on the residual stream activation vectors from a single transformer layer. The collection also includes the transformers.
Multi-Layer SAEs with Tuned Lens and Transformers
Single SAEs trained on the residual stream activation vectors from every transformer layer simultaneously using tuned lenses. The collection also includes the transformers.
Multi-Layer SAEs with Transformers
Single SAEs trained on the residual stream activation vectors from every transformer layer simultaneously. The collection also includes the transformers.
🔍 Interpretability & Analysis of LMs
Outstanding research in LM interpretability and evaluation, summarized
Multi-Layer SAEs with Tuned Lens
Single SAEs trained on the residual stream activation vectors from every transformer layer simultaneously using tuned lenses.