Clustering Head: A Visual Case Study of the Training Dynamics in Transformers

[Ambroise Odonnat](https://ambroiseodt.github.io/), Wassim (Wes) Bouaziz, Vivien Cabannes