Alexandre Variengien

About

I am a researcher working on leveraging AI to improve collective epistemics and coordination.

I am currently working as a PIBBSS research fellow, where I explore how AI can help us understand and explain complex ideas through better analogies and metaphors. You can find a short intro to my project here.

I am a research enthusiast, developer, and graphic designer for fun. My interests include AI for epistemics and coordination, interpretability, biology, self-organized systems, and bio-inspired AI.

Resume

Experience

PIBBSS Research Fellow

June 2025 - Present

San Francisco, USA

I explore how AI can help us understand and explain complex ideas through better analogies and metaphors.

Technology Specialist, EU AI Office

October 2024 - May 2025

Brussels, Belgium

I worked at the European Commission, in the AI safety unit of the EU AI Office. I contributed to preparing the enforcement of the EU AI Act obligations for the providers of general-purpose AI models with systemic risks.

Co-founder & Researcher

December 2023 - July 2024

Paris, France

I co-founded the CeSIA (Centre pour la Sécurité de l'IA). I led the research project BELLS, a benchmark to evaluate safeguards used to filter the inputs and outputs of LLMs.

Master's Thesis at Conjecture

February 2023 - August 2023

Conjecture, London, UK

I was working on scalable mechanistic interpretability, looking for macroscopic universal motifs inside LLMs.

Internship at Redwood Research

August 2022 - February 2023

Redwood Research, Berkeley, USA

Research on mechanistic interpretability of language models. I also worked as a research manager, leading a team of 10 residents during the REMIX residency program.

Research Internship at the Living Technology Lab

April - July 2021

OsloMet, Oslo, Norway

Self-organizing systems engineering. The goal of the project was to design a neural cellular automaton to solve a control task.

Research Internship in the Mnemosyne Team

May - July 2020

IMN Bordeaux, Bordeaux, France

Comparison and visualization of recurrent neural networks solving a language processing task.

Education

Master of Computer Science (Second Year)

2021 - 2022

EPFL, Lausanne, Switzerland

Double degree program between ENS de Lyon and EPFL. Courses focused on machine learning and data science.

Master of Computer Science (First Year)

2020 - 2021

ENS de Lyon, Lyon, France

Optional courses from the biology department (neurology and evolution) and in physics (dynamical systems).

Bachelor of Computer Science

2019 - 2020

ENS de Lyon, Lyon, France

Courses in theoretical computer science. Graduated with 18.13/20.

Preparatory Classes MPSI/MP*

2017 - 2019

Lycée Champollion, Grenoble, France

Two years of intensive courses in mathematics, physics, and computer science to prepare for engineering school competitive exams.

Publications

BELLS: A Framework Towards Future Proof Benchmarks for the Evaluation of LLM Safeguards

Diego Dorn, Alexandre Variengien, Charbel-Raphaël Segerie, Vincent Corruble

2024

This paper proposes a framework to evaluate the generalization capabilities of LLM input-output safeguards like Llama Guard in detecting unknown failure modes. We presented this work at an oral session of the NextGen AI Safety Workshop at ICML 2024.

Look Before You Leap: A Universal Emergent Decomposition of Retrieval Tasks in Language Models

Alexandre Variengien, Eric Winsor

2023

In the search for "units of interpretability," I decided to zoom out instead of zooming in, looking for universal macroscopic motifs in LLMs. In other words, are there such things as "organs" inside LLMs? This work suggests that the answer is yes! Preprint available on arXiv. This work was part of my Master's thesis, available here. It received a spotlight at the Mechanistic Interpretability Workshop at ICML 2024.

How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model

Michael Hanna, Ollie Liu, Alexandre Variengien

2023

Paper accepted at NeurIPS 2023. I supervised this research project during the REMIX residency program.

Interpretability in the Wild: A Circuit for Indirect Object Identification in GPT-2 Small

Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt

2022

Accepted as a poster at the ICLR 2023 conference. I recommend reading the arXiv version, which is the most up-to-date version of this work.

Towards self-organized control: Using neural cellular automata to robustly control a cart-pole agent

Alexandre Variengien, Stefano Nichele, Tom Glover, Sidney Pontes-Filho

2021

This paper was published in the IMI journal. Inspired by the Distill thread on differentiable self-organizing systems, I also developed an interactive article.

Recurrent Neural Networks Models for Developmental Language Acquisition: Reservoirs Outperform LSTMs

Xavier Hinaut, Alexandre Variengien

2020

Poster accepted to the 12th Annual Meeting of the Society for the Neurobiology of Language.

A Journey in ESN and LSTM Visualisations on a Language Task

Alexandre Variengien, Xavier Hinaut

2020

Paper available as an arXiv preprint. We compared two architectures of recurrent neural networks on a language task. We also presented a new tool to visually grasp the inner representation of the sentences learned by the models.

Projects

AI safety distillation contest

May 2022

As part of EA UC Berkeley's contest, I wrote a distillation of the ELK report to make its core ideas easier to understand. My submission was awarded a prize.

SACCHA

2020-2021

Interdisciplinary project in collaboration with VetAgro Sup. The goal was to create an educational simulator for dog auscultation, to be used by veterinary students. I was involved in hardware and UI design.

Hackathon Hack COVID19

April 2020

Creation of an online interactive epidemiological model to evaluate the impact of social tracing applications on the propagation of SARS-CoV-2.