Andrei Alexandru @inwaves
AI safety research

Hi! I'm Andrei Alexandru. I'm passionate about making the future go well for humans, especially in light of increasing AI capabilities.

I work on AI safety research, focusing on evaluating the capabilities of large language models and on understanding the alignment problem.

On evals (May 1, 2024)
Understanding mixture-of-depths (Apr 5, 2024)
What sorts of systems can be deceptive? (Oct 31, 2022)
Why you might expect homogeneous takeoff: evidence from ML research (Jul 17, 2022)
Inductive bias of neural networks on 1D regression: an empirical examination (Jun 1, 2022)