cv

Basics

Name Leonard Friedrich Bereska
Label AI Safety Researcher
Email leonard [dot] bereska [at] uva [dot] nl
Url https://leonardbereska.github.io/
Summary PhD Candidate at the University of Amsterdam, specializing in AI Safety and Mechanistic Interpretability. Focused on making AI systems more transparent, interpretable, and aligned with human values.

Work

  • 2021.10 - Present
    PhD Candidate
    University of Amsterdam
    Researching the interpretability of transformer models through monosemanticity engineering, with the goal of improving AI safety. Focused on AI alignment strategies for long-term value preservation.
    • AI Safety
    • Mechanistic Interpretability
    • Transformer Models
    • Monosemanticity Engineering
  • 2019.02 - 2021.09
    Research Assistant
    University of Heidelberg
    Incorporated principles of dendritic computation into neural networks. Explored novel optimization criteria for dynamical systems.
    • Dendritic Computation
    • Neural Networks
    • Dynamical Systems
  • 2017.08 - 2017.10
    Research Intern
    Central Institute of Mental Health
    Investigated initialization schemes for a piecewise-linear recurrent neural network using expectation-maximization.
    • Recurrent Neural Networks
    • Initialization Schemes
    • Expectation-Maximization

Volunteer

  • 2023.09 - Present
    Amsterdam, Netherlands
    Co-founder and Core Team Member
    AI Safety Initiative Amsterdam
    Co-founded and actively contribute to a group dedicated to promoting AI safety research and awareness in Amsterdam.
    • Organized OpenAI Talk and Q&A on AI and Existential Risk
    • Coordinated Panel Discussion on AI Risks: From Today to Doomsday
    • Facilitated reading groups on AGI Safety Fundamentals

Education

  • 2021.10 - Present
    Amsterdam, Netherlands
    PhD
    University of Amsterdam
    Artificial Intelligence
    • Continual Learning
    • Mechanistic Interpretability
  • 2016.09 - 2019.02
    Heidelberg, Germany
    MSc
    University of Heidelberg
    Computational Physics
    • Visual Learning and Computer Vision
    • Machine Learning
    • Artificial Intelligence
    • Time Series Analysis
  • 2014.09 - 2015.07
    Taipei, Taiwan
    Exchange Student
    National Taiwan University
    Mandarin Chinese
    • Advanced-level Mandarin Chinese
  • 2012.09 - 2016.07
    Heidelberg, Germany
    BSc
    University of Heidelberg
    Physics
    • Analysis
    • Linear Algebra
    • Statistical Physics
    • General Relativity
  • 2006.09 - 2012.07
    Celle, Germany
    High School
    Gymnasium Ernestinum
    Abitur
    • Physics, Mathematics, Chemistry
    • Latin, History

Certificates

ML Safety Course
Dan Hendrycks, Center for AI Safety 2023-08

Skills

AI Safety Research
  • Mechanistic Interpretability
  • AI Alignment
  • Transformer Models
  • Monosemanticity Engineering
Programming
  • Python
  • JAX
  • PyTorch
  • Functional Programming
  • Git
  • Bash
  • Linux
  • LaTeX
Machine Learning
  • Deep Learning
  • Reinforcement Learning
  • Computer Vision
  • Natural Language Processing
  • Dynamical Systems

Languages

German
  • Native
English
  • Fluent
Dutch
  • Conversational
Mandarin
  • Conversational
French
  • Basic
Italian
  • Basic
Latin
  • Advanced Latinum
Ancient Greek
  • Graecum
Old Hebrew
  • Hebraicum

Interests

AI Safety
  • Alignment
  • Robustness
  • Transparency
  • Value Learning
Mechanistic Interpretability
  • Neural Networks
  • Feature Visualization
  • Circuit Analysis
Dynamical Systems
  • Nonlinear Dynamics
  • Chaos Theory
  • Time Series Analysis