I am a PhD student in the NaCTeM group at the University of Manchester, supervised by Prof. Sophia Ananiadou. Previously, I worked as an NLP researcher at Tencent Technology in Shanghai. I received my Bachelor's and Master's degrees from Shanghai Jiao Tong University, supervised by Prof. Gongshen Liu.

I am actively seeking a Research Scientist position starting in September 2026. Please feel free to contact me at zepingyu@foxmail.com if you have any suitable openings!

My research focuses on understanding the inner mechanisms of LLMs and multimodal LLMs (MLLMs). I believe a deeper understanding of these models can inform the design of more robust, controllable, and efficient architectures, and guide practical techniques for improving model performance. In particular, I work on:

a) Understanding LLMs and MLLMs through mechanistic interpretability

I develop and apply interpretability techniques to investigate how LLMs and MLLMs perform a wide range of tasks within a shared architecture. Through these efforts, I aim to establish principled insights that guide the future development of these models.

  1. Understanding fundamental capabilities

    I analyze how core skills emerge in LLMs, including factual knowledge, arithmetic operations, and in-context learning.

    • EMNLP 2024-a: Understanding the mechanism of factual knowledge in LLMs.
    • EMNLP 2024-b: Understanding the mechanism of arithmetic operations in LLMs.
    • EMNLP 2024-c: Understanding the mechanism of in-context learning in LLMs.
  2. Understanding higher-order capabilities

    I investigate more complex behaviors in LLMs and MLLMs, such as visual question answering and latent multi-hop reasoning.

    • arXiv 2024.11: Understanding the mechanism of visual question answering in MLLMs.
    • arXiv 2025.02: Understanding why LLMs fail on latent multi-hop reasoning.

b) From understanding to improvement: Enhancing LLM and MLLM capabilities through interpretability-driven techniques

Beyond understanding model behavior, I aim to improve the performance and reliability of LLMs and MLLMs. My current efforts focus on three directions:

  1. Creating interpretability tools to help users understand and trust LLM outputs

    I build interpretability tools that expose how LLMs and MLLMs reason internally, helping users understand the rationale behind model responses. These tools aim to improve transparency and increase user confidence in the model’s outputs, especially in high-stakes or ambiguous scenarios.

    • arXiv 2024.11: Creating an interpretability tool for identifying important image patches and understanding why MLLMs generate false answers in visual question answering.
    • arXiv 2025.02: Creating an interpretability tool for analyzing neuron-level information flow and understanding why LLMs cannot perform latent multi-hop reasoning well.
  2. Analyzing model failures to inform architectural design

    I use interpretability techniques to analyze how and why models fail, guiding the design of new architectures, modules, and strategies.

    • arXiv 2025.02: Designing a new module, back attention, to improve LLMs’ latent multi-hop reasoning ability.
  3. Improving LLM/MLLM capability via model editing and model merging

    I develop methods to identify the specific parameters responsible for different capabilities, enabling targeted parameter changes such as model editing and model merging (a minimal illustrative sketch appears after this list).

    • arXiv 2025.01: Locate-then-edit for neuron-level model editing, to reduce gender bias and improve LLMs’ general ability.
    • arXiv 2025.05: Locate-then-merge for merging a base model and its post-trained counterpart, to mitigate catastrophic forgetting and improve MLLMs' language and multimodal abilities.
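
The snippet below is a minimal, hypothetical sketch of the general locate-then-merge idea, not the exact procedure from the papers above: score each neuron's importance, then transfer only the most important neurons' weights from the post-trained model into the base model. The tensor shapes, the weight-drift importance score, and the `keep_ratio` threshold are illustrative assumptions.

```python
import torch

def locate_then_merge(base_w, tuned_w, importance, keep_ratio=0.05):
    """Hypothetical neuron-level merge: copy only the most "important"
    neurons (rows) of a post-trained weight matrix into the base model.

    base_w, tuned_w: [num_neurons, hidden] -- one FFN weight matrix,
                     with each row treated as one neuron (illustrative layout)
    importance:      [num_neurons] per-neuron importance scores
                     (an assumed scoring rule, not the papers' actual method)
    """
    k = max(1, int(keep_ratio * importance.numel()))
    top = torch.topk(importance, k).indices   # neurons selected for transfer
    merged = base_w.clone()
    merged[top] = tuned_w[top]                # targeted parameter change
    return merged, top

# Toy usage with random tensors standing in for one layer's weights.
torch.manual_seed(0)
base_w  = torch.randn(1024, 4096)   # base model layer
tuned_w = torch.randn(1024, 4096)   # post-trained (e.g. multimodal-tuned) layer
importance = (tuned_w - base_w).abs().mean(dim=1)   # illustrative score: weight drift
merged_w, moved = locate_then_merge(base_w, tuned_w, importance, keep_ratio=0.05)
print(f"transferred {moved.numel()} of {base_w.shape[0]} neurons")
```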

📝 Publications

* Equal contribution

Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs

Zeping Yu, Sophia Ananiadou [arXiv 2025.05]

Back Attention: Understanding and Enhancing Multi-Hop Reasoning in Large Language Models

Zeping Yu, Yonatan Belinkov, Sophia Ananiadou [arXiv 2025.02]

Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing

Zeping Yu, Sophia Ananiadou [arXiv 2025.01]

Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering

Zeping Yu, Sophia Ananiadou [arXiv 2024.11]

Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis

Zeping Yu, Sophia Ananiadou [EMNLP 2024 (main)]

Neuron-Level Knowledge Attribution in Large Language Models

Zeping Yu, Sophia Ananiadou [EMNLP 2024 (main)]

How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning

Zeping Yu, Sophia Ananiadou [EMNLP 2024 (main)]

CodeCMR: Cross-Modal Retrieval for Function-Level Binary Source Code Matching

Zeping Yu, Wenxin Zheng, Jiaqi Wang, Qiyi Tang, Sen Nie, Shi Wu [NeurIPS 2020]

Order Matters: Semantic-Aware Neural Networks for Binary Code Similarity Detection

Zeping Yu*, Rui Cao*, Qiyi Tang, Sen Nie, Junzhou Huang, Shi Wu [AAAI 2020]

Adaptive User Modeling with Long and Short-Term Preferences for Personalized Recommendation

Zeping Yu, Jianxun Lian, Ahmad Mahmoody, Gongshen Liu, Xing Xie [IJCAI 2019]

Sliced Recurrent Neural Networks

Zeping Yu, Gongshen Liu [COLING 2018]

📖 Education

  • 2023.09 - 2026.09 (Expected), PhD in Computer Science, University of Manchester.
  • 2017.09 - 2020.02, Master of Engineering, Shanghai Jiao Tong University.
  • 2013.09 - 2017.06, Bachelor of Engineering, Shanghai Jiao Tong University.