I am a PhD student at the University of Manchester, supervised by Prof. Sophia Ananiadou. Previously, I worked as a deep learning researcher at Tencent Technology, where I applied deep learning models to improve the accuracy of binary code similarity detection and binary source code matching.
Currently, my research focuses on:
a) Understanding LLMs and MLLMs. Identifying important parameters in LLMs and understanding how LLMs work. My research has investigated the underlying mechanisms of factual knowledge, arithmetic, latent multi-hop reasoning, in-context learning, and visual question answering.
b) LLM post-training. Analyzing LLMs and designing methods to enhance their capabilities (knowledge, arithmetic, reasoning, multimodal) during post-training. I developed the back attention module to improve the latent multi-hop reasoning ability of LLMs.
c) Model editing for LLMs. Identifying and editing important parameters to reduce hallucination, unfairness, toxicity, and bias in LLMs. I designed a neuron-level model editing technique that mitigates gender bias without hurting the LLM's existing capabilities.
My email is zepingyu@foxmail.com.
🔥 News
- 2025.02: New preprint: Back Attention: Understanding and Enhancing Multi-Hop Reasoning in Large Language Models. This work investigates the mechanism of latent multi-hop reasoning and proposes the back attention module to enhance the latent multi-hop reasoning ability of LLMs.
- 2025.01: New preprint: Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing. This work investigates the mechanism of gender bias and proposes a neuron-level model editing method that reduces gender bias in LLMs without hurting their existing abilities.
- 2024.12: I've compiled paper lists on SAEs and neurons in LLMs.
- 2024.11: New preprint: Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering. This work explores the mechanism of Llava in visual question answering.
- 2024.09: Our work is accepted by EMNLP 2024 (main): Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis. This work explores the neuron-level information flow of the arithmetic mechanism in LLMs and proposes a model pruning method for arithmetic tasks.
- 2024.09: Our work is accepted by EMNLP 2024 (main): Neuron-Level Knowledge Attribution in Large Language Models. This work introduces how to identify important neurons in LLMs and explores the neuron-level information flow of the factual knowledge mechanism.
- 2024.09: Our work is accepted by EMNLP 2024 (main): How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning. This work explores the mechanism of in-context learning in LLMs.
- 2024.04: I've compiled a paper list for those interested in exploring the mechanisms of LLMs.
📝 Publications
Back Attention: Understanding and Enhancing Multi-Hop Reasoning in Large Language Models
Zeping Yu, Yonatan Belinkov, Sophia Ananiadou [arXiv: 2502.10835]
Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing
Zeping Yu, Sophia Ananiadou [arXiv: 2501.14457]
Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering
Zeping Yu, Sophia Ananiadou [arXiv: 2411.10950]
Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
Zeping Yu, Sophia Ananiadou [EMNLP 2024 (main)]
Neuron-Level Knowledge Attribution in Large Language Models
Zeping Yu, Sophia Ananiadou [EMNLP 2024 (main)]
How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning
Zeping Yu, Sophia Ananiadou [EMNLP 2024 (main)]
CodeCMR: Cross-Modal Retrieval for Function-Level Binary Source Code Matching
Zeping Yu, Wenxin Zheng, Jiaqi Wang, Qiyi Tang, Sen Nie, Shi Wu [NeurIPS 2020]
Order Matters: Semantic-Aware Neural Networks for Binary Code Similarity Detection
Zeping Yu, Rui Cao, Qiyi Tang, Sen Nie, Junzhou Huang, Shi Wu [AAAI 2020]
Adaptive User Modeling with Long and Short-Term Preferences for Personalized Recommendation
Zeping Yu, Jianxun Lian, Ahmad Mahmoody, Gongshen Liu, Xing Xie [IJCAI 2019]
Sliced Recurrent Neural Networks
Zeping Yu, Gongshen Liu [COLING 2018]
📖 Education
- 2023.09 - 2027.02, PhD, Computer Science, University of Manchester.