About me

I am a fourth-year PhD student at Tsinghua University. I am fortunate to be advised by Prof. Juanzi Li and also work closely with Prof. Zhiyuan Liu. Previously, I received my B.E. in Computer Science and Technology from Tsinghua University in 2020. In 2019, I visited Mila and worked with Prof. Jian Tang. You can find my CV here.

My research interests lie in natural language processing and knowledge engineering. The research directions I am fascinated by and working on are:

  1. Understanding Language Models (Mechanistic Interpretability, Probing, etc.)
  2. Event Understanding (Event Extraction, Event Relation Extraction, etc.)

News

  • [Jun. 2023] Check out KoLA, our new evolving world knowledge benchmark for LLMs.
  • [Oct. 2022] Released OmniEvent, an event extraction toolkit. Feel free to try it!

Highlighted Publications

Please refer to publications or my Google Scholar profile for the full list.

  • Xiaozhi Wang*, Kaiyue Wen*, Zhengyan Zhang, Lei Hou, Zhiyuan Liu, Juanzi Li. Finding Skill Neurons in Pre-trained Transformer-based Language Models. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). [pdf] [code]
  • Xiaozhi Wang*, Yulin Chen*, Ning Ding, Hao Peng, Zimu Wang, Yankai Lin, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu, Peng Li, Jie Zhou. MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subevent Relation Extraction. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). [pdf] [code] [CodaLab]
  • Hao Peng*, Xiaozhi Wang*, Shengding Hu, Hailong Jin, Lei Hou, Juanzi Li, Zhiyuan Liu, Qun Liu. COPEN: Probing Conceptual Knowledge in Pre-trained Language Models. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). [pdf] [code] [CodaLab]
  • Xiaozhi Wang, Tianyu Gao, Zhaocheng Zhu, Zhengyan Zhang, Zhiyuan Liu, Juanzi Li, Jian Tang. KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation. Transactions of the Association for Computational Linguistics (TACL), 2021. [pdf] [code] [dataset]
  • Xiaozhi Wang, Ziqi Wang, Xu Han, Wangyi Jiang, Rong Han, Zhiyuan Liu, Juanzi Li, Peng Li, Yankai Lin, Jie Zhou. MAVEN: A Massive General Domain Event Detection Dataset. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). [pdf] [code] [CodaLab] [leaderboard]

Professional Services