About me

I am a 3rd-year PhD candidate in the Department of Computer Science and Technology at Tsinghua University. I am supervised by Prof. Juanzi Li and also work closely with Prof. Zhiyuan Liu. I visited Mila and did research with Prof. Jian Tang. You can find my CV here.

My research interests lie in deep learning methods for natural language processing and knowledge graphs. My research goal is to bridge machine learning models and symbolic human knowledge.

News

  • [Oct. 2022] Got three papers accepted at EMNLP 2022. See you online:)
  • [Oct. 2022] Released OmniEvent, an event extraction toolkit. You are welcome to try it!
  • [May 2021] Got one paper accepted at ACL-IJCNLP 2021. See you online!

Professional Services

  • Program Committee Member/Reviewer (Conference): AAAI/IJCAI/COLING 2020, AAAI/ACL/EMNLP 2021, AAAI/COLING/SIGIR/CCKS/EMNLP 2022, AAAI 2023, ACL Rolling Review.
  • Reviewer (Journal): Neurocomputing, Complex & Intelligent Systems, AI Open.

Preprint

  • Yujia Qin*, Xiaozhi Wang*, Yusheng Su, Yankai Lin, Ning Ding, Zhiyuan Liu, Juanzi Li, Lei Hou, Peng Li, Maosong Sun, Jie Zhou. Exploring Universal Intrinsic Task Subspace via Prompt Tuning. [arxiv]
  • Yuan Yao, Qingxiu Dong, Jian Guan, Boxi Cao, Zhengyan Zhang, Chaojun Xiao, Xiaozhi Wang, Fanchao Qi, Junwei Bao, Jinran Nie, Zheni Zeng, Yuxian Gu, Kun Zhou, Xuancheng Huang, Wenhao Li, Shuhuai Ren, Jinliang Lu, Chengqiang Xu, Huadong Wang, Guoyang Zeng, Zile Zhou, Jiajun Zhang, Juanzi Li, Minlie Huang, Rui Yan, Xiaodong He, Xiaojun Wan, Xin Zhao, Xu Sun, Yang Liu, Zhiyuan Liu, Xianpei Han, Erhong Yang, Zhifang Sui, Maosong Sun. CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark. [arxiv]
  • Chenglei Si*, Zhengyan Zhang*, Yingfa Chen*, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun. READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises. [arxiv]
  • Ming Li*, Yusheng Su*, Hsiu-Yuan Huang, Jiali Cheng, Xin Hu, Xinmiao Zhang, Huadong Wang, Yujia Qin, Xiaozhi Wang, Zhiyuan Liu, Dan Zhang. Human Emotion Knowledge Representation Emerges in Large Language Model and Supports Discrete Emotion Inference. [arxiv]

Publications

* indicates equal contribution; see here for details.

2023

  • Chenglei Si*, Zhengyan Zhang*, Yingfa Chen*, Fanchao Qi, Xiaozhi Wang, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun. Sub-Character Tokenization for Chinese Pretrained Language Models. Transactions of the Association for Computational Linguistics (TACL), 2023. [pdf] [code]
  • Ning Ding*, Yujia Qin*, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, Maosong Sun. Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models. Nature Machine Intelligence, 2023. [pdf] [code]

2022

  • Xiaozhi Wang*, Yulin Chen*, Ning Ding, Hao Peng, Zimu Wang, Yankai Lin, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu, Peng Li, Jie Zhou. MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subevent Relation Extraction. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). [pdf] [code] [CodaLab] (oral)
  • Xiaozhi Wang*, Kaiyue Wen*, Zhengyan Zhang, Lei Hou, Zhiyuan Liu, Juanzi Li. Finding Skill Neurons in Pre-trained Transformer-based Language Models. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). [pdf] [code]
  • Hao Peng*, Xiaozhi Wang*, Shengding Hu, Hailong Jin, Lei Hou, Juanzi Li, Zhiyuan Liu, Qun Liu. COPEN: Probing Conceptual Knowledge in Pre-trained Language Models. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). [pdf] [code] [CodaLab]
  • Kaijie Shi, Xiaozhi Wang, Jifan Yu, Lei Hou, Juanzi Li, Jingtong Wu, Dingyu Yong, Jinghui Xiao, Qun Liu. CStory: A Chinese Large-scale News Storyline Dataset. The 31st ACM International Conference on Information and Knowledge Management (CIKM 2022). [pdf] [code & data]
  • Yusheng Su*, Xiaozhi Wang*, Yujia Qin, Chi-Min Chan, Yankai Lin, Huadong Wang, Kaiyue Wen, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou. On Transferability of Prompt Tuning for Natural Language Understanding. The 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022). [pdf] [code]
  • Feng Yao, Chaojun Xiao, Xiaozhi Wang, Zhiyuan Liu, Lei Hou, Cunchao Tu, Juanzi Li, Yun Liu, Weixing Shen, Maosong Sun. LEVEN: A Large-Scale Chinese Legal Event Detection Dataset. Findings of ACL 2022. [pdf] [code]

2021

  • Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun. CPM: A Large-scale Generative Chinese Pre-trained Language Model. AI Open. [pdf] [code] [homepage]
  • Ziqi Wang*, Xiaozhi Wang*, Xu Han, Yankai Lin, Lei Hou, Zhiyuan Liu, Peng Li, Juanzi Li, Jie Zhou. CLEVE: Contrastive Pre-training for Event Extraction. The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021). [pdf] [code]
  • Xiaozhi Wang, Tianyu Gao, Zhaocheng Zhu, Zhengyan Zhang, Zhiyuan Liu, Juanzi Li, Jian Tang. KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation. Transactions of the Association for Computational Linguistics (TACL), 2021. [pdf] [code] [dataset]
  • Yuan Yao, Haoxi Zhong, Zhengyan Zhang, Xu Han, Xiaozhi Wang, Chaojun Xiao, Guoyang Zeng, Zhiyuan Liu, Maosong Sun. Adversarial Language Games for Advanced Natural Language Intelligence. AAAI Conference on Artificial Intelligence (AAAI 2021). [arxiv]

2020

  • Xiaozhi Wang*, Shengyu Jia*, Xu Han, Zhiyuan Liu, Juanzi Li, Peng Li, Jie Zhou. Neural Gibbs Sampling for Joint Event Argument Extraction. The 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020). [pdf] [code]
  • Xiaozhi Wang, Ziqi Wang, Xu Han, Wangyi Jiang, Rong Han, Zhiyuan Liu, Juanzi Li, Peng Li, Yankai Lin, Jie Zhou. MAVEN: A Massive General Domain Event Detection Dataset. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). [pdf] [code] [CodaLab] [leaderboard]
  • Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun. Train No Evil: Selective Masking for Task-guided Pre-training. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). [pdf] [code]

2019

  • Xiaozhi Wang*, Ziqi Wang*, Xu Han, Zhiyuan Liu, Juanzi Li, Peng Li, Maosong Sun, Jie Zhou, Xiang Ren. HMEAE: Hierarchical Modular Event Argument Extraction. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2019). [pdf] [code] (oral) (short)

  • Xiaozhi Wang*, Xu Han*, Zhiyuan Liu, Maosong Sun, Peng Li. Adversarial Training for Weakly Supervised Event Detection. The 2019 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT 2019). [pdf] [code] (oral)

2018

  • Xiaozhi Wang*, Xu Han*, Yankai Lin, Zhiyuan Liu, Maosong Sun. Adversarial Multi-lingual Neural Relation Extraction. The 27th International Conference on Computational Linguistics (COLING 2018). [pdf] [code] (oral)