I am a 2nd-year PhD student in the Department of Computer Science and Technology at Tsinghua University. I am supervised by Prof. Juanzi Li. I also work closely with Prof. Zhiyuan Liu. At the summer of 2019, I visited Mila and did research with Prof. Jian Tang. You can find my CV here.
My research interests lie in deep learning methods on Natural Language Processing and Knowledge Graph. My research goal is to bridge machine learning models and symbolic human knowledge.
- [May. 2021] Got one paper accepted at ACL-IJCNLP 2021. See you online!
- [Aug. 2020] Got two papers accepted at EMNLP 2020. See you online:)
- [Sep. 2019] Released a pre-trained language model reading list with Zhengyan Zhang.
- Program Committee Member: AAAI/IJCAI/COLING 2020, AAAI/ACL/EMNLP 2021, AAAI 2022
- Review Assistant: COLING/EMNLP 2018, IJCAI/SIGIR/ACL 2019
- Chenglei Si, Zhengyan Zhang, Yingfa Chen, Fanchao Qi, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun. SHUOWEN-JIEZI: Linguistically Informed Tokenizers For Chinese Language Model Pretraining. [arxiv]
* indicates equal contribution, and see here for details.
- Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun. CPM: A Large-scale Generative Chinese Pre-trained Language Model. AI Open. [pdf] [code] [homepage]
- Ziqi Wang*, Xiaozhi Wang*, Xu Han, Yankai Lin, Lei Hou, Zhiyuan Liu, Peng Li, Juanzi Li and Jie Zhou. CLEVE: Contrastive Pre-training for Event Extraction. The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021). [pdf] [code]
- Xiaozhi Wang, Tianyu Gao, Zhaocheng Zhu, Zhengyan Zhang, Zhiyuan Liu, Juanzi Li, Jian Tang. KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation. Transactions of the Association for Computational Linguistics (TACL), 2021. [pdf] [code] [dataset]
- Yuan Yao, Haoxi Zhong, Zhengyan Zhang, Xu Han, Xiaozhi Wang, Chaojun Xiao, Guoyang Zeng, Zhiyuan Liu, Maosong Sun. Adversarial Language Games for Advanced Natural Language Intelligence. AAAI Conference on Artifical Intelligence (AAAI 2021). [arxiv]
- Xiaozhi Wang*, Shengyu Jia*, Xu Han, Zhiyuan Liu, Juanzi Li, Peng Li, Jie Zhou. Neural Gibbs Sampling for Joint Event Argument Extraction. The 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020). [pdf] [code]
- Xiaozhi Wang, Ziqi Wang, Xu Han, Wangyi Jiang, Rong Han, Zhiyuan Liu, Juanzi Li, Peng Li, Yankai Lin, Jie Zhou. MAVEN: A Massive General Domain Event Detection Dataset. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). [pdf] [code] [CodaLab] [leaderboard]
- Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun. Train No Evil: Selective Masking for Task-guided Pre-training. [pdf] [code]
Xiaozhi Wang*, Ziqi Wang*, Xu Han, Zhiyuan Liu, Juanzi Li, Peng Li, Maosong Sun, Jie Zhou, Xiang Ren. HMEAE: Hierarchical Modular Event Argument Extraction. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2019). [pdf] [code] (oral) (short)
Xiaozhi Wang*, Xu Han*, Zhiyuan Liu, Maosong Sun, Peng Li. Adversarial Training for Weakly Supervised Event Detection. The 2019 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT 2019). [pdf] [code] (oral)