Shiyuan (Sean) Zhang

Hi! I am Shiyuan, currently an AI Engineer based in Seattle. I received my bachelor's and master's degrees in Statistics and Computer Science from the University of Illinois Urbana-Champaign, where I worked closely with Prof. Jiaqi Ma. My research interests lie in data-centric machine learning (e.g., data attribution), trustworthy NLP, machine unlearning, and vision-language models.

Outside of research, I enjoy exploring trading, where I find the unpredictability of markets both humbling and exciting. During late-night coding and research sessions, I am often accompanied by Lulu, a quiet white ragdoll who has been a witness to many of my projects.

📧 shiyuanzhang.ml AT gmail DOT edu

Google Scholar  /  GitHub  /  LinkedIn

News
  • [2025-09] 🔥 Check out our new survey paper on data attribution: A Survey of Data Attribution: Methods, Applications, and Evaluation in the Era of Generative AI.
  • [2025-09] One paper accepted to NeurIPS 2025.
  • [2025-09] ArXived a new preprint “Exploring Training Data Attribution under Limited Access Constraints”. [ArXiv]
  • [2025-05] ArXived a new preprint about hyperparameter selection for TDA.
  • [2025-05] Completed my Master’s degree in CS @ UIUC 🎉.
  • [2025-05] Released a new preprint: TimeCausality, a benchmark for temporal reasoning in VLMs.
  • [2024-10] One paper accepted to PACLIC 2024 (Oral Presentation).
  • [2024-09] Our paper dattri was accepted as a Spotlight at NeurIPS 2024!
  • [2024-07] One paper accepted to ICML 2024 GenLaw Workshop.
  • [2023-05] Graduated from UIUC with a B.S. in Statistics (Summa Cum Laude).
Publications
  • A Survey of Data Attribution: Methods, Applications, and Evaluation in the Era of Generative AI
    Junwei Deng*, Yuzheng Hu*, Pingbang Hu*, Ting-Wei Li*, Shixuan Liu*, Jiachen T. Wang, Dan Ley, Qirun Dai, Benhao Huang, Jin Huang, Cathy Jiao, Hoang Anh Just, Yijun Pan, Jingyan Shen, Yiwen Tu, Weiyi Wang, Xinhe Wang, Shichang Zhang, Shiyuan Zhang, Ruoxi Jia, Himabindu Lakkaraju, Hao Peng, Weijing Tang, Chenyan Xiong, Jieyu Zhao, Hanghang Tong, Han Zhao, Jiaqi W. Ma.
    Preprint, 2025.
    [Paper]

  • Exploring Training Data Attribution under Limited Access Constraints
    Shiyuan Zhang*, Junwei Deng*, Juhan Bae, Jiaqi W. Ma.
    Preprint, 2025.
    [Paper]

  • Taming Hyperparameter Sensitivity in Data Attribution: Practical Selection Without Costly Retraining
    Weiyi Wang, Junwei Deng, Yuzheng Hu, Shiyuan Zhang, Xirui Jiang, Runting Zhang, Han Zhao, Jiaqi W. Ma.
    NeurIPS, 2025.
    [Paper] [Code]

  • TimeCausality: Evaluating the Causal Ability in Time Dimension for Vision Language Models
    Zeqing Wang*, Shiyuan Zhang*, Chengpei Tang, Keze Wang.
    Preprint, 2025.
    [Paper] [Code]

  • dattri: A Library for Efficient Data Attribution
    Junwei Deng*, Ting-Wei Li*, Shiyuan Zhang, Shixuan Liu, Yijun Pan, Hao Huang, Xinhe Wang, Pingbang Hu, Xingjian Zhang, Jiaqi Ma.
    NeurIPS, 2024. Spotlight Paper.
    [Paper] [Code] [Blog]

  • Computational Copyright: Towards A Royalty Model for Music Generative AI
    Junwei Deng, Shiyuan Zhang, Jiaqi Ma.
    ICML 2024 GenLaw Workshop; ICLR 2024 DPFM Workshop, Best Paper Award.
    [Paper]

  • Nuanced Multi-class Detection of Machine-Generated Scientific Text
    Shiyuan Zhang, Yubin Ge, Xiaofeng Liu.
    PACLIC, 2024. (Oral Presentation)
    [Paper] [Code]
Industrial Experience
Armada.AI
Bellevue, WA, USA
2025.08 - Present

AI Engineer
Mentor: Sina Ehsani
Education
University of Illinois Urbana-Champaign
Champaign, IL, USA
2023.08 - 2025.05

Master of Computer Science
GPA: 4.00 / 4.00
University of Illinois Urbana-Champaign
Champaign, IL, USA
2019.08 - 2023.05

B.S. in Statistics, Minor in Computer Science
GPA: 3.96 / 4.00, Magna Cum Laude
Teaching Experience
  • Jan 2025 – May 2025 (Spring), Teaching Assistant, CEE 202, CEE 330 – SIIP Program, UIUC
  • Sep 2024 – Dec 2024 (Fall), Teaching Assistant, CEE 202 – Engineering Risk & Uncertainty, UIUC
  • May 2024 – Aug 2024 (Summer), Teaching Assistant, CEE 340 – SIIP Program, UIUC
  • Jan 2024 – May 2024 (Spring), Teaching Assistant, CEE 340 – SIIP Program, UIUC
  • May 2022 – Sep 2022 (Summer), Course Assistant, CS 411 – Database Systems, UIUC
Service
  • Conference Reviewer for: ICLR 2026, ICML 2025, ARR 2025.05 (EMNLP)
  • Workshop Reviewer for: Fifth Workshop on Scholarly Document Processing @ ACL 2025, Fourth Workshop on Scholarly Document Processing @ ACL 2024