A Survey of Data Attribution: Methods, Applications, and Evaluation in the Era of Generative AI
Junwei Deng*, Yuzheng Hu*, Pingbang Hu*, Ting-Wei Li*, Shixuan Liu*, ..., Shiyuan Zhang, ..., Jiaqi W. Ma. Preprint, 2025.
[Paper]
Exploring Training Data Attribution under Limited Access Constraints
Shiyuan Zhang*, Junwei Deng*, Juhan Bae, Jiaqi W. Ma. Preprint, 2025.
[Paper]
Taming Hyperparameter Sensitivity in Data Attribution: Practical Selection Without Costly Retraining
Weiyi Wang, Junwei Deng, Yuzheng Hu, Shiyuan Zhang, Xirui Jiang, Runting Zhang, Han Zhao, Jiaqi W. Ma. NeurIPS, 2025.
[Paper]
[Code]
TimeCausality: Evaluating the Causal Ability in Time Dimension for Vision Language Models
Zeqing Wang*, Shiyuan Zhang*, Chengpei Tang, Keze Wang. CVPR 2026 DataMFM Workshop.
[Paper]
[Code]
dattri: A Library for Efficient Data Attribution
Junwei Deng*, Ting-Wei Li*, Shiyuan Zhang, Shixuan Liu, Yijun Pan, Hao Huang, Xinhe Wang, Pingbang Hu, Xingjian Zhang, Jiaqi W. Ma. NeurIPS, 2024. Spotlight Paper.
[Paper]
[Code]
[Blog]
Computational Copyright: Towards A Royalty Model for Music Generative AI
Junwei Deng, Xirui Jiang, Shiyuan Zhang, Shichang Zhang, Himabindu Lakkaraju, Ruijiang Gao, Chris Donahue, Jiaqi W. Ma. ICML 2024 GenLaw Workshop; ICLR 2024 DPFM Workshop, Best Paper Award.
[Paper]
Nuanced Multi-class Detection of Machine-Generated Scientific Text
Shiyuan Zhang, Yubin Ge, Xiaofeng Liu. PACLIC, 2024. (Oral Presentation)
[Paper]
[Code]