Me
Tsai-Shien Chen
Ph.D. Student at UC Merced
Captured at the beautiful Santa Monica Beach, where I spent the summers of 2023 and 2024. Sending prayers to everyone impacted by the LA wildfires. 🙏

Biography

Welcome! I am a third-year Ph.D. student at the University of California, Merced, advised by the amazing Ming-Hsuan Yang. I am also a research intern at Snap, where I am privileged to work with Aliaksandr Siarohin, Sergey Tulyakov, Jun-Yan Zhu, and Kfir Aberman. My research focuses on building advanced video generation models with groundbreaking applications. Previously, I received my M.S. and B.S. from National Taiwan University. If you would like to learn more about me, here is my [CV] (updated in Jan 2025), or reach out to me at tsaishienchen [at] gmail.com!

I am honored to receive the Graduate Student Opportunity Program Fellowship.

May 2023 - Now
Research Intern @ Snap
Aug. 2022 - Now
Ph.D. Student @ UC Merced
Sep. 2019 - March 2022
Master's Student @ NTU

Selected Publications

Check the full publication list in my [CV]
Multi-subject Open-set Personalization in Video Generation
[ website ] [ arXiv ]
We introduce Video Alchemist, a video model with built-in multi-subject, open-set personalization capabilities for both foreground objects and background, eliminating the need for time-consuming test-time optimization.
arXiv preprint, 2025
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
[ website ] [ arXiv ] [ code ] [ video ] [ slides ] [ poster ]
We introduce Panda-70M, a large-scale video dataset with high-quality automatic caption annotations.
CVPR, 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
[ website ] [ arXiv ] [ video ]
We introduce Snap Video, a transformer-based text-to-video model, which allows us to efficiently train a T2V model with billions of parameters for the first time.
CVPR, 2024 [Highlight, acceptance rate: 2.8%]
Motion-Conditioned Diffusion Model for Controllable Video Synthesis
[ website ] [ arXiv ]
We introduce MCDiff, a conditional diffusion model that generates a video from a starting image frame and a set of strokes.
arXiv preprint, 2023
Incremental False Negative Detection for Contrastive Learning
[ OpenReview ] [ arXiv ] [ slides ] [ poster ]
We propose IFND, which incrementally detects more reliable false negatives and explicitly removes them as contrastive learning progresses and the embedding space becomes more semantically structured.
ICLR, 2022
Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network
[ website ] [ arXiv ] [ code ] [ video ] [ slides ]
We propose SPAN to predict the spatial attention map for each vehicle view given only image-level labels for training, and introduce a distance metric that emphasizes the differences in co-occurring vehicle views.
ECCV, 2020 [Oral, acceptance rate: 2.1%]
Viewpoint-Aware Channel-Wise Attentive Network for Vehicle Re-Identification
[ arXiv ] [ video ] [ slides ]
We propose VCAM, which enables our framework to channel-wisely reweigh the importance of each feature map according to the viewpoint of the input vehicle image. In addition, we explore the interpretability of how VCAM actually improves vehicle re-identification performance.
CVPR Workshops, 2020
