At Microsoft AI, I work on the internal training infrastructure for LLMs. I currently focus on pretraining and RL numerics, ensuring correctness, stability and efficiency at scale.
Previously, I was a member of the Seed-Infra-Training team at ByteDance, where I built distributed training systems for multimodal and video generation models.
I am interested in building scalable, efficient, correct and fault-tolerant heterogeneous computing systems to facilitate the pursuit of machine intelligence.
I graduated from Shanghai Jiao Tong University with a B.Eng. in Computer Software Engineering. During my time as a Ph.D. student at UC Santa Barbara, I was interested in the following topics:
1) privacy-preserving algorithm design and system implementation with trusted hardware like SGX, differential privacy, or modern cryptographic primitives, especially zero-knowledge proofs.
2) building scalable heterogeneous systems to accelerate zero-knowledge proof generation for extremely large circuits.