Jingren Zhou: “Experience and Challenges in Building Large Language Models at Alibaba”

Jingren Zhou shares his experiences during the development of an LLM in industry.

Georg Gottlob (left) and Jingren Zhou (right)

On March 14, 2024, the seminar with Jingren Zhou took place at TU Wien.


In this talk, Jingren Zhou shared his experience and lessons learned during the development of Tongyi Qianwen, also known as Qwen, a state-of-the-art large language model at Alibaba Cloud. He first outlined the key steps taken to construct a high-performing model with the ability to generate creative text, comprehend intricate instructions, tackle mathematical problems, and more. He then described a variety of systems challenges in building large language models and presented Alibaba Cloud's innovative designs in the areas of distributed storage, high-performance networking, resource scheduling, and execution frameworks. Such techniques significantly enhance the efficiency of handling complex AI workloads in a distributed environment.

About the Speaker

Jingren Zhou currently holds the position of Chief Technology Officer at Alibaba Cloud, where he plays a pivotal role in driving technology innovation and product development. His responsibilities also include leading the development of AI foundational models and their applications in various business scenarios at Alibaba Cloud. Before taking on this role, he led the development of advanced techniques for personalized search, product recommendation, and advertising at Alibaba’s e-commerce platform and Alipay’s online payment platform. Prior to his time at Alibaba, he was a veteran at Microsoft, where he led Big Data and AI research and development. His research interests span cloud computing, databases, and large-scale machine learning systems. He holds a B.S. in Computer Science from the University of Science and Technology of China, and a Ph.D. in Computer Science from Columbia University. He is a Fellow of IEEE.


This talk was supported by ZIF (Zentrum für Informatikforschung).