【SDS Topical Seminar Series】Towards End-to-End Optimization of LLMs and Their Applications
Dear all,
You are cordially invited to the School of Data Science Topical Seminar on Towards End-to-End Optimization of LLMs and Their Applications. Detailed information is as follows:
SDS Topical Seminar Series |
Topic |
Towards End-to-End Optimization of LLMs and Their Applications |
Speaker |
Hong XU, Associate Professor, Department of Computer Science and Engineering at the Chinese University of Hong Kong |
Host |
Minchen YU, Assistant Professor, School of Data Science, CUHK-Shenzhen |
Date |
Friday, 15 November 2024 |
Time |
11:00 AM - 12:00 PM, Beijing Time |
Format |
Onsite |
Venue |
Room 111, Zhi Xin Building |
Zoom Link |
https://cuhk-edu-cn.zoom.us/j/93671212290?pwd=KRNcWBnKdFyOMbrBWFblsVIQEC8a8F.1 (Meeting ID: 936 7121 2290; Password: 113579) |
Language |
English |
Abstract |
A plethora of LLM-based applications have emerged following the explosive development of generative AI, and serving them efficiently is paramount. This talk introduces our attempts to optimize LLM-based applications from an end-to-end perspective. For applications, we build Teola to orchestrate their workflows holistically. Instead of using a simple chain of coarse modules (indexing, search, etc.), Teola represents a workflow as a fine-grained, primitive-level data graph, exposing a larger design space with more opportunities for end-to-end optimizations such as parallelization, pipelining, and application-aware execution scheduling. Zooming into the LLMs themselves, we again take a holistic approach and build LLEGO, which serves models using fine-grained blocks that can be shared across different LLMs instead of treating each model as an individual silo. Both systems demonstrate the potential of fine-grained representations, which open up a much larger optimization space for end-to-end metrics and bring many exciting new opportunities. |
|
Biography |
Hong Xu is an Associate Professor in the Department of Computer Science and Engineering, The Chinese University of Hong Kong. His research area is computer networking and systems, particularly big data systems and data center networks. From 2013 to 2020 he was with City University of Hong Kong. He received his B.Eng. from The Chinese University of Hong Kong in 2007, and his M.A.Sc. and Ph.D. from the University of Toronto in 2009 and 2013, respectively. His work has received best paper awards from ACM SIGCOMM 2022, IEEE ICNP 2023 and 2015, among others. He was the recipient of an Early Career Scheme Grant from the Hong Kong Research Grants Council in 2014. He is a senior member of IEEE and ACM. |