Living Lab Lecture No. 30: Compressing Large Language Models
- Date
- Mar 6, 2025
- Time
- 11:00 AM - 12:00 PM
- Speaker
- Aaron Klein
- Affiliation
- ScaDS.AI Dresden/Leipzig
- Series
- ScaDS.AI Lecture Series
- Language
- en
- Main Topic
- Computer Science
- Host
- ScaDS.AI Dresden/Leipzig
- Description
- Large Language Models (LLMs) mark a new era in Artificial Intelligence. However, their large size poses significant challenges for inference in real-world applications due to substantial GPU memory requirements and high inference latency. In this talk, we discuss techniques to compress pre-trained LLMs, reducing their resource consumption during inference while maintaining their performance. More specifically, we approach the problem from a multi-objective Neural Architecture Search (NAS) perspective to jointly optimize performance and efficiency. By considering the LLM as a super-network consisting of a large but finite number of sub-networks, we can identify a set of Pareto-optimal sub-networks that balance parameter count and validation performance. We empirically demonstrate that using NAS techniques for fine-tuning enhances the prunability of pre-trained LLMs and explore how this impacts real-world applications.
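The abstract sketches the central step of the approach: each sub-network of the pre-trained super-network is scored on two objectives (parameter count and validation performance), and only the Pareto-optimal configurations are kept. The snippet below is a minimal, illustrative sketch of that multi-objective selection in Python; the SubNetwork search space and the evaluate() stub are hypothetical placeholders, not the speaker's actual implementation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SubNetwork:
    """A hypothetical sub-network of the super-network, defined by how much
    of the pre-trained model it keeps (illustrative assumption)."""
    n_layers: int   # transformer layers kept
    n_heads: int    # attention heads kept per layer


def evaluate(net: SubNetwork) -> tuple[float, float]:
    """Return (parameter-count proxy, validation-loss proxy) for a sub-network.
    A real evaluation would slice the pre-trained LLM and score it on a
    validation set; this synthetic stand-in keeps the sketch self-contained."""
    params = net.n_layers * net.n_heads * 1e6
    val_loss = 2.0 + 10.0 / (net.n_layers * net.n_heads) + random.uniform(0.0, 0.05)
    return params, val_loss


def pareto_front(scored):
    """Keep every configuration that is not dominated in (params, val_loss);
    lower is better for both objectives."""
    front = []
    for i, (cfg_i, obj_i) in enumerate(scored):
        dominated = any(
            obj_j[0] <= obj_i[0] and obj_j[1] <= obj_i[1] and obj_j != obj_i
            for j, (_, obj_j) in enumerate(scored)
            if j != i
        )
        if not dominated:
            front.append((cfg_i, obj_i))
    return front


# Enumerate a small grid of candidate sub-networks of the super-network.
candidates = [SubNetwork(l, h) for l in (6, 12, 24) for h in (4, 8, 16)]
scored = [(c, evaluate(c)) for c in candidates]

for cfg, (params, loss) in sorted(pareto_front(scored), key=lambda x: x[1][0]):
    print(f"{cfg}: ~{params / 1e6:.0f}M params, val loss {loss:.3f}")
```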
Location
Online; please follow the link (https://tud.link/i8zf).
Organizer
Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Chemnitzer Straße 46b, 2. OG, 01187 Dresden
- Phone
- +49 351 463-40900
- ScaDS.AI
- Homepage
- https://scads.ai