Living Lab Lecture No. 30: Compressing Large Language Models

Date
Mar 6, 2025
Time
11:00 AM - 12:00 PM
Speaker
Aaron Klein
Affiliation
ScaDS.AI Dresden/Leipzig
Series
ScaDS.AI Lecture Series
Language
English
Main Topic
Computer Science
Host
ScaDS.AI Dresden/Leipzig
Description
Large Language Models (LLMs) mark a new era in Artificial Intelligence. However, their large size poses significant challenges for inference in real-world applications due to substantial GPU memory requirements and high inference latency. In this talk, we discuss techniques to compress pre-trained LLMs, reducing their resource consumption during inference while maintaining their performance. More specifically, we approach the problem from a multi-objective Neural Architecture Search (NAS) perspective to jointly optimize performance and efficiency. By considering the LLM as a super-network consisting of a large but finite number of sub-networks, we can identify a set of Pareto-optimal sub-networks that balance parameter count and validation performance. We empirically demonstrate that using NAS techniques for fine-tuning enhances the prunability of pre-trained LLMs and explore how this impacts real-world applications.
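As a rough illustration of what selecting Pareto-optimal sub-networks can mean, the sketch below filters a list of candidates down to those not dominated in both parameter count and validation loss. It is not the speaker's method: the `SubNetwork` type, the `pareto_front` helper, and all names and numbers are hypothetical, and validation loss stands in for "validation performance" on the assumption that lower is better.

```python
from typing import List, NamedTuple


class SubNetwork(NamedTuple):
    """Hypothetical record for one sub-network of the super-network."""
    name: str
    num_params: int   # parameter count, lower is better
    val_loss: float   # validation loss, lower is better


def pareto_front(candidates: List[SubNetwork]) -> List[SubNetwork]:
    """Return the candidates that no other candidate dominates.

    A candidate is dominated if another one is no worse in both
    objectives and strictly better in at least one. Exact duplicates
    are collapsed to a single entry.
    """
    front = []
    best_loss = float("inf")
    # Visit candidates from smallest to largest. A candidate survives only
    # if it beats the best loss achieved by any smaller-or-equal network.
    for cand in sorted(candidates, key=lambda c: (c.num_params, c.val_loss)):
        if cand.val_loss < best_loss:
            front.append(cand)
            best_loss = cand.val_loss
    return front


# Made-up example values, purely for illustration.
candidates = [
    SubNetwork("full", 1000, 0.50),
    SubNetwork("a", 700, 0.52),
    SubNetwork("b", 700, 0.60),
    SubNetwork("c", 400, 0.58),
    SubNetwork("d", 500, 0.65),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy data, `b` is dominated by `a` (same size, higher loss) and `d` by `c` (larger and worse), so only `c`, `a`, and `full` remain as trade-off points between size and loss.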
Last modified: Feb 17, 2025, 4:08:07 PM

Location

Online. Please follow the link: https://tud.link/i8zf

Organizer

Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI)
Chemnitzer Straße 46b, 2nd floor
01187 Dresden
Phone
+49 351 463-40900
E-Mail
ScaDS.AI
Homepage
https://scads.ai