Instructor-Led Workshop
Efficient Large Language Model (LLM) Customization

Enterprises perform language-related tasks daily, such as text classification, content generation, sentiment analysis, and customer chat support, and they want to do so cost-effectively. Large language models can automate these tasks, and efficient LLM customization techniques can increase a model’s capabilities and reduce the size of the models required for enterprise applications.

In this course, you'll go beyond prompt engineering LLMs and learn a variety of techniques to efficiently customize pretrained LLMs for your specific use cases—without engaging in the computationally intensive and expensive process of pretraining your own model or fine-tuning a model's internal weights. Using NVIDIA NeMo™ service, you’ll learn various parameter-efficient fine-tuning methods to customize LLM behavior for your organization.

 

Learning Objectives
 

By participating in this workshop, you’ll learn how to:
  • Apply parameter-efficient fine-tuning techniques with limited data to accomplish tasks specific to your use cases
  • Use larger LLMs to generate synthetic data for fine-tuning smaller LLMs to perform a desired task
  • Leverage the NVIDIA NeMo service to customize models like GPT and LLaMA-2 with ease

Datasheet (PDF 92 KB)

Workshop Outline

Introduction
(15 mins)
Parameter-Efficient Fine-Tuning Essentials
(115 mins)

    Investigate PEFT techniques like low-rank adaptation (LoRA) and p-tuning to tailor LLMs for specific tasks using limited data:

  • Grasp the principles of PEFT techniques like LoRA and p-tuning and how they adapt LLM behavior without modifying the base model’s weights.
  • Learn how to acquire and prepare data for use in parameter-efficient fine-tuning.
  • Perform LoRA and p-tuning on a variety of GPT LLMs while quantitatively analyzing fine-tuned model performance.
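The core idea behind LoRA can be sketched in a few lines. This is a minimal illustration with hypothetical shapes and hand-rolled NumPy, not NeMo's API: the frozen weight matrix W is never touched, and only two small low-rank matrices are trained.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank = 8, 8, 2   # rank << d_in keeps the trainable parameter count small
alpha = 4                     # LoRA scaling hyperparameter

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, zero-initialized

def lora_forward(x):
    # Effective weight is W + (alpha / rank) * B @ A; W itself is never modified.
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B zero-initialized, the adapted model starts out identical to the base model.
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters: rank * (d_in + d_out), versus d_in * d_out for full fine-tuning.
print(A.size + B.size, "trainable vs", W.size, "frozen")
```

Because only A and B are updated during training, the same frozen base model can serve many tasks, each with its own tiny pair of adapter matrices.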
Break (45 mins)
PEFT for Reduced Model Sizes
(115 mins)

    Learn a set of techniques for using larger prompt-engineered LLMs to generate synthetic data, which can be used to create smaller parameter-efficient fine-tuned models capable of the same task:

  • Create LLM functionality specifically suited for common synthetic data generation tasks.
  • Perform LoRA using synthetic data from a larger model’s responses to fine-tune a smaller model capable of the same task.
  • Establish a virtuous cycle in which LLMs generate synthetic data used to fine-tune smaller LLMs, which in turn can generate more synthetic data.
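The synthetic-data loop above can be sketched as a simple pipeline. The functions below are hypothetical stubs: `teacher_generate` stands in for a prompt-engineered large LLM (in the workshop, one served by the NeMo service) and `fine_tune_student` stands in for a PEFT run such as LoRA on a smaller model.

```python
# Hypothetical stand-in for a large, prompt-engineered teacher LLM that
# returns labeled synthetic examples for a given task prompt.
def teacher_generate(prompt, n):
    return [{"text": f"sample review {i} for: {prompt}", "label": i % 2}
            for i in range(n)]

# Stand-in for parameter-efficient fine-tuning (e.g. LoRA) of a smaller model;
# here the "model" just records how much data it was trained on.
def fine_tune_student(examples):
    return {"train_size": len(examples)}

# Round 1: the large teacher model produces synthetic training data.
data = teacher_generate("classify product review sentiment", 100)
student = fine_tune_student(data)

# Round 2 (the virtuous cycle): generate additional synthetic data and
# fine-tune again, growing the training set without any human labeling.
more_data = teacher_generate("harder edge-case reviews", 50)
student_v2 = fine_tune_student(data + more_data)
assert student_v2["train_size"] == 150
```

The design point is that human effort goes into prompt engineering the teacher once; everything downstream of that is automated data generation and fine-tuning.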
Break (30 mins)
Engineering With Customized Models
(115 mins)

    Practice engineering application code that combines multiple fine-tuned models:

  • Create multiple fine-tuned small LLMs for sentiment analysis, extractive question answering, text generation, and persona creation.
  • Learn techniques for creating LLMs capable of writing in your own style.
  • Compose multiple small fine-tuned LLMs into meaningful application code.
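Composing several small fine-tuned models into application code can look like the sketch below, which mirrors the email-analysis assessment. Each model here is a hypothetical stub; in practice each would be a separately fine-tuned small LLM served by NeMo.

```python
# Stub for a small LLM fine-tuned for sentiment analysis.
def sentiment_model(email):
    return "negative" if "refund" in email.lower() else "positive"

# Stub for a small LLM fine-tuned for extractive question answering;
# a real model would return a relevant span from the email.
def extractive_qa_model(email, question):
    return email.split(".")[0]

# Stub for a small LLM fine-tuned for text generation in a given tone.
def response_generator(sentiment, issue):
    tone = "We're sorry to hear that" if sentiment == "negative" else "Thanks"
    return f"{tone}. Regarding: {issue} - our team will follow up shortly."

def handle_email(email):
    # Compose the small models: classify, extract the issue, then draft a reply.
    sentiment = sentiment_model(email)
    issue = extractive_qa_model(email, "What is the customer's issue?")
    return response_generator(sentiment, issue)

reply = handle_email("My order arrived broken and I want a refund. Order #12345.")
assert reply.startswith("We're sorry")
```

Each task gets its own small specialized model rather than one large generalist, so the application stays cheap to serve while each component remains easy to retrain independently.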
Assessment and Q&A
(45 mins)

    Use the techniques from this workshop to create an application that uses several small fine-tuned LLMs to perform detailed analysis of customer emails and generate specific automatic responses.

 

Workshop Details

Duration: 8 hours

Price: Contact us for pricing.

Prerequisites:

Tools, libraries, and frameworks: Python, NVIDIA NeMo Service, GPT, LLaMA-2

Assessment type: Skills-based coding assessments evaluate the learner’s ability to efficiently customize and compose pretrained LLMs.

Certificate: Upon successful completion of the assessment, participants will receive an NVIDIA DLI certificate to recognize their subject matter competency and support professional career growth.

Hardware Requirements: Desktop or laptop computer capable of running the latest version of Chrome or Firefox. Each participant will be provided with dedicated access to a fully configured, GPU-accelerated workstation in the cloud.

Languages: English

Upcoming Public Workshops

If your organization is interested in boosting and developing key skills in AI, accelerated data science, or accelerated computing, you can request instructor-led training from the NVIDIA DLI.

Continue Your Learning with These DLI Trainings

Getting Started with Image Segmentation

Modeling Time-Series Data with Recurrent Neural Networks in Keras

Building Transformer-Based Natural Language Processing Applications

Building Intelligent Recommender Systems

Data Parallelism: How to Train Deep Learning Models on Multiple GPUs

Questions?