From the course: Introduction to Large Language Models


How are large language models trained? Pre-training

- [Instructor] We've seen an example of a large language model at work, and the results are pretty impressive. But how do you go about training a large language model? That's what we're going to look at in this video. Initially, the language model has random weights, and at this point the model has no knowledge of language. If you were to prompt it, it would just return gibberish. But if you train the model by passing it a large corpus of data, it adjusts these weights as part of the training process. This pre-training stage is very resource heavy, so you need lots of data, and this includes a variety of different types of data, like books, articles, and websites. Let me give you an example. LLaMA is a family of language models released in 2023 by Meta, and this is the data mixture Meta used for pre-training. Common Crawl and C4 are web scrapes of the internet that have been cleaned and filtered.…
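The idea described above — random weights produce gibberish, and training on a corpus adjusts those weights to predict the next token — can be sketched with a toy example. This is a hypothetical, minimal illustration (a character-level bigram model trained by gradient descent on next-character prediction), not how LLaMA or any real LLM is implemented; real pre-training uses transformer networks, tokenizers, and vastly larger corpora.

```python
import math
import random

# Hypothetical toy "corpus" standing in for books, articles, and websites.
corpus = "the cat sat on the mat. the cat ate the rat."
vocab = sorted(set(corpus))
idx = {ch: i for i, ch in enumerate(vocab)}
V = len(vocab)

random.seed(0)
# Random initial weights: logits[i][j] scores how likely character j
# is to follow character i. At this point the "model" knows nothing.
logits = [[random.gauss(0, 0.1) for _ in range(V)] for _ in range(V)]

def softmax(row):
    # Convert a row of scores into a probability distribution.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def avg_loss():
    # Average cross-entropy of next-character prediction over the corpus.
    pairs = list(zip(corpus, corpus[1:]))
    total = 0.0
    for a, b in pairs:
        total += -math.log(softmax(logits[idx[a]])[idx[b]])
    return total / len(pairs)

loss_before = avg_loss()

# "Pre-training": repeatedly nudge the weights so that character pairs
# actually observed in the corpus become more likely (SGD on cross-entropy).
lr = 0.5
for _ in range(200):
    for a, b in zip(corpus, corpus[1:]):
        probs = softmax(logits[idx[a]])
        for j in range(V):
            grad = probs[j] - (1.0 if j == idx[b] else 0.0)
            logits[idx[a]][j] -= lr * grad

loss_after = avg_loss()
print(loss_after < loss_before)  # training reduced the prediction loss
```

The key point the sketch demonstrates is the one from the video: the weights start random, and exposure to data is what moves them toward something that models the language in the corpus.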
