LoRA: Reducing Trainable Parameters

Introduction

We started this course by discussing the basics of fine-tuning and then fine-tuned our own LLM using OpenAI's hosted fine-tuning service. However, OpenAI's hosted platform is a black box and lacks flexibility. For most serious fine-tuning use cases, we'll want precise control over which models we use and how they're trained. If you're new to the world of LLMs, this might seem like a daunting task. But, as I hope you'll realize by the end of the next few lessons, it's not as off-limits as it seems.

Before we can dive into our custom fine-tuning implementations, there are two key concepts that we need to learn about: low-rank adaptation (LoRA) and quantization. Both of these concepts revolve around one central theme: making LLMs more accessible by reducing the resources required to fine-tune and run them.
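As a quick preview of why LoRA matters for accessibility, here's a rough back-of-the-envelope calculation. It isn't from this course's code, and the layer shape and rank below are illustrative assumptions: instead of updating a full weight matrix directly, LoRA trains two small low-rank factors, which shrinks the trainable-parameter count dramatically.

```python
# A minimal sketch of LoRA's parameter savings for a single weight matrix.
# The dimensions (4096 x 4096) and rank (r = 8) are hypothetical choices,
# picked only to make the arithmetic concrete.

d, k = 4096, 4096   # shape of one weight matrix W
r = 8               # LoRA rank (assumed for illustration)

full = d * k              # parameters updated by full fine-tuning of W
lora = d * r + r * k      # parameters in the low-rank factors B (d x r) and A (r x k)

print(f"full fine-tuning: {full:,} trainable parameters")  # 16,777,216
print(f"LoRA (r={r}):     {lora:,} trainable parameters")  # 65,536
print(f"reduction:        {full / lora:.0f}x")             # 256x
```

We'll unpack where those two factors come from, and why such a small rank can work at all, in the rest of this lesson.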

LoRA

In this lesson, we'll start by understanding the problem of LLM accessibility and then we'll dive deep into the math and insights behind LoRA. In the next lesson, we'll learn all about quantization. Let's get started!