LoRA 101
Start with the mental model, then learn why low-rank adapters work, how LoRA training differs from full fine-tuning, and when QLoRA or full fine-tuning makes more sense.
What LoRA Is
Start with the core LoRA mental model: a frozen base model plus a small trainable adapter.
The Low-Rank Trick
Understand how two small matrices can represent a useful model update and why rank controls adapter capacity.
How LoRA Training Works
Follow the training loop and see why only adapter weights update while the frozen base model still runs.
Using LoRA In Practice
Learn how adapters are attached to target modules and how they are saved, loaded, swapped, and merged.
QLoRA And Tradeoffs
Learn how quantization changes the memory picture and when LoRA is a strong fit versus the wrong tool.
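As a quick preview of the low-rank idea the lessons cover, here is a minimal sketch in plain Python. It assumes a single weight matrix of shape d_out x d_in and an adapter made of two factors, B (d_out x rank) and A (rank x d_in), whose product is the update. The function names, the 4096 x 4096 layer size, and the rank of 8 are illustrative assumptions, not values taken from this guide.

```python
def full_update_params(d_out, d_in):
    """Trainable values if the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_update_params(d_out, d_in, rank):
    """Trainable values for factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
        for row in X
    ]


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8
full = full_update_params(d_out, d_in)
lora = lora_update_params(d_out, d_in, rank)
print(full, lora, lora / full)

# A tiny rank-1 update: B is 2 x 1, A is 1 x 3, so B @ A is 2 x 3.
B = [[1], [2]]
A = [[3, 4, 5]]
delta = matmul(B, A)
print(delta)
```

Under these assumptions the adapter trains 65,536 values where a full update of the same matrix would train 16,777,216. Raising the rank increases adapter capacity and the parameter count in direct proportion.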
Each lesson ends with a quick usefulness check. I show only the count of "useful" votes publicly; written notes stay private and help shape future revisions.
LoRA: Low-Rank Adaptation of Large Language Models
The original LoRA paper from Hu et al.
QLoRA: Efficient Finetuning of Quantized LLMs
The QLoRA paper from Dettmers et al.
Hugging Face PEFT LoRA Guide
Practical implementation documentation for PEFT LoRA.
Thinking Machines LoRA Primer
Product-oriented LoRA guidance on rank, capacity, target modules, and tuning.
Thinking Machines: LoRA Without Regret
A deeper discussion of when LoRA can match full fine-tuning.