Llama-3.2-1B Instruction Fine-Tuning with QLoRA

This repository contains an end-to-end LLM fine-tuning project in which Llama-3.2-1B-Instruct was fine-tuned for instruction following on the Alpaca-Cleaned dataset.

🚀 Live Demo

[Link to your Hugging Face Space]

🎯 Project Overview

The goal was to fine-tune a lightweight LLM to follow complex instructions while maintaining a minimal memory footprint. Using Unsloth and 4-bit QLoRA, I reduced VRAM usage by 75% and accelerated training by 2x compared to standard LoRA implementations.

🛠️ Technical Stack & Methods

Base Model: Llama-3.2-1B-Instruct
Fine-Tuning Technique: QLoRA (Quantized Low-Rank Adaptation)
Quantization: 4-bit NormalFloat (NF4)
Optimization Library: Unsloth
Hardware: NVIDIA Tesla T4 GPU (16GB VRAM)
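
To illustrate what the low-rank adapters in QLoRA do, here is a minimal pure-Python sketch of a LoRA forward pass, y = Wx + (alpha / r) * B(Ax). The toy dimensions and the helper names `matvec` and `lora_forward` are illustrative assumptions, not code from this repository. In QLoRA the frozen W would additionally be stored in 4-bit NF4 and dequantized on the fly, which this sketch leaves out.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, r):
    """Frozen base output plus the scaled low-rank update B(Ax)."""
    base = matvec(W, x)                 # frozen pretrained weights
    update = matvec(B, matvec(A, x))    # low-rank path: d_in -> r -> d_out
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


# Toy sizes for illustration only: d_in = 3, d_out = 2, rank 1.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[0.5, 0.5, 0.5]]   # rank x d_in, trainable
B = [[0.0], [0.0]]      # d_out x rank, trainable, zero-initialised
x = [1.0, 2.0, 3.0]

# alpha and r are passed separately because only their ratio scales the update.
y_start = lora_forward(W, A, B, x, alpha=16, r=16)
print(y_start)
```

Because B starts at zero, the adapted layer initially reproduces the base layer exactly; training then updates only A and B while W stays frozen.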

Hyperparameters

Parameter	Value
LoRA Rank (r)	16
LoRA Alpha	16
Learning Rate	2e-4
Batch Size	2
Gradient Accumulation	4
Optimizer	AdamW 8-bit
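
A few derived quantities follow directly from the table above. The short calculation below is a sketch that assumes a single GPU and uses the 60-step run length reported in the results section; the variable names are illustrative and not taken from the training notebook.

```python
per_device_batch_size = 2
gradient_accumulation_steps = 4
lora_rank = 16
lora_alpha = 16
max_steps = 60   # run length reported in this README
num_gpus = 1     # assumption: a single T4

# Gradients from several small batches are summed before each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
examples_seen = effective_batch_size * max_steps

# LoRA multiplies the low-rank update by alpha / r.
lora_scale = lora_alpha / lora_rank

print(f"effective batch size: {effective_batch_size}")
print(f"examples seen in {max_steps} steps: {examples_seen}")
print(f"LoRA scaling factor (alpha / r): {lora_scale}")
```

With these settings each optimizer step averages over 8 examples, the run sees 480 examples in total, and the adapter update is applied with a scaling factor of 1.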

📊 Results & Evaluation

1. Training Convergence

The model was trained for 60 steps, over which the training cross-entropy loss fell steadily from 2.03 to 1.39. This shows the run converging, though ruling out overfitting would also require a validation loss.
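
The loss values can be read as per-token perplexity by exponentiating them. This sketch assumes the reported loss is the mean per-token cross-entropy in nats (natural log), which is the usual convention in PyTorch-based trainers but is not stated explicitly in this README.

```python
import math

loss_start = 2.03
loss_end = 1.39

# Perplexity is exp(mean cross-entropy) when the loss is measured in nats.
ppl_start = math.exp(loss_start)
ppl_end = math.exp(loss_end)

print(f"perplexity at start: {ppl_start:.2f}")
print(f"perplexity at end:   {ppl_end:.2f}")
```

Under that assumption, training perplexity drops from roughly 7.6 to roughly 4.0.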

2. Quantitative Metrics (ROUGE)

Evaluated on a held-out test set of 15 samples:

ROUGE-1: 0.464
ROUGE-2: 0.262
ROUGE-L: 0.386 (longest-common-subsequence overlap with the reference response)
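
For readers unfamiliar with these metrics, the sketch below shows a simplified way to compute ROUGE-N and ROUGE-L F1 scores. It is not the evaluation code used for the numbers above: it assumes lowercase whitespace tokenization with no stemming, so its scores may differ from those of standard ROUGE packages. All function names are illustrative.

```python
from collections import Counter


def f1(matches, pred_len, ref_len):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if matches == 0:
        return 0.0
    precision = matches / pred_len
    recall = matches / ref_len
    return 2 * precision * recall / (precision + recall)


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(prediction, reference, n):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    pred_ngrams, ref_ngrams = ngrams(pred, n), ngrams(ref, n)
    overlap = sum((pred_ngrams & ref_ngrams).values())
    return f1(overlap, sum(pred_ngrams.values()), sum(ref_ngrams.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    return f1(lcs_length(pred, ref), len(pred), len(ref))


prediction = "the cat sat on the mat"
reference = "the cat is on the mat"
print(rouge_n(prediction, reference, 1))
print(rouge_n(prediction, reference, 2))
print(rouge_l(prediction, reference))
```

In the toy example, five of six unigrams and three of five bigrams overlap, and the longest common subsequence has five tokens. Corpus-level scores like those reported above are typically the mean of per-sample scores.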

3. Qualitative Comparison

Prompt	Fine-Tuned Response
Explain Recursion to a 5-year-old	"Recursion is a super cool way that computers solve big problems by breaking them into smaller, identical pieces..."
3 Healthy Breakfast Ideas	"1. Oatmeal with berries... 2. Greek yogurt parfait... 3. Avocado toast..."

📦 Deployment & Portability

The model was exported into GGUF (Q4_K_M) format. This allows the model to run on consumer CPUs (Mac, Windows, Linux) via Ollama or LM Studio.
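
As a rough sketch of how the exported model could be queried once it is served by Ollama, the snippet below calls Ollama's local REST endpoint using only the standard library. It assumes the GGUF file has already been registered with Ollama (for example through a Modelfile whose FROM line points at the .gguf file) and that the server is running on its default port. The model name `llama32-1b-alpaca` is hypothetical, and this is not the repository's own inference script.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_NAME = "llama32-1b-alpaca"  # hypothetical name chosen when registering the model


def build_payload(prompt, model=MODEL_NAME):
    """Build a non-streaming request body for Ollama's generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt, model=MODEL_NAME, url=OLLAMA_URL, timeout=120):
    """Send the prompt to a running Ollama server and return the generated text."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]


# Example call (requires a running Ollama server with the model registered):
# print(generate("Give me three healthy breakfast ideas."))
```

Setting `stream` to False makes the server return a single JSON object, which keeps the client code short; a streaming client would instead read one JSON object per line.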

File size reduction:

Original FP16: ~2.5 GB
Quantized GGUF: ~700 MB
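
These sizes can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes about 1.24 billion parameters and an average of about 4.85 bits per weight for Q4_K_M; both figures are approximations introduced here, not values from this repository. Real GGUF files also carry metadata and keep some tensors at higher precision, so the actual size will differ somewhat.

```python
PARAMS = 1.24e9        # assumption: approximate Llama-3.2-1B parameter count
BITS_FP16 = 16
BITS_Q4_K_M = 4.85     # assumption: rough average bits per weight for Q4_K_M


def size_gb(num_params, bits_per_weight):
    """Approximate weight size in gigabytes (1 GB = 1e9 bytes), ignoring metadata."""
    return num_params * bits_per_weight / 8 / 1e9


fp16_gb = size_gb(PARAMS, BITS_FP16)
q4_gb = size_gb(PARAMS, BITS_Q4_K_M)
reduction = 1 - q4_gb / fp16_gb

print(f"FP16:   ~{fp16_gb:.2f} GB")
print(f"Q4_K_M: ~{q4_gb:.2f} GB")
print(f"size reduction: ~{reduction:.0%}")
```

Under these assumptions the estimate lands close to the reported ~2.5 GB and ~700 MB, a reduction of roughly 70%.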

🔧 How to Use

1. Clone the repo.
2. Install dependencies: pip install -r requirements.txt
3. Run inference: python scripts/inference.py
