An ML-powered web application that automates resume screening and candidate ranking using TF-IDF and K-Nearest Neighbors.
- Overview
- Objectives
- Core Features
- System Architecture
- Ranking Algorithm
- Tech Stack
- Setup & Installation
- Usage
- Screenshots
Finding the right candidate from hundreds of applications is one of the most time-consuming challenges facing HR teams today. Manual resume review is slow, inconsistent, and expensive.
The Smart Recruitment and Resume Ranking System is a Django-based web application that automates the end-to-end screening process. Recruiters post job vacancies with requirements; applicants apply and upload their CVs. The system then:
- Parses CV text and optional GitHub/LinkedIn profile data
- Processes that data through an NLP pipeline (tokenisation, stemming, stop-word removal)
- Scores each candidate using TF-IDF weighted vectors + KNN cosine similarity
- Ranks applicants and surfaces the top matches to the recruiter
The result: faster, fairer, data-driven hiring decisions.
| Goal | Description |
|---|---|
| Find the Best Candidates | Identify the most qualified applicants for each vacancy automatically |
| Realistic Ranking | Score candidates on real skills and experience, not just keyword density |
| Flexible Data Sources | Ingest CVs, GitHub profiles, and LinkedIn profiles for a complete picture |
| Save Time & Cost | Eliminate manual screening and reduce recruiter workload significantly |

For recruiters:

- Post Job Vacancies: Create detailed listings with required skills, experience levels, and qualifications
- Unified Dashboard: Review all applications and submissions in one place
- Automatic Ranking: Candidates are scored and sorted by CV match and skill alignment
- Shortlist & Invite: Shortlist top-ranked candidates for interview rounds

For candidates:

- Browse & Filter Jobs: Search by location, salary, and employment type
- Apply with CV & Cover Letter: Submit applications directly through the platform
- Link External Profiles: Connect GitHub and LinkedIn profiles for a richer application
- Take Assessments: Participate in job-specific assessments that factor into the final score

Ranking engine:

- Multi-source data extraction (CV, GitHub, LinkedIn)
- Automated skill identification and job-requirement matching
- TF-IDF + KNN intelligent candidate scoring
- Filtering by CGPA, degree, and years of experience

       Candidates
            │
            ├─ Search Jobs
            ├─ Apply & Upload CV
            ├─ Take Assessments
            │
            ▼
    ┌───────────────┐
    │ CVs Database  │
    └───────┬───────┘
            │
            ▼
    ┌───────────────────┐
    │ CV Ranking Engine │
    │ (Parse & Analyse) │
    └───────┬───────────┘
            │
            ▼
    Shortlisted CVs ──► Recruiter Dashboard

The ranking pipeline processes candidate data from three sources through a series of NLP and ML stages:

    Input Sources       NLP Pipeline               ML Scoring
    ┌──────────────┐    ┌─────────────────────┐    ┌──────────────────┐
    │ CV Upload    │    │ 1. Parse & Extract  │    │ TF-IDF Vectors   │
    │ GitHub URL   │───►│ 2. Clean Text       │───►│        +         │
    │ LinkedIn URL │    │ 3. Tokenise (NLTK)  │    │ KNN Cosine Sim   │
    └──────────────┘    │ 4. Remove Stopwords │    └────────┬─────────┘
                        │ 5. Stem & Lemmatise │             │
                        └─────────────────────┘             ▼
                                                  Ranked Candidate List
                                                 (Top K recommendations)

Before scoring, candidates are pre-filtered against hard requirements:
- Minimum CGPA / degree qualification
- Minimum required years of experience
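
The pre-filter above can be sketched as a simple threshold check. This is a minimal illustration, not the project's actual code: the `Candidate` fields, the function names, and the default thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate record; field names are illustrative only."""
    name: str
    cgpa: float
    degree: str
    years_experience: float


def passes_hard_requirements(candidate, min_cgpa, accepted_degrees, min_years):
    """Return True only if the candidate meets every hard requirement."""
    return (
        candidate.cgpa >= min_cgpa
        and candidate.degree.lower() in accepted_degrees
        and candidate.years_experience >= min_years
    )


def prefilter(candidates, min_cgpa=3.0, accepted_degrees=("bsc", "msc"), min_years=2):
    """Keep only candidates that clear the CGPA, degree, and experience thresholds."""
    degrees = {d.lower() for d in accepted_degrees}
    return [c for c in candidates if passes_hard_requirements(c, min_cgpa, degrees, min_years)]


applicants = [
    Candidate("Asha", 3.6, "MSc", 4),
    Candidate("Ben", 2.7, "BSc", 5),   # fails the CGPA threshold
    Candidate("Chen", 3.4, "BSc", 1),  # fails the experience threshold
]
eligible = prefilter(applicants)
print([c.name for c in eligible])
```

Filtering before scoring keeps the more expensive vectorisation step limited to candidates who could actually be shortlisted.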
Raw text from CVs and profiles is cleaned and normalised:
- Remove special characters, punctuation, and numbers
- Apply word stemming (Porter Stemmer)
- Apply verb lemmatisation for consistent word forms
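
A minimal sketch of the cleaning steps above, using only the standard library. The stop-word list is a tiny illustrative sample, and `preprocess` with its `stem` parameter is a hypothetical helper rather than the project's own function; stemming is left as a pluggable callable so the example runs without NLTK installed.

```python
import re

# Tiny illustrative stop-word list; a real pipeline would use a full list,
# for example NLTK's English stop words.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "with", "for", "to"}


def preprocess(text, stem=lambda word: word):
    """Lowercase, strip non-letters, drop stop words, then stem each token."""
    # Replace anything that is not a lowercase letter or whitespace with a space.
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    tokens = cleaned.split()
    return [stem(tok) for tok in tokens if tok not in STOP_WORDS]


print(preprocess("5+ years of experience with Python, Django & REST APIs!"))
```

To apply Porter stemming, one could pass `nltk.stem.PorterStemmer().stem` as the `stem` argument. Note that stripping every non-letter also discards tokens such as version numbers, which may or may not be desirable for skill matching.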
Each keyword in a resume is weighted using TF-IDF:

    weight(keyword) = TF(keyword) × IDF(keyword)

    Where:
      TF(keyword)  = frequency of the keyword in the resume
      IDF(keyword) = 1 for required skills
                   = 0 for unwanted / irrelevant skills

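Because IDF here acts as a binary relevance flag rather than the standard log-scaled inverse document frequency, only required skills contribute to the score, so the model boosts candidates who demonstrate the exact skills the role demands.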
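
The weighting rule above can be sketched in a few lines. This is an illustration of the formula as stated, assuming raw counts for TF; `keyword_weights` is a hypothetical name, not a function from the project.

```python
from collections import Counter


def keyword_weights(resume_tokens, required_skills):
    """Weight each token as TF x IDF, with IDF = 1 for required skills, else 0."""
    tf = Counter(resume_tokens)  # raw term frequency in the resume
    required = set(required_skills)
    return {term: count * (1 if term in required else 0) for term, count in tf.items()}


tokens = ["python", "django", "python", "excel", "sql"]
weights = keyword_weights(tokens, required_skills=["python", "sql", "docker"])
print(weights)
```

Here `python` appears twice and is required, so it gets weight 2; `django` and `excel` are not required and drop to 0.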
The TF-IDF weighted vector of each CV is compared with the job description vector using cosine similarity, with the K-Nearest Neighbors algorithm retrieving the closest CVs. The smaller the angle between two vectors, the higher the match.
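
As a rough illustration of this matching step, the sketch below computes cosine similarity over sparse dict vectors and returns the nearest CVs by brute force. It is a pure-Python stand-in under stated assumptions: the helper names and sample vectors are hypothetical, and a scikit-learn based project would more likely rely on something like `NearestNeighbors(metric="cosine")`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


def nearest_cvs(job_vector, cv_vectors, k=3):
    """Return the k CV ids most similar to the job vector, best first."""
    scored = [(cosine_similarity(job_vector, vec), cv_id) for cv_id, vec in cv_vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(cv_id, round(score, 3)) for score, cv_id in scored[:k]]


job = {"python": 1, "sql": 1}
cvs = {
    "cv_a": {"python": 2, "sql": 2},    # same direction as the job vector
    "cv_b": {"python": 1, "excel": 1},  # shares one of two skills
    "cv_c": {"excel": 3},               # no overlap with the job
}
print(nearest_cvs(job, cvs, k=2))
```

Cosine similarity ignores vector length, so a long CV that repeats the required skills proportionally scores the same as a short one with the same skill mix.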
Final Score = CV Match Score + Assessment Score
All candidates are ranked by their final score and the top K candidates (default: 20) are returned to the recruiter as recommendations.
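
The final scoring and top-K selection might look like the sketch below. It assumes both scores are on the same scale and that a candidate without an assessment contributes 0 for that part; those assumptions, the function name, and the sample values are illustrative and not taken from the project.

```python
import heapq


def rank_candidates(cv_scores, assessment_scores, k=20):
    """Add CV match and assessment scores, then return the k best candidates."""
    final = {
        name: cv_score + assessment_scores.get(name, 0)  # no assessment counts as 0
        for name, cv_score in cv_scores.items()
    }
    # nlargest avoids sorting the whole pool when k is much smaller than it.
    return heapq.nlargest(k, final.items(), key=lambda item: item[1])


cv_scores = {"Asha": 82, "Ben": 64, "Chen": 91}
assessment_scores = {"Asha": 70, "Chen": 40}
top = rank_candidates(cv_scores, assessment_scores, k=2)
print(top)
```

Because the two components are simply added, a strong assessment can lift a candidate above someone with a higher CV match, as happens with Asha and Chen in this example.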
| Layer | Technology |
|---|---|
| Web Framework | Django 3.2 |
| Language | Python 3.9 |
| Database | SQLite 3 |
| ML / Similarity | scikit-learn 1.3 (TF-IDF, KNN) |
| NLP | NLTK 3.8 (tokenisation, stemming, lemmatisation) |
| PDF Parsing | PyPDF2 3.0 |
| Other NLP | inflect, stop-words |
Note: For Windows-specific setup with Python 3.9, see INSTRUCTIONS.md.

- Python 3.9 (required: Django 3.2 and the pinned ML libraries are not compatible with Python 3.11/3.12)
- pip
- Git
1. Clone the repository

       git clone <repository-url>
       cd Smart-Recruitment-System

2. Create and activate a virtual environment

       # Windows
       py -3.9 -m venv venv
       .\venv\Scripts\activate

       # macOS / Linux
       python3.9 -m venv venv
       source venv/bin/activate

3. Install dependencies

       pip install -r requirements.txt

4. Download NLTK data (one-time)

       python -c "import nltk; [nltk.download(p) for p in ['punkt','stopwords','wordnet','omw-1.4']]"

5. Run database migrations

       python manage.py migrate

6. Create a superuser (recruiter / admin)

       python manage.py createsuperuser

7. Start the development server

       python manage.py runserver

The application will be available at http://127.0.0.1:8000/ and the Django Admin panel at http://127.0.0.1:8000/admin/.

For recruiters:

- Log in with your superuser account (or any account with `is_staff=True`)
- Post a new job vacancy with required skills and qualifications
- Wait for candidates to apply
- Open the ranking dashboard to view automatically scored and sorted applicants
- Shortlist top candidates for interview

For candidates:

- Register via the sign-up page
- Browse job listings and filter by your preferences
- Apply to a vacancy: upload your text-based PDF CV and cover letter
- Optionally link your GitHub and LinkedIn profile URLs
- Complete any assessments associated with the role

Warning (CV format): CVs must be text-based PDFs. Scanned or image-only PDFs cannot be parsed and yield no extracted text.
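
One way to catch image-only uploads early is to check how much text the parser returns. The sketch below is an assumption about how such a guard could be written: `extract_pdf_text`, `looks_text_based`, and the 50-character threshold are hypothetical, while `PdfReader`, `pages`, and `extract_text()` are the PyPDF2 3.x calls.

```python
def extract_pdf_text(path):
    """Concatenate the text of every page using PyPDF2's PdfReader."""
    from PyPDF2 import PdfReader  # imported lazily; requires PyPDF2 3.x

    reader = PdfReader(path)
    # extract_text() can return an empty string for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def looks_text_based(text, min_chars=50):
    """Heuristic: treat a CV as parseable if it yields enough non-space characters."""
    return len("".join(text.split())) >= min_chars


print(looks_text_based(""))                        # nothing extracted, as with a scanned PDF
print(looks_text_based("Python developer " * 10))  # plenty of extractable text
```

A check like `looks_text_based(extract_pdf_text(path))` at upload time would let the application reject a scanned CV with a clear message instead of silently ranking it on empty text.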
| Role | How to set | Permissions |
|---|---|---|
| Recruiter | `is_staff = True` (Django Admin or superuser) | Post jobs, view rankings, shortlist candidates |
| Candidate | Default (normal sign-up) | Browse jobs, apply, take assessments |

To promote an existing user to recruiter: Django Admin → Users → select user → tick `is_staff`

Screenshots (images not available here): Homepage, Job Listings, Job Details, Apply for a Job, Sign Up, Log In.






