A complete n8n workflow for automatically grading PDF exams, comparing student answers with an official answer key using OpenAI.
Supports two grading modes: exact (strict, binary) and lenient (semantic, flexible).
This workflow:
- Receives two PDFs via an n8n web form:
  - The exam (student answers)
  - The answer key
- Extracts text from both PDFs
- Parses and structures questions by number
- Sends everything to OpenAI with strict evaluation rules
- Returns:
  - Score per question
  - Brief and objective feedback
  - Final score (0–10)
- Displays the results using a Form Ending node in HTML
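As an illustration of the last step, the sketch below turns the grading JSON into HTML for a results page. It is not taken from `workflow.json`: `escapeHtml` and `renderResultsHtml` are hypothetical helper names, and the table layout is an assumption. Only the field names (`grade_per_question`, `comments`, `final_grade`) come from the output format described in this README.

```javascript
// Hypothetical helpers; the actual workflow may build its HTML differently.

// Escape characters that would otherwise be interpreted as HTML markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one table row per question, followed by the final grade.
function renderResultsHtml(result) {
  const rows = Object.keys(result.grade_per_question)
    .map((n) => {
      const score = result.grade_per_question[n];
      const comment = (result.comments || {})[n] ?? "";
      return `<tr><td>${escapeHtml(n)}</td><td>${escapeHtml(score)}</td><td>${escapeHtml(comment)}</td></tr>`;
    })
    .join("\n");

  return [
    "<table>",
    "<tr><th>Question</th><th>Score</th><th>Feedback</th></tr>",
    rows,
    "</table>",
    `<p>Final grade: ${escapeHtml(result.final_grade)} / 10</p>`,
  ].join("\n");
}
```

Escaping the feedback matters because it is model-generated text and may contain characters such as `<` or `&`.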
- PDF text extraction (Extract From File)
- Robust regex parsing of question numbers
- Two grading modes:
  - exact → binary grading (1.0 or 0.0)
  - lenient → allows intermediate scores
- Clean JSON output
- Works for any text-based exam
- Automatic normalization of whitespace and formatting
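The whitespace normalization and question-number parsing listed above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in `workflow.json`: `normalizeText` and `parseQuestions` are hypothetical names, and the pattern covers only the numbering styles this README mentions ("1.", "Question 1:", "Resposta 1:").

```javascript
// Hypothetical helpers; the regex used by the real workflow may differ.

// Unify line endings and collapse redundant whitespace.
function normalizeText(text) {
  return text
    .replace(/\r\n?/g, "\n")   // unify line endings
    .replace(/[ \t]+/g, " ")   // collapse runs of spaces and tabs
    .replace(/ ?\n ?/g, "\n")  // trim spaces around line breaks
    .replace(/\n{2,}/g, "\n")  // drop blank lines
    .trim();
}

// Split extracted PDF text into { "<question number>": "<answer text>" }.
function parseQuestions(rawText) {
  const text = normalizeText(rawText);
  // A marker is a line that starts with an optional label, a number,
  // and then "." or ":".
  const marker = /^(?:(?:Question|Resposta)\s+)?(\d+)\s*[.:]\s*/gim;
  const matches = [...text.matchAll(marker)];
  const questions = {};
  matches.forEach((m, i) => {
    const start = m.index + m[0].length;
    const end = i + 1 < matches.length ? matches[i + 1].index : text.length;
    questions[m[1]] = text.slice(start, end).trim();
  });
  return questions;
}
```

One caveat of this approach: a line inside an answer that itself starts with a number and a period (for example a numbered list) would be treated as a new question.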
- n8n (Cloud or Self-hosted)
- OpenAI Chat node enabled
- Your OpenAI API key
Configure in Settings → Variables or your environment:
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | Your OpenAI API key |
- Import `workflow.json` from this repository into n8n.
- Open the Form Trigger public URL
- Upload:
  - `test_pdf` = exam (student answers)
  - `answer_key_pdf` = answer key
  - `grading_mode` = `exact` or `lenient`
- Execute
- View results on the final page
You are an expert evaluator of academic responses in Systems Engineering.
Your task is to evaluate each student answer by comparing it to the expected answer from the answer key.
Follow ALL rules below exactly.
=====================================================
GRADING MODES
=====================================================
1. EXACT (strict, binary)
- The answer must match the expected conceptual meaning very closely.
- Penalize any omission, simplification, or conceptual deviation.
- Small wording changes are acceptable; conceptual differences are not.
- IMPORTANT: In EXACT mode, the score must be ONLY:
- 1.0 (correct)
- 0.0 (incorrect)
No intermediate values are allowed in EXACT mode.
2. LENIENT (semantic / flexible)
- Evaluate based on the overall semantic meaning.
- Accept simplified but correct answers.
- Penalize only major omissions or conceptual errors.
- Scores may use the full scoring scale.
=====================================================
SCORING SCALE (required)
=====================================================
Allowed scores:
1.00 = fully correct
0.75 = mostly correct (small omissions)
0.50 = partially correct
0.25 = minimally relevant
0.00 = incorrect or missing
NOTE:
- In EXACT mode → ONLY 1.00 or 0.00 allowed.
- In LENIENT mode → all the above are allowed.
=====================================================
EVALUATION RULES
=====================================================
For EACH question:
- Compare semantic meaning between student answer and expected answer.
- Identify correct points, omissions, and errors.
- Choose the allowed score based on the grading mode.
- Provide short, objective, specific feedback, translated into Portuguese.
=====================================================
OUTPUT FORMAT (STRICT JSON ONLY)
=====================================================
Return ONLY this JSON:
{
  "grade_per_question": {
    "1": number,
    "2": number,
    ...
  },
  "comments": {
    "1": "feedback",
    "2": "feedback",
    ...
  },
  "final_grade": number
}
Where:
final_grade = (sum(scores) / total_questions) * 10
=====================================================
INPUT DATA
=====================================================
STUDENT_ANSWERS:
{{JSON.stringify($json.questions)}}
ANSWER_KEY:
{{JSON.stringify($json.answer_key)}}
GRADING_MODE:
{{ $json.grading_mode }}
Questions: {{JSON.stringify($json.questions)}}
Answer key: {{JSON.stringify($json.answer_key)}}
Grading mode: {{ $json.grading_mode }}
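Because the prompt asks for strict JSON and mode-specific scores, the model's reply can be checked before it is displayed. The sketch below is one way to do that, for example in an n8n Code node placed after the OpenAI node. It is an assumption, not part of the exported workflow, and `validateGrading` and `ALLOWED_SCORES` are hypothetical names. It also recomputes `final_grade` with the formula from the prompt instead of trusting the model's arithmetic.

```javascript
// Hypothetical post-processing step; not taken from workflow.json.

// Scores permitted in each grading mode, per the scoring scale in the prompt.
const ALLOWED_SCORES = {
  exact: [0, 1],
  lenient: [0, 0.25, 0.5, 0.75, 1],
};

function validateGrading(responseText, gradingMode) {
  const allowed = ALLOWED_SCORES[gradingMode];
  if (!allowed) throw new Error(`Unknown grading mode: ${gradingMode}`);

  // JSON.parse throws if the model returned anything other than JSON.
  const result = JSON.parse(responseText);
  const scores = Object.entries(result.grade_per_question ?? {});
  if (scores.length === 0) throw new Error("No graded questions in response");

  for (const [question, score] of scores) {
    if (!allowed.includes(score)) {
      throw new Error(
        `Question ${question}: score ${score} is not allowed in ${gradingMode} mode`
      );
    }
  }

  // final_grade = (sum(scores) / total_questions) * 10
  const sum = scores.reduce((acc, [, score]) => acc + score, 0);
  const finalGrade = (sum / scores.length) * 10;
  return { ...result, final_grade: Number(finalGrade.toFixed(2)) };
}
```

For example, scores of 1, 0.5 and 0.75 over three questions give (2.25 / 3) × 10 = 7.5.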
- PDF size limit: constrained by n8n memory and node limits.
- OCR: scanned PDFs without embedded text may extract poorly.
- Formatting assumptions: regex relies on consistent numbering (e.g., “1.”, “Question 1:”, “Resposta 1:”).
- Token limits: very long exams may exceed OpenAI context size.
- Missing question detection: assumes both PDFs contain matching question numbers.
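The last limitation can be softened with a pre-check that runs before anything is sent to OpenAI. The sketch below is hypothetical (the function name `findQuestionMismatches` is not from the workflow). It assumes both parsed inputs are objects keyed by question number, as in the `questions` and `answer_key` fields used by the prompt.

```javascript
// Hypothetical pre-check; not part of the exported workflow.

// Report question numbers that appear in only one of the two PDFs.
function findQuestionMismatches(questions, answerKey) {
  const studentNumbers = new Set(Object.keys(questions));
  const keyNumbers = new Set(Object.keys(answerKey));
  return {
    // In the answer key, but the student exam has no such question.
    missingFromExam: [...keyNumbers].filter((n) => !studentNumbers.has(n)),
    // In the student exam, but the answer key has no such question.
    missingFromKey: [...studentNumbers].filter((n) => !keyNumbers.has(n)),
  };
}
```

If either list is non-empty, the workflow could stop with a clear error message, or grade the missing answers as 0.00 in line with the "incorrect or missing" rule of the scoring scale.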
/
├─ workflow.json   # exported n8n workflow
├─ README.md       # documentation
└─ examples/       # example PDFs