Generative AI is heavily tested on the AWS Certified AI Practitioner exam. This chapter follows Domain 2 (Fundamentals of Generative AI) in the official exam guide.
Models process text as tokens (subword pieces). Pricing, latency, and context limits are usually expressed in tokens, not raw character counts.
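To make token-based pricing concrete, here is a minimal sketch. It assumes a rough average of four characters per token for English text, which is only a rule of thumb and not a real tokenizer, and the per-1,000-token prices are hypothetical placeholders rather than actual AWS rates. The function names are illustrative.

```python
import math

# Assumption: ~4 characters per token is a rough English average.
# A real tokenizer will give different counts for the same text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate derived from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Cost of one request when input and output tokens are priced separately."""
    return (input_tokens / 1000) * price_in_per_1k \
        + (output_tokens / 1000) * price_out_per_1k


prompt = "Summarize the quarterly report in three bullet points."
input_tokens = estimate_tokens(prompt)

# Hypothetical prices in USD per 1,000 tokens; check the provider's pricing page.
cost = estimate_cost(input_tokens, output_tokens=200,
                     price_in_per_1k=0.001, price_out_per_1k=0.002)
print(f"~{input_tokens} input tokens, estimated cost ${cost:.6f}")
```

The point for the exam is the unit: both the bill and the context limit scale with tokens, so a longer prompt or a longer answer costs more even if the request count stays the same.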
For long documents, chunking splits text into segments that fit context windows and feed retrieval or summarization pipelines.
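One simple way to chunk is by word count with a small overlap between neighbors, so a sentence cut at a boundary still appears whole in one chunk. The sketch below assumes words as a stand-in for tokens; `chunk_words`, `chunk_size`, and `overlap` are illustrative names, and production pipelines often split on tokens, sentences, or document structure instead.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks that share `overlap` words with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


document = " ".join(f"word{i}" for i in range(10))
for chunk in chunk_words(document, chunk_size=4, overlap=1):
    print(chunk)
```

Larger chunks keep more context together but use more of the context window per retrieved passage; the overlap trades a little duplication for fewer ideas split across a boundary.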
An embedding is a dense numeric vector representing text, images, or other inputs. Similar meanings tend to map to nearby vectors, which powers semantic search and RAG.
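"Nearby" is usually measured with cosine similarity. The sketch below uses tiny hand-made three-dimensional vectors purely for illustration; real embeddings come from an embedding model and typically have hundreds or thousands of dimensions. The phrases and numbers are invented for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for zero vectors in this sketch
    return dot / (norm_a * norm_b)


def most_similar(query_vector: list[float], embeddings: dict[str, list[float]]) -> str:
    """Return the key whose vector has the highest cosine similarity to the query."""
    return max(embeddings, key=lambda key: cosine_similarity(query_vector, embeddings[key]))


# Toy vectors (hypothetical values, not produced by any model).
toy_embeddings = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.0],
    "gpu pricing": [0.0, 0.1, 0.9],
}

# Pretend this is the embedding of "how do I get my money back?"
query_vector = [0.8, 0.2, 0.1]
best_match = most_similar(query_vector, toy_embeddings)
print(best_match)
```

Semantic search and RAG follow the same pattern at scale: embed the query, find the closest stored vectors, and pass the matching text to the model.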
Prompt engineering is the practice of crafting instructions, examples, and constraints so that a model produces reliable outputs. See Domain 3 for techniques (few-shot, chain-of-thought, etc.).
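As a small illustration of combining instructions, examples, and constraints, the sketch below assembles a few-shot prompt as plain text. The `Input:`/`Output:` layout and the function name `build_few_shot_prompt` are assumptions chosen for the example, not a format required by any particular model.

```python
def build_few_shot_prompt(instruction, examples, query, constraints=()):
    """Assemble an instruction, optional constraints, worked examples, and the new query."""
    sections = [instruction.strip()]
    if constraints:
        rules = "\n".join(f"- {rule}" for rule in constraints)
        sections.append("Constraints:\n" + rules)
    for example_input, example_output in examples:
        sections.append(f"Input: {example_input}\nOutput: {example_output}")
    # Leave the final output empty so the model completes it.
    sections.append(f"Input: {query}\nOutput:")
    return "\n\n".join(sections)


prompt = build_few_shot_prompt(
    instruction="Classify the sentiment of each customer message.",
    constraints=["Answer with exactly one word: positive, negative, or neutral."],
    examples=[
        ("The agent solved my issue in minutes.", "positive"),
        ("I was charged twice and nobody replied.", "negative"),
    ],
    query="The package arrived on the expected date.",
)
print(prompt)
```

The examples show the model the expected format, and the constraint narrows the answer space, which tends to make outputs easier to parse and evaluate.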
Transformer architectures enabled large-scale pre-training on broad data. A foundation model is a large model pre-trained for general capabilities, then adapted (prompting, fine-tuning, tools) to specific tasks.
- Multimodal models accept or produce more than one modality (for example, text and image).
- Diffusion models are a common family for image/audio generation (iterative denoising).
A typical foundation model lifecycle: data selection → model selection → pre-training → adaptation (fine-tuning / prompting) → evaluation → deployment → feedback loops.
Advantages of generative AI:
- Adaptability across tasks with prompting and light integration.
- Rapid prototyping of assistants, summarization, drafting, and code help.
- Developer productivity when used with review and tests.
Disadvantages and limitations:
- Hallucinations: plausible but false outputs.
- Nondeterminism: temperature and sampling change outputs run-to-run.
- Interpretability: hard to audit internal reasoning.
- Compliance: data handling, IP, and licensing constraints.
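The nondeterminism item above can be illustrated with temperature-scaled sampling. This is a minimal sketch of the general idea, assuming made-up logits for three candidate tokens; it is not how any specific model or service implements sampling.

```python
import math
import random


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def sample_token(tokens: list[str], logits: list[float],
                 temperature: float, rng: random.Random) -> str:
    """Draw one token at random according to the temperature-scaled probabilities."""
    probabilities = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probabilities, k=1)[0]


tokens = ["Paris", "London", "Rome"]
logits = [2.0, 1.0, 0.5]  # hypothetical scores for the next token
rng = random.Random(0)    # seeded only so the demo is repeatable

for temperature in (0.2, 1.0, 2.0):
    probabilities = softmax_with_temperature(logits, temperature)
    samples = [sample_token(tokens, logits, temperature, rng) for _ in range(5)]
    print(temperature, [round(p, 3) for p in probabilities], samples)
```

At low temperature almost all probability sits on the top token, so repeated runs tend to agree; at higher temperature the distribution flattens and outputs vary more from run to run.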
Factors to weigh when selecting a model include latency, cost, context length, modality (text vs image), multilingual needs, fine-tuning or private customization requirements, regional availability, and organizational policies.
Examples of business metrics for a GenAI application: conversion rate, average handle time, customer satisfaction, revenue per user, and defect rate in generated content (human review burden).
| Need | Typical AWS building blocks |
|---|---|
| Managed FM access | Amazon Bedrock |
| Experimentation / UI prototyping | PartyRock, an Amazon Bedrock Playground |
| Pre-built models & notebooks | Amazon SageMaker JumpStart |
| Enterprise Q&A on company data | Amazon Q Business |
| Developer assistance in the IDE | Amazon Q Developer |
| AWS-native multimodal foundation models | Amazon Nova (family name in exam scope) |
Building on AWS gives you security and compliance options, IAM integration, encryption, VPC patterns, a broad regional footprint, and managed operations (less undifferentiated heavy lifting for your team). None of this is free: cost tradeoffs (token-based pricing, provisioned throughput, redundancy) still matter.
- On-demand token usage vs provisioned throughput for steady traffic.
- Higher availability / multi-AZ patterns vs cost.
- Larger / more capable models vs latency and price per request.
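The first tradeoff in the list above can be framed as a break-even calculation. The sketch below assumes a flat hourly rate for provisioned capacity and a single blended price per 1,000 tokens for on-demand usage; both prices are hypothetical, and real pricing has more dimensions (separate input and output rates, commitment terms, model-specific units).

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def on_demand_monthly_cost(tokens_per_month: float, price_per_1k_tokens: float) -> float:
    """Pay-per-token cost for a month of traffic."""
    return tokens_per_month / 1000 * price_per_1k_tokens


def provisioned_monthly_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Fixed cost of reserved capacity, independent of traffic."""
    return hourly_rate * hours


def break_even_tokens(hourly_rate: float, price_per_1k_tokens: float,
                      hours: int = HOURS_PER_MONTH) -> float:
    """Monthly token volume at which both pricing modes cost the same."""
    return provisioned_monthly_cost(hourly_rate, hours) / price_per_1k_tokens * 1000


# Hypothetical prices for illustration only.
hourly_rate = 10.0
price_per_1k = 0.002

threshold = break_even_tokens(hourly_rate, price_per_1k)
print(f"Break-even at about {threshold:,.0f} tokens per month")

for monthly_tokens in (1e9, 5e9):
    on_demand = on_demand_monthly_cost(monthly_tokens, price_per_1k)
    provisioned = provisioned_monthly_cost(hourly_rate)
    cheaper = "on-demand" if on_demand < provisioned else "provisioned"
    print(f"{monthly_tokens:,.0f} tokens: {cheaper} is cheaper")
```

Below the threshold, spiky or low traffic favors on-demand; steady traffic above it is where provisioned throughput starts to pay off, assuming the capacity is actually used.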
- Define embedding in your own words and give one use case.
- Name two disadvantages of GenAI that matter in regulated industries.
- Match each of Bedrock, Q Business, and JumpStart to one role: managed FM consumption, enterprise knowledge assistant, or ML hub with pre-trained models.