↑ Docs map · ← 06 · Identity · 07 · Playbooks · 08 · Build →
The content the controller (AAP or AWX) pulls from this repo and runs against the Meridian Fleet inventory. Each playbook is launched by a job template that Event-Driven Ansible triggers — pull (ServiceNow incident) or push (approved change / catalog request).
- Role-aware: playbooks act on the host's `service` inventory variable (`hr-portal`, `crm`, `ged`, `httpd`, `postgresql`, `postfix`) rather than a hard-coded service, so one playbook covers every server. Host vars come from the CMDB (see 03 · Architecture).
- ServiceNow: the controller's `ServiceNow PDI` credential injects `SN_HOST` / `SN_USERNAME` / `SN_PASSWORD`; the `servicenow.itsm` collection reads them.
- Target: `target_host` (an inventory host) and the record id come from the rulebook as extra vars.
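To make the three conventions above concrete, here is a minimal Python sketch of the same lookups. It is an illustration only, not the repo's code: the helper names `servicenow_settings` and `service_for` are hypothetical, and the sample inventory is made up. Only the variable names (`SN_HOST`, `SN_USERNAME`, `SN_PASSWORD`, `service`, `target_host`) come from this page.

```python
import os

# Environment variable names are from this page; the helper names are hypothetical.
SN_ENV_VARS = ("SN_HOST", "SN_USERNAME", "SN_PASSWORD")


def servicenow_settings(env=None):
    """Return the ServiceNow settings the controller credential injects."""
    env = os.environ if env is None else env
    missing = [name for name in SN_ENV_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"missing ServiceNow variables: {', '.join(missing)}")
    return {name: env[name] for name in SN_ENV_VARS}


def service_for(target_host, hostvars):
    """Look up which service a host runs from its inventory variables."""
    try:
        return hostvars[target_host]["service"]
    except KeyError as exc:
        raise LookupError(f"no 'service' variable for host {target_host!r}") from exc


if __name__ == "__main__":
    # Illustrative inventory only; the real host vars come from the CMDB.
    hostvars = {"hr-db-01": {"service": "postgresql"}}
    print(service_for("hr-db-01", hostvars))
```

The point of the lookup is that the playbook never names a service itself; it reads whatever the inventory says about `target_host`.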
By default the pull pattern is triggered by an incident that someone (or a test) opens. The `monitor-health` activation makes it self-driving: the rulebook `extensions/eda/rulebooks/monitor_health_open_incident.yml` runs `ansible.eda.url_check` against each app's `/health` endpoint and, on a failure, launches the Open Incident job template (`open_incident.yml`). That incident lands in the Auto-Remediation group, and the `pull-incident-remediation` activation remediates it with no human in the loop.

Validate with `python3 tests/scenarios/3_monitor_selfheal.py` (it only injects the fault; the monitor opens the ticket).
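The monitor's decision logic can be sketched in plain Python as: probe each `/health` URL, and open an incident only for hosts that fail and do not already have one. This is a rough model of the behaviour described above, not the rulebook itself. The function names and the "HTTP 200 means healthy" rule are assumptions.

```python
import urllib.request


def is_healthy(url, timeout=5.0):
    """Return True when a GET on the URL answers with HTTP 200 (assumed rule)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def incidents_to_open(health_urls, hosts_with_open_incident, check=is_healthy):
    """Return the hosts that fail their check and have no open incident yet."""
    to_open = []
    for host, url in health_urls.items():
        if check(url):
            continue  # healthy: nothing to do
        if host in hosts_with_open_incident:
            continue  # deduplicate: an incident already exists for this host
        to_open.append(host)
    return to_open
```

Passing `check` as a parameter keeps the sketch testable without a network: a test can substitute a stub that reports chosen URLs as down.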
| Playbook | Does | Pattern / trigger | Targets |
|---|---|---|---|
| `open_incident.yml` | open an Auto-Remediation incident for a down server (deduplicated) | monitor: `url_check` | all |
| `restart_service.yml` | restart the host's service, clear the FastAPI "degraded" flag, re-check, resolve/escalate the incident | pull: incident | all |
| `execute_change.yml` | record a change marker, restart the service, verify, annotate the change | push: approved change | all |
| `collect_diagnostics.yml` | service status + disk + memory + recent logs → incident work note (read-only) | pull: incident | all |
| `free_disk.yml` | vacuum journal, drop rotated logs, clear dnf cache; re-check + resolve/escalate | pull: "disk full" incident | all |
| `db_create_role.yml` | create/reconcile a PostgreSQL login role (+ optional `CONNECT` grant) | push: change/catalog | db |
| `db_apply_migration.yml` | apply a tracked `.sql` migration to a database, exactly once | push: change | db |
| `db_status.yml` | version, uptime, connections, database sizes → work note | pull / on-demand | db |
| `db_backup.yml` | `pg_dump -Fc` a database to an archive + prune old ones | scheduled / on-demand | db |
| `patch_os.yml` | `dnf update`; flag (or, with `allow_reboot`, perform) a reboot | change / scheduled | all |
| `housekeeping.yml` | force logrotate, vacuum journal, prune old DB dumps + `/var/tmp` | scheduled | all |
| `restart_service_selfservice.yml` | restart a chosen server's service from a catalog request, then close the request item | pull: catalog | all |
| `provision_employee.yml` | onboard a new joiner: create their Keycloak identity (+ Employees group), then close the request | pull: catalog | n/a |
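Several rows share the same closing step: remediate, re-check, then resolve or escalate the record. A minimal sketch of that control flow follows, assuming the fix and the re-check are supplied as callables. The `Outcome` type and the `remediate` function are hypothetical names, not taken from the playbooks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Outcome:
    state: str  # "resolved" or "escalated"
    note: str   # text that would go into the incident work note


def remediate(
    host: str,
    fix: Callable[[str], None],
    recheck: Callable[[str], bool],
) -> Outcome:
    """Apply a fix, verify it, and decide whether to resolve or escalate."""
    fix(host)
    if recheck(host):
        return Outcome("resolved", f"{host}: healthy after remediation")
    # The fix did not help, so hand the record to a human instead of looping.
    return Outcome("escalated", f"{host}: still failing after remediation")
```

Trying the fix once and escalating on a failed re-check keeps an automated loop from retrying a broken host indefinitely.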
The DB playbooks (the `db_*` ones) are unlocked by the real PostgreSQL on the `*-db` servers (PGDG). They connect as the `postgres` superuser through peer auth (`psql` / `pg_dump` run as the `postgres` OS user via `runuser`, over the local socket), so no password and no `psycopg2` are needed in the execution environment. SQL migrations live in `playbooks/files/migrations/`, tracked per-database in `meridian_schema_migrations`. Smoke-test the lifecycle against `hr-db-01` with `python3 tests/scenarios/7_db_admin_lifecycle.py`.

The `hr` database holds HR business data: `001_leave_requests.sql` provisions a `leave_requests` table + a read-only `hr_app` role that the HR Portal reads live (employee identity lives in Keycloak, not here).
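The sketch below illustrates the two ideas in this section: building a peer-auth `psql` command line via `runuser`, and applying a migration exactly once by recording its name in a tracking table. It is an approximation under stated assumptions. SQLite stands in for PostgreSQL so the example runs anywhere, the single `name` column is a guessed schema for `meridian_schema_migrations`, and both function names are hypothetical.

```python
import sqlite3

TRACKING_TABLE = "meridian_schema_migrations"  # table name from this page


def psql_argv(database, sql_file):
    """Build argv for running a SQL file as the postgres OS user (peer auth)."""
    return ["runuser", "-u", "postgres", "--", "psql", "-d", database, "-f", sql_file]


def apply_once(conn, name, sql):
    """Apply a single-statement migration unless its name is already recorded.

    Returns True when the migration ran, False when it was skipped.
    """
    # Assumed schema: one row per applied migration, keyed by file name.
    conn.execute(f"CREATE TABLE IF NOT EXISTS {TRACKING_TABLE} (name TEXT PRIMARY KEY)")
    seen = conn.execute(
        f"SELECT 1 FROM {TRACKING_TABLE} WHERE name = ?", (name,)
    ).fetchone()
    if seen:
        return False
    conn.execute(sql)  # raises on failure, so a failed migration is never recorded
    conn.execute(f"INSERT INTO {TRACKING_TABLE} (name) VALUES (?)", (name,))
    conn.commit()
    return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the hr database
    ddl = "CREATE TABLE leave_requests (id INTEGER PRIMARY KEY)"
    print(apply_once(conn, "001_leave_requests.sql", ddl))
    print(apply_once(conn, "001_leave_requests.sql", ddl))
```

Recording the name only after the SQL succeeds is what makes a re-run safe: a second launch of the same change finds the row and skips the file.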