feat(validation): add workflow image vetting#739
Conversation
4e42ba2 to
1086dd2
Compare
| steps = reana_yaml["workflow"].get("specification", {}).get("steps", []) | ||
|
|
||
| for step in steps: | ||
| step_image = step.get("environment", None) |
There was a problem hiding this comment.
I was a bit unsure about this - is this the correct place to get the user workflow images from?
1086dd2 to
476b200
Compare
476b200 to
463cf5a
Compare
463cf5a to
d415746
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #739 +/- ##
==========================================
+ Coverage 60.06% 60.19% +0.12%
==========================================
Files 33 33
Lines 4074 4092 +18
==========================================
+ Hits 2447 2463 +16
- Misses 1627 1629 +2
🚀 New features to boost your workflow:
|
Extend `validate_workflow` to check that all images used in workflows are authorized. Such authorized images are specified via the `vetted_container_images` Helm value. Also add details of container vetting to the `info` API endpoint.
d415746 to
52fce39
Compare
Extend `validate_workflow` to check that all images used in workflows are authorized. Such authorized images are specified via the `vetted_container_images` Helm value. Also add details of container vetting to the `info` API endpoint.
52fce39 to
7b63032
Compare
Extend `validate_workflow` to check that all images used in workflows are authorized. Such authorized images are specified via the `vetted_container_images` Helm value. Also add details of container vetting to the `info` API endpoint.
7b63032 to
c5e1c16
Compare
| ) | ||
|
|
||
|
|
||
| def validate_images(reana_yaml: Dict) -> None: |
There was a problem hiding this comment.
There was a problem hiding this comment.
Yes good idea, moved image extraction logic to reana-commons
| for step in steps: | ||
| image = step.get("environment", None) | ||
| if image and image not in allowed_images: | ||
| raise REANAValidationError(f"Image not allowed: {image}") |
There was a problem hiding this comment.
The server ignores falsey environments here. Snakemake jobs without an explicit image use REANA_DEFAULT_SNAKEMAKE_ENV_IMAGE client fallback. Maybe let's also vet the resolved default image?
There was a problem hiding this comment.
We don't need to vet the default image because both the list of vetted images and the default snakemake env image are set by cluster admins (i.e. REANA_DEFAULT_SNAKEMAKE_ENV_IMAGE is trustworthy so doesn't need to be vetted).
| REANA_VETTED_CONTAINER_IMAGES = json.loads( | ||
| os.getenv("REANA_VETTED_CONTAINER_IMAGES", "{}") | ||
| ) | ||
| """Container images that users are allowed to use in their workflows.""" |
There was a problem hiding this comment.
Oh ye nice catch, updated the default now.
| }, | ||
| "vetted_container_images_enabled": { | ||
| "title": "Whether container image vetting is enabled", | ||
| "value": "False" |
There was a problem hiding this comment.
The schema declares a boolean, but the example uses "False" as a string in reana-server and the generated reana-commons snapshot. Use JSON false and regenerate the specification.
|
Maybe we can add more engine specific tests as well? |
Extend `validate_workflow` to check that all images used in workflows are authorized. Such authorized images are specified via the `vetted_container_images` Helm value. Also add details of container vetting to the `info` API endpoint.
c5e1c16 to
7afd768
Compare
Extend `validate_workflow` to check that all images used in workflows are authorized. Such authorized images are specified via the `vetted_container_images` Helm value. Also add details of container vetting to the `info` API endpoint.
7afd768 to
4e8a17a
Compare
|
Some tests are failing due to dependency on new reana-commons version: reanahub/reana-commons#497 All tests pass locally:
|
Extend `validate_workflow` to check that all images used in workflows are authorized. Such authorized images are specified via the `vetted_container_images` Helm value. Also add details of container vetting to the `info` API endpoint.
4e8a17a to
0f6e1e5
Compare
Extend `validate_workflow` to check that all images used in workflows are authorized. Such authorized images are specified via the `vetted_container_images` Helm value. Also add details of container vetting to the `info` API endpoint.
0f6e1e5 to
d302ec0
Compare
| ] | ||
| ], | ||
| ), | ||
| vetted_container_images_enabled=dict( |
There was a problem hiding this comment.
🤖 reana-client-go info drops vetted image settings (MEDIUM)
The Python client seems fine here, but reana-client-go info will not show these new settings yet. Its generated InfoOKBody model does not include vetted_container_images_enabled or vetted_container_images_allowlist, and the human-readable output in cmd/info.go is hardcoded field by field. As a result, both plain output and likely --json output discard these fields even though the server OpenAPI now exposes them.
Could we please update reana-client-go as part of this change as well? The fix should be to regenerate the Go client from the updated reana-server/docs/openapi.json, add the two fields to cmd/info.go, and update the info command fixtures/tests.
There was a problem hiding this comment.
Fixed in a new reana-client-go PR: reanahub/reana-client-go#182
9953490 to
6053d6d
Compare
Extend `validate_workflow` to check that all images used in workflows are authorized. Such authorized images are specified via the `vetted_container_images` Helm value. Also add details of container vetting to the `info` API endpoint.
6053d6d to
1ad65b8
Compare
Extend `validate_workflow` to check that all images used in workflows are authorized. Such authorized images are specified via the `vetted_container_images` Helm value. Also add details of container vetting to the `info` API endpoint. Co-authored-by: Tibor Šimko <tibor.simko@cern.ch>
tiborsimko
left a comment
There was a problem hiding this comment.
🤖 GitLab webhook path publishes workflows without image vetting (HIGH)
reana_server/rest/workflows.py:540-613 short-circuits to publish_workflow_submission when git_data is present, bypassing the start endpoint where validate_workflow() → validate_images() normally runs. With vetting enabled, an unvetted image referenced in a GitLab-tracked reana.yaml will still be queued and executed.
Adding validate_images() only at create-time isn't enough on its own: GitLab reana.yaml files commonly use workflow.file:, so the initial yaml.safe_load() carries no workflow.specification and the early check no-ops. The specification is materialised later by _load_and_save_yadage_spec / load_reana_spec at lines 600-610. The vetting check therefore has to run after that load.
Suggested fix — add the import and two validate_images() calls:
from reana_server.validation import (
validate_images,
validate_inputs,
validate_workflow,
validate_workspace_path,
validate_dask_limits,
) validate_inputs(reana_spec_file)
validate_images(reana_spec_file) # catches inline-spec workflows
validate_dask_limits(reana_spec_file) elif workflow.type_ in ["cwl", "snakemake"]:
reana_yaml_path = os.path.join(workflow.workspace_path, "reana.yaml")
workflow.reana_specification = load_reana_spec(
reana_yaml_path, workflow.workspace_path
)
Session.commit()
validate_images(workflow.reana_specification) # closes the GitLab bypass
parameters = request.json
publish_workflow_submission(workflow, user.id_, parameters)Worth adding a tests/test_views.py case that posts a git_data create with a disallowed image and asserts the webhook returns 400 before publish_workflow_submission is reached.
I've added a quick fix to the PR and squashed. |
1ad65b8 to
32d7491
Compare

Extend
validate_workflowto check that all images used in workflows are authorized. Such authorized images are specified via thevetted_container_imagesHelm value. Also add details of container vetting to theinfoAPI endpoint.Closes #728