MGMT-24699: Assisted Service IPv6 CIDR comparison may fail with non-normalized CIDRs#10569

Open
shay23bra wants to merge 1 commit into
openshift:masterfrom
shay23bra:MGMT-24699-Assisted-Service-IPv6-CIDR-comparison-may-fail-with-non-normalized-CIDRs

Conversation

@shay23bra

@shay23bra shay23bra commented Jul 1, 2026

Contributor

Summary

  • Normalize IPv6 CIDRs to canonical form at controller CIDR ingestion points and in network comparison functions
  • PostgreSQL's inet column auto-normalizes IPv6 CIDRs (e.g. fcff:0069:0001::/64 → fcff:69:1::/64), but Go string comparison treated them as different, causing infinite reconciliation loops
  • This is the same class of bug fixed for VIPs in MGMT-24393 (PR #10340, "MGMT-24393: Assisted Service IPv6 checks fails comparative") but affecting CIDR comparisons for cluster, service, and machine networks

Changes

  • Added cidrsEqual() helper in internal/network/utils.go (same pattern as existing ipsEqual())
  • Updated AreMachineNetworksIdentical / AreServiceNetworksIdentical / AreClusterNetworksIdentical to use CIDR-aware comparison
  • Normalized CIDRs on ingestion in controller (common.go): clusterNetworksEntriesToArray, serviceNetworksEntriesToArray, machineNetworksEntriesToArray
  • Added unit tests covering non-canonical, mixed-case, and different IPv6 CIDR scenarios
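
The mismatch described in the summary can be reproduced with the standard library alone. The sketch below is illustrative only: `canonicalCIDR` is a hypothetical helper, not the PR's `network.NormalizeCIDR`, and it assumes that masking host bits (which `net.ParseCIDR` does) is acceptable for these network fields.

```go
package main

import (
	"fmt"
	"net"
)

// canonicalCIDR is a hypothetical helper (not the PR's code). It parses a CIDR
// and re-serializes the network in Go's canonical text form.
func canonicalCIDR(cidr string) (string, error) {
	_, ipNet, err := net.ParseCIDR(cidr)
	if err != nil {
		return "", err
	}
	return ipNet.String(), nil
}

func main() {
	fromSpec := "fcff:0069:0001::/64" // as a user might write it
	fromDB := "fcff:69:1::/64"        // as the database returns it

	// Raw string comparison sees two different values.
	fmt.Println("raw strings equal:", fromSpec == fromDB)

	// Comparing canonical forms removes the spurious difference.
	a, _ := canonicalCIDR(fromSpec)
	b, _ := canonicalCIDR(fromDB)
	fmt.Println("canonical forms equal:", a == b)
}
```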

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested). Same root cause as MGMT-24393 — PostgreSQL normalizes IPv6 CIDRs on storage regardless of input form. The fix normalizes CIDRs at controller ingestion and uses CIDR-aware comparison, preventing the string mismatch that caused continuous reconciliation.
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

Summary by CodeRabbit

  • Bug Fixes
    • Network CIDR values are now handled more consistently, including IPv6 formats written in different but equivalent forms.
    • Network comparison checks now treat canonical and non-canonical CIDRs as the same, reducing false differences during validation.
    • Cluster, service, and machine network entries are now normalized before use when possible, improving matching and comparison behavior.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 1, 2026
@openshift-ci-robot

openshift-ci-robot commented Jul 1, 2026

@shay23bra: This pull request references MGMT-24699 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Walkthrough

CIDR values are now normalized before use in Hive network entry-to-model conversions (cluster, service, machine networks) and a new cidrsEqual helper canonicalizes CIDRs for equality checks in cluster, service, and machine network identity comparisons. Tests add IPv6 canonical-vs-non-canonical equivalence cases.

Changes

CIDR Normalization

| Layer / File(s) | Summary |
| --- | --- |
| Normalize CIDR in Hive network conversions (internal/controller/controllers/common.go) | clusterNetworksEntriesToArray, serviceNetworksEntriesToArray, and machineNetworksEntriesToArray normalize each entry's CIDR via network.NormalizeCIDR, using the normalized value on success and falling back to the original CIDR otherwise, before building the corresponding models.*Network struct. |
| cidrsEqual helper and network identity comparisons (internal/network/utils.go) | New cidrsEqual(a, b models.Subnet) bool normalizes both CIDRs and compares canonical strings, falling back to raw string equality on normalization failure. AreMachineNetworksIdentical, AreServiceNetworksIdentical, and AreClusterNetworksIdentical use cidrsEqual for CIDR comparison in both dual-stack and non-dual-stack paths, with HostPrefix equality retained for cluster networks. |
| IPv6 canonicalization test coverage (internal/network/utils_test.go) | New test cases assert that non-canonical and canonical IPv6 CIDR strings are treated as identical across the three "Identical" comparison functions. |
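
A comparison helper of the kind summarized above could look roughly like the following. This is a sketch under stated assumptions, not the code in internal/network/utils.go: `Subnet` is a local stand-in for `models.Subnet`, `canonical` is a hypothetical replacement for `network.NormalizeCIDR`, and `net/netip` is used here only for illustration.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Subnet is a local stand-in for models.Subnet (assumption for this sketch).
type Subnet string

// canonical is a hypothetical normalizer. It parses the prefix, masks off any
// host bits, and returns the canonical text form.
func canonical(s Subnet) (string, error) {
	p, err := netip.ParsePrefix(string(s))
	if err != nil {
		return "", err
	}
	return p.Masked().String(), nil
}

// cidrsEqual compares canonical forms and falls back to raw string equality
// when either side cannot be parsed.
func cidrsEqual(a, b Subnet) bool {
	ca, errA := canonical(a)
	cb, errB := canonical(b)
	if errA != nil || errB != nil {
		return a == b
	}
	return ca == cb
}

func main() {
	fmt.Println(cidrsEqual("fcff:0069:0001::/64", "fcff:69:1::/64")) // equivalent forms
	fmt.Println(cidrsEqual("fcff:69:1::/64", "fcff:69:2::/64"))      // different networks
	fmt.Println(cidrsEqual("not-a-cidr", "not-a-cidr"))              // fallback path
}
```

The fallback keeps the helper total: unparsable input never panics and still compares equal to itself, which matches the "raw string equality on normalization failure" behavior described in the walkthrough.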

Estimated code review effort: 2 (Simple) | ~10 minutes

Sequence Diagram(s)

Not applicable.

Related Issues: Not available from provided context.

Related PRs: Not available from provided context.

Suggested labels: bug, network

Suggested reviewers: Not available from provided context.

🐰 In burrows of code where IPs roam free,
I canonicalize each CIDR I see,
No more false alarms from a colon's disguise,
Now machine, service, cluster nets synchronize.

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (14 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly states the bug and the IPv6 CIDR comparison fix, matching the main change. |
| Description check | ✅ Passed | The description follows the template with summary, changes, issue type, impact, testing, and checklist filled in. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All changed Ginkgo titles are static strings; none include generated IDs, IPs, timestamps, or other run-specific values. |
| Test Structure And Quality | ✅ Passed | PASS: The added Ginkgo tests are narrow unit cases, use no cluster resources or waits, and follow the repo’s existing comparison-test patterns. |
| Microshift Test Compatibility | ✅ Passed | Only pure unit-test cases were added in internal/network/utils_test.go; no new e2e tests or MicroShift-unsupported OpenShift APIs/resources appear. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PASS: The PR only adds unit tests for CIDR normalization/comparison and controller/network logic; no new Ginkgo e2e tests or SNO/HA assumptions were added. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Touched controller/network helpers only normalize CIDRs and compare networks; no affinity, nodeSelector, replicas, PDBs, or topology checks were added. |
| Ote Binary Stdout Contract | ✅ Passed | Touched files only normalize/compare CIDRs and add tests; no main/init/TestMain/RunSpecs or stdout/logging writes appear in the changed files. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | Only unit-test cases in internal/network/utils_test.go were added; they compare CIDRs in-process and add no external connectivity or IPv6-hostile URL/IP handling. |
| No-Weak-Crypto | ✅ Passed | Touched code only normalizes and compares CIDRs; no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB, custom crypto, or secret/token comparisons found. |
| Container-Privileges | ✅ Passed | The PR only changes Go code/tests in internal/controller and internal/network; no K8s/container manifest fields for privileged, hostPID, hostNetwork, hostIPC, SYS_ADMIN, or allowPrivilegeEscalation... |
| No-Sensitive-Data-In-Logs | ✅ Passed | No new logging was added in the touched code; changes only normalize CIDRs and adjust comparisons/tests, with no sensitive data emitted. |
Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 1, 2026
@openshift-ci openshift-ci Bot requested review from gamli75 and omer-vishlitzky July 1, 2026 11:31
@openshift-ci

openshift-ci Bot commented Jul 1, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shay23bra

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 1, 2026
@shay23bra shay23bra force-pushed the MGMT-24699-Assisted-Service-IPv6-CIDR-comparison-may-fail-with-non-normalized-CIDRs branch 2 times, most recently from 545e24d to 34cfcc5 Compare July 1, 2026 11:34
PostgreSQL auto-normalizes IPv6 CIDRs on storage (e.g. fcff:0069:0001::/64
→ fcff:69:1::/64), but Go string comparison treated them as different,
causing infinite reconciliation loops in the controller. This is the same
class of bug fixed for VIPs in MGMT-24393 but affecting CIDR comparisons.
@shay23bra shay23bra force-pushed the MGMT-24699-Assisted-Service-IPv6-CIDR-comparison-may-fail-with-non-normalized-CIDRs branch from 34cfcc5 to e31c868 Compare July 1, 2026 11:40
@openshift-ci openshift-ci Bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 1, 2026

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
internal/controller/controllers/common.go (1)

324-364: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Extract duplicated normalize-with-fallback logic.

The identical cidr := entry...; if normalized, err := network.NormalizeCIDR(cidr); err == nil { cidr = normalized } pattern is repeated verbatim in clusterNetworksEntriesToArray, serviceNetworksEntriesToArray, and machineNetworksEntriesToArray. Extract a small helper, e.g. normalizeCIDROrRaw(cidr string) string, and call it from all three sites.

♻️ Proposed refactor
+func normalizeCIDROrRaw(cidr string) string {
+	if normalized, err := network.NormalizeCIDR(cidr); err == nil {
+		return normalized
+	}
+	return cidr
+}
+
 func clusterNetworksEntriesToArray(entries []hiveext.ClusterNetworkEntry) []*models.ClusterNetwork {
 	return funk.Map(entries, func(entry hiveext.ClusterNetworkEntry) *models.ClusterNetwork {
-		cidr := entry.CIDR
-		if normalized, err := network.NormalizeCIDR(cidr); err == nil {
-			cidr = normalized
-		}
-		return &models.ClusterNetwork{Cidr: models.Subnet(cidr), HostPrefix: int64(entry.HostPrefix)}
+		return &models.ClusterNetwork{Cidr: models.Subnet(normalizeCIDROrRaw(entry.CIDR)), HostPrefix: int64(entry.HostPrefix)}
 	}).([]*models.ClusterNetwork)
 }

Apply the same pattern to serviceNetworksEntriesToArray and machineNetworksEntriesToArray.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/controllers/common.go` around lines 324 - 364, Extract
the repeated CIDR normalize-with-fallback logic into a shared helper, such as
normalizeCIDROrRaw, and use it from clusterNetworksEntriesToArray,
serviceNetworksEntriesToArray, and machineNetworksEntriesToArray. Keep the
helper in the same controller/common area so the three conversion functions can
call it directly, and preserve the existing behavior of returning the original
CIDR when NormalizeCIDR fails.
internal/network/utils_test.go (1)

473-482: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Consider adding a negative IPv6 test case.

Each new test only checks that non-canonical vs canonical forms of the same IPv6 CIDR are treated as identical. Adding a companion case with two genuinely different IPv6 CIDRs (expecting false) would guard against a regression where cidrsEqual silently treats all IPv6 inputs as equal (e.g., if normalization errors were mishandled).

Also applies to: 580-589, 715-724
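
The suggested negative case can be sketched as a small table of inputs. This is a standalone illustration rather than the repository's Ginkgo tests: `sameNetwork` is a hypothetical stand-in for `cidrsEqual`, and the case names and CIDR values are made up for the example. Only the `expectedResult` field name is taken from the review text.

```go
package main

import (
	"fmt"
	"net"
)

// cidrCase mirrors the shape of a table-driven test entry.
type cidrCase struct {
	name           string
	a, b           string
	expectedResult bool
}

// sameNetwork is a hypothetical stand-in for cidrsEqual. It compares canonical
// forms and falls back to raw equality if either input fails to parse.
func sameNetwork(a, b string) bool {
	_, na, errA := net.ParseCIDR(a)
	_, nb, errB := net.ParseCIDR(b)
	if errA != nil || errB != nil {
		return a == b
	}
	return na.String() == nb.String()
}

// ipv6Cases pairs the positive canonicalization case with negative cases that
// would catch a helper that treats all IPv6 inputs as equal.
func ipv6Cases() []cidrCase {
	return []cidrCase{
		{"IPv6 non-canonical vs canonical", "fcff:0069:0001::/64", "fcff:69:1::/64", true},
		{"IPv6 different networks", "fcff:69:1::/64", "fcff:69:2::/64", false},
		{"IPv6 same address, different prefix length", "fcff:69:1::/64", "fcff:69:1::/48", false},
	}
}

func main() {
	for _, c := range ipv6Cases() {
		status := "PASS"
		if sameNetwork(c.a, c.b) != c.expectedResult {
			status = "FAIL"
		}
		fmt.Printf("%s: %s\n", status, c.name)
	}
}
```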

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/network/utils_test.go` around lines 473 - 482, Add a negative IPv6
comparison test alongside the existing canonicalization cases in utils_test.go
to verify cidrsEqual returns false for truly different IPv6 CIDRs, not just
equivalent non-canonical/canonical forms. Update the relevant test table entries
in the IPv6 test groups near the existing “IPv6 non-canonical vs canonical”
cases by adding a pair of distinct CIDRs with expectedResult set to false, so
the behavior of cidrsEqual is covered against over-broad normalization.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4874cf4e-fe25-45c3-a979-f0a49bd1cfc5

📥 Commits

Reviewing files that changed from the base of the PR and between 56af6d1 and e31c868.

📒 Files selected for processing (3)
  • internal/controller/controllers/common.go
  • internal/network/utils.go
  • internal/network/utils_test.go

 	return funk.Map(entries, func(entry hiveext.ClusterNetworkEntry) *models.ClusterNetwork {
-		return &models.ClusterNetwork{Cidr: models.Subnet(entry.CIDR), HostPrefix: int64(entry.HostPrefix)}
+		cidr := entry.CIDR
+		if normalized, err := network.NormalizeCIDR(cidr); err == nil {

Contributor

Would it be easy to normalize when loading the data instead?

Contributor Author

The data is normalized when it is written to the DB; the problem occurs before it is written. The goal is to normalize it everywhere it is used so that all values align, similar to PR #10340.

Contributor

Is it because a user might enter either the canonical or a non-canonical form, and the two won't compare as equal?

Contributor Author

Yes, exactly.

@rccrdpccl rccrdpccl Jul 2, 2026

Contributor

Can we then transform them into canonical form as soon as we read them, instead of having normalize functions all over (i.e. when they get loaded into the model)?
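
One way to read this suggestion is to canonicalize at the type level, so every value is normalized as it is decoded. The following is only a sketch of that idea under assumptions: `CanonicalSubnet`, `clusterNetwork`, and `decode` are hypothetical names, the JSON field names are invented for the example, and the real models in the repository are generated code that may not allow this approach.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// CanonicalSubnet is a hypothetical string type that normalizes itself when it
// is decoded, so callers never see a non-canonical CIDR.
type CanonicalSubnet string

// UnmarshalText implements encoding.TextUnmarshaler. Unparsable input is kept
// as-is, mirroring the fall-back-to-original behavior described in the PR.
func (s *CanonicalSubnet) UnmarshalText(text []byte) error {
	_, ipNet, err := net.ParseCIDR(string(text))
	if err != nil {
		*s = CanonicalSubnet(text)
		return nil
	}
	*s = CanonicalSubnet(ipNet.String())
	return nil
}

// clusterNetwork is an illustrative struct, not the real models.ClusterNetwork.
type clusterNetwork struct {
	Cidr       CanonicalSubnet `json:"cidr"`
	HostPrefix int64           `json:"host_prefix"`
}

// decode parses one JSON document into a clusterNetwork.
func decode(raw string) (clusterNetwork, error) {
	var n clusterNetwork
	err := json.Unmarshal([]byte(raw), &n)
	return n, err
}

func main() {
	n, err := decode(`{"cidr":"FCFF:0069:0001::/64","host_prefix":64}`)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(n.Cidr, n.HostPrefix)
}
```

With this shape, normalization happens once at the boundary, and later comparisons can stay as plain string equality. The trade-off is that any path that constructs the value without decoding (for example, a direct type conversion) bypasses the normalization.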

@codecov

codecov Bot commented Jul 1, 2026

Codecov Report

❌ Patch coverage is 91.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.39%. Comparing base (56af6d1) to head (e31c868).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/network/utils.go | 83.33% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files

@@           Coverage Diff           @@
##           master   #10569   +/-   ##
=======================================
  Coverage   44.39%   44.39%           
=======================================
  Files         423      423           
  Lines       73543    73555   +12     
=======================================
+ Hits        32649    32657    +8     
- Misses      37966    37968    +2     
- Partials     2928     2930    +2     
| Files with missing lines | Coverage Δ |
| --- | --- |
| internal/controller/controllers/common.go | 76.97% <100.00%> (+0.48%) ⬆️ |
| internal/network/utils.go | 58.18% <83.33%> (+0.11%) ⬆️ |

... and 3 files with indirect coverage changes

@openshift-ci

openshift-ci Bot commented Jul 1, 2026

@shay23bra: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/edge-e2e-ai-operator-disconnected-capi | e31c868 | link | true | /test edge-e2e-ai-operator-disconnected-capi |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

3 participants