[loadbalancingexporter] Logs - duplicate traffic when using routing_key: attributes #49058

@nickinkaty

Component(s)

exporter/loadbalancing

What happened?

Description

Duplicate log records are observed after recovering from a network outage while using persistent storage (file_storage) with the loadbalancingexporter. The setup consists of Agent Collectors sending logs to Gateway Collectors, which use the loadbalancingexporter to forward traffic to backend collectors.

During testing, a network outage was simulated using a Kubernetes NetworkPolicy. After connectivity was restored, the Agent Collector appeared to resend previously sent logs, resulting in duplicate records.

Current gateway configuration:

routing_key: attributes
routing_attributes:
  - service.name
  - k8s.pod.name
  - log.body
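
To illustrate what this routing configuration implies, here is a minimal sketch of attribute-based routing: the values of the routing attributes are joined into a key, and the key is hashed to pick a backend. The backend names, the function names, and the plain modulo hashing are assumptions for illustration only; the exporter itself uses a consistent-hash ring, which this sketch does not reproduce. The point is only that a resent record carries identical attribute values and therefore maps to the same backend each time.

```python
import hashlib

# Hypothetical backend names, for illustration only.
BACKENDS = ["backend-0", "backend-1", "backend-2"]

# Same attribute list as in the configuration above.
ROUTING_ATTRIBUTES = ["service.name", "k8s.pod.name", "log.body"]


def routing_key(record: dict) -> str:
    """Join the routing attribute values into one key (missing values become '')."""
    return "|".join(str(record.get(name, "")) for name in ROUTING_ATTRIBUTES)


def pick_backend(record: dict, backends=BACKENDS) -> str:
    """Hash the routing key and map it onto the backend list (plain modulo, not a hash ring)."""
    digest = hashlib.sha256(routing_key(record).encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


record = {
    "service.name": "opentelemetry-collector-contrib",
    "k8s.pod.name": "otel-gateway-sts-collector-1",
    "log.body": "counter=12494",
}

# A resent copy of the same record resolves to the same backend as the original.
first = pick_backend(record)
resent = pick_backend(dict(record))
```

Because log.body is part of the key, every distinct log line is routed independently, while exact copies of a line always share a destination.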

Steps to Reproduce

1. Deploy Agent Collectors that send logs to Gateway Collectors.
2. Configure the Gateway Collectors with the loadbalancingexporter.
3. Generate log traffic.
4. Simulate a network outage using a Kubernetes NetworkPolicy.
5. Restore connectivity.
6. Observe duplicate logs at the backend.

Example duplicate log:

Date,Host,Service,Pod Name,Content
"2026-06-14T17:50:06.474Z","[REDACTED_HOST]","opentelemetry-collector-contrib","otel-gateway-sts-collector-1","Sun Jun 14 17:43:41 UTC 2026 counter=12494 node=[REDACTED_NODE] logtag=F log.file.path=[REDACTED_PATH] time=2026-06-14T17:43:41.358885254Z log.iostream=stdout gw.timestamp=2026-06-14T17:43:41Z"
"2026-06-14T17:49:55.692Z","[REDACTED_HOST]","opentelemetry-collector-contrib","otel-gateway-sts-collector-1","Sun Jun 14 17:43:41 UTC 2026 counter=12494 node=[REDACTED_NODE] logtag=F log.file.path=[REDACTED_PATH] time=2026-06-14T17:43:41.358885254Z log.iostream=stdout gw.timestamp=2026-06-14T17:43:41Z"
"2026-06-14T17:49:50.560Z","[REDACTED_HOST]","opentelemetry-collector-contrib","otel-gateway-sts-collector-1","Sun Jun 14 17:43:41 UTC 2026 counter=12494 node=[REDACTED_NODE] logtag=F log.file.path=[REDACTED_PATH] time=2026-06-14T17:43:41.358885254Z log.iostream=stdout gw.timestamp=2026-06-14T17:43:41Z"
"2026-06-14T17:49:45.446Z","[REDACTED_HOST]","opentelemetry-collector-contrib","otel-gateway-sts-collector-1","Sun Jun 14 17:43:41 UTC 2026 counter=12494 node=[REDACTED_NODE] logtag=F log.file.path=[REDACTED_PATH] time=2026-06-14T17:43:41.358885254Z log.iostream=stdout gw.timestamp=2026-06-14T17:43:41Z"

The same log record (counter=12494) is received four times, with roughly 5 to 11 seconds between arrivals.
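
For anyone checking their own backend export, a minimal sketch of how duplicates and their arrival gaps could be measured is shown here. It assumes a CSV export with the same Date and Content columns as the sample; the function names are hypothetical, and the sample data abbreviates the Host, Service, Pod Name, and Content columns while keeping the original timestamps.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-06-14T17:50:06.474Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


def duplicate_gaps(csv_text: str) -> dict:
    """Map each Content value seen more than once to the gaps (seconds) between arrivals."""
    arrivals = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        arrivals[row["Content"]].append(parse_ts(row["Date"]))

    gaps = {}
    for content, times in arrivals.items():
        if len(times) > 1:
            times.sort()
            gaps[content] = [
                round((later - earlier).total_seconds(), 3)
                for earlier, later in zip(times, times[1:])
            ]
    return gaps


# Abbreviated sample: timestamps are taken from the export, other columns are shortened.
SAMPLE = """Date,Host,Service,Pod Name,Content
"2026-06-14T17:50:06.474Z","h","svc","pod","counter=12494"
"2026-06-14T17:49:55.692Z","h","svc","pod","counter=12494"
"2026-06-14T17:49:50.560Z","h","svc","pod","counter=12494"
"2026-06-14T17:49:45.446Z","h","svc","pod","counter=12494"
"""

print(duplicate_gaps(SAMPLE))  # {'counter=12494': [5.114, 5.132, 10.782]}
```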

Expected Result

After connectivity is restored, buffered logs should be sent once and routed consistently to the same backend collector.

Actual Result

After the network outage ends, the Agent Collector resends previously sent logs. The duplicate logs are routed to the same backend collector, indicating that routing is working correctly, but duplicate data is being retransmitted.
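
One way such retransmissions can arise in general is at-least-once delivery: if a record reaches the receiver but the acknowledgement does not get back to the sender in time, the sender retries and the receiver stores the record again. The toy simulation below only illustrates that general mechanism. It is not a diagnosis of this issue, and the class and function names are hypothetical rather than collector code.

```python
class Backend:
    """Receiver that stores every record it is handed; it does not deduplicate."""

    def __init__(self):
        self.stored = []

    def receive(self, record):
        self.stored.append(record)


def send_with_retry(record, backend, lost_acks, max_attempts=10):
    """Resend until an acknowledgement arrives; the first `lost_acks` acks are dropped.

    Returns the number of attempts made.
    """
    for attempt in range(1, max_attempts + 1):
        backend.receive(record)   # the record reaches the backend on every attempt
        if attempt > lost_acks:   # the sender stops only once an ack makes it back
            return attempt
    return max_attempts


backend = Backend()
attempts = send_with_retry("counter=12494", backend, lost_acks=3)
print(attempts, len(backend.stored))  # 4 4
```

With three lost acknowledgements the backend ends up holding four copies of the same record, which is the shape of the sample export, although the actual cause here may be different.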

Collector version

0.154.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

exporters:
  load_balancing/gateway:
    protocol:
      otlp:
        sending_queue:
          enabled: false
        timeout: 3s
        tls:
          insecure: true
    resolver:
      k8s:
        ports:
          - 4317
        service: otel-gateway-sts-collector-headless.otel-system
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_elapsed_time: 0
      max_interval: 30s
    routing_attributes:
      - service.name
      - k8s.pod.name
      - log.body
    routing_key: attributes
    sending_queue:
      batch:
        flush_timeout: 3s
        max_size: 2500
        min_size: 1000
      num_consumers: 35
      queue_size: 4000000
      sizer: items
      storage: file_storage/telemetry-retention

service:
  extensions:
    - file_storage/fingerprint
    - file_storage/telemetry-retention
  pipelines:
    logs/pod:
      exporters:
        - load_balancing/gateway
      processors:
        - memory_limiter
        - resource/default
        - resource/podlogs
        - resource/lb
        - k8sattributes
        - transform/redact
        - redaction/keys
      receivers:
        - filelog/podLogs
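
To make the retry_on_failure settings above easier to reason about, the sketch below computes the nominal wait before each retry for initial_interval: 5s and max_interval: 30s. It assumes an exponential backoff multiplier of 1.5 (believed to be the collector default, but not stated in this issue) and ignores the random jitter that the collector applies, so real intervals will differ. The function name is hypothetical.

```python
def nominal_backoff(initial=5.0, max_interval=30.0, multiplier=1.5, attempts=8):
    """Return the nominal wait before each retry, in seconds, without jitter."""
    waits = []
    interval = initial
    for _ in range(attempts):
        # Each wait is capped at max_interval; the uncapped interval keeps growing.
        waits.append(min(interval, max_interval))
        interval *= multiplier
    return waits


print(nominal_backoff())
# [5.0, 7.5, 11.25, 16.875, 25.3125, 30.0, 30.0, 30.0]
```

With max_elapsed_time: 0 the exporter is understood to keep retrying indefinitely, so under these assumptions a batch that keeps failing is retried about every 30 seconds once the cap is reached.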

Log output

Additional context

No response
