MQTTAsync: Idle subscriber disconnected with "PINGRESP not received" when sharing process with busy QoS 2 client #1680

Description

@jtstash-amzn

Describe the bug

When two MQTTAsync clients run in the same process, one busy (publishing and receiving QoS 2 at ~10 msgs/sec) and one idle (subscribe-only, no application messages), the idle client disconnects with the log message:

PINGRESP not received in keepalive interval for client X on socket N, disconnecting

Tcpdump confirms the broker sends PINGRESP, but the client still disconnects.

To Reproduce

  1. Create two MQTTAsync clients in one process.
  2. Client A: connect to a broker, subscribe, use QoS 2, publish/receive ~10 msg/sec.
  3. Client B: connect to a different broker, subscribe, QoS 0, remain idle (no application messages).
  4. Set Client B keepAliveInterval = 20s.
  5. Observe Client B disconnecting at T+21s every time.

Library version and platform details are listed in the Environment section below.

Expected behavior

The idle subscriber (Client B) should remain connected when the broker responds to PINGREQ with PINGRESP within the keepalive interval. The client should not disconnect if the broker's PINGRESP is successfully received at the TCP level.

Log files

Paho trace output shows: PINGRESP not received in keepalive interval for client X on socket 4, disconnecting

Tcpdump evidence (timestamps condensed):

18:04:25 — Client B TCP connect + MQTT CONNECT + SUBSCRIBE (success)
18:04:46 — Client B sends PINGREQ (2 bytes, T+21s)
18:04:46 — Broker responds PINGRESP (2 bytes) — TCP ACKed
18:04:46 — Client B sends RST (3ms after PINGRESP arrival)

The tcpdump shows the broker's PINGRESP arrived and was ACKed at the TCP layer, but the Paho client still generated a reset/disconnect.
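
For anyone checking a similar capture, the 2-byte packets in the timeline can be identified from their raw payload. Per the MQTT specification, the packet type is the high nibble of the first byte, so PINGREQ is 0xC0 0x00 and PINGRESP is 0xD0 0x00. The helper names below are hypothetical and only illustrate the check; they are not part of Paho.

```c
#include <stddef.h>
#include <stdint.h>

/* MQTT control packet types, carried in the high nibble of the first byte. */
enum { MQTT_PINGREQ = 12, MQTT_PINGRESP = 13 };

/* Hypothetical helper: returns the packet type (high nibble of byte 0)
 * of a captured MQTT payload, or -1 if the payload is empty. */
static int mqtt_packet_type(const uint8_t *payload, size_t len) {
    if (payload == NULL || len == 0)
        return -1;
    return payload[0] >> 4;
}

/* Hypothetical helper: true if the payload is exactly a PINGRESP (0xD0 0x00). */
static int is_pingresp(const uint8_t *payload, size_t len) {
    return len == 2 && payload[0] == 0xD0 && payload[1] == 0x00;
}
```

Passing the two payload bytes of the broker's 18:04:46 packet to is_pingresp() would confirm that it really is a PINGRESP and not some other 2-byte packet.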

Environment (please complete the following information):

  • Paho C version: 1.3.13
  • OS: Ubuntu 22.04 (Docker, --network host)
  • API: MQTTAsync (async)
  • Two mqtt::async_client instances (the Paho C++ wrapper around MQTTAsync) in one process

Additional context

Root Cause Analysis:

  • MQTTAsync_cycle() calls Socket_getReadySocket() which returns one fd per iteration. When the busy client's socket is consistently returned first, the idle client's socket is never read and PINGRESP bytes remain in the kernel buffer.
  • MQTTAsync_retry() invokes MQTTProtocol_keepalive() once per retryLoopInterval, where retryLoopInterval = (keepAlive * 1000) / 10 milliseconds (2000 ms for a 20s keepalive). If the idle client's socket hasn't been read within the keepalive interval, the keepalive logic sees a stale lastReceived timestamp and disconnects the client even though the broker's PINGRESP arrived at the TCP level.
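
The starvation described in the two points above can be illustrated with a small self-contained model. This is a deliberately simplified sketch, not Paho's code: the struct, its fields, and every sim_* function are hypothetical, and it assumes the worst case where the busy socket is ready on every loop iteration and the keepalive check looks only at the last-read timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified client model; these are not Paho's internal fields. */
typedef struct {
    bool pending;           /* unread bytes sit in the kernel receive buffer */
    long last_received_ms;  /* last time the library actually read this socket */
    bool connected;
} sim_client;

/* Read one client's socket: consume pending bytes and refresh the timestamp. */
static void sim_read(sim_client *c, long now_ms) {
    c->pending = false;
    c->last_received_ms = now_ms;
}

/* One receive-loop iteration. With drain_all == false only the first ready
 * client is serviced, which models the behavior described in the issue. */
static void sim_cycle(sim_client *clients, size_t n, long now_ms, bool drain_all) {
    for (size_t i = 0; i < n; i++) {
        if (clients[i].connected && clients[i].pending) {
            sim_read(&clients[i], now_ms);
            if (!drain_all)
                return;
        }
    }
}

/* Keepalive check that relies only on the last-read timestamp. */
static void sim_keepalive(sim_client *c, long now_ms, long keepalive_ms) {
    if (c->connected && now_ms - c->last_received_ms >= keepalive_ms)
        c->connected = false;
}

/* Returns true if the idle client (index 1) is still connected after
 * duration_ms. Client 0 is busy: it has new data on every 100 ms tick.
 * Client 1 is idle: one response reaches the kernel at response_at_ms,
 * which must be a multiple of 100. */
static bool sim_idle_survives(bool drain_all, long keepalive_ms,
                              long response_at_ms, long duration_ms) {
    assert(keepalive_ms > 0 && duration_ms > 0);
    sim_client clients[2] = {
        { false, 0, true },
        { false, 0, true },
    };
    for (long now = 100; now <= duration_ms; now += 100) {
        clients[0].pending = true;            /* busy socket is always ready */
        if (now == response_at_ms)
            clients[1].pending = true;        /* response arrives at TCP level */
        sim_cycle(clients, 2, now, drain_all);
        sim_keepalive(&clients[1], now, keepalive_ms);
    }
    return clients[1].connected;
}
```

In this model, sim_idle_survives(false, 20000, 15000, 25000) returns false because the idle client's bytes are never read, while the same call with drain_all set to true returns true. The model only demonstrates the mechanism; it does not reproduce the exact T+21s timing seen in the capture.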

Workaround:

  • Increasing keepAliveInterval to 60s (Paho's default) mitigates the issue because it gives the receive thread more time to process the idle client's socket before the timeout fires.

Suggested Fixes:

  • MQTTAsync_cycle() should drain all ready sockets per iteration (i.e., loop over the select()/poll() results) instead of processing only one ready socket.
  • Alternatively, MQTTProtocol_keepalive() should take into account whether recv() has been called on the client's socket since the last PINGREQ was sent (rather than relying solely on lastReceived).
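
To make the first suggestion concrete, here is a minimal sketch of a "drain all ready descriptors" pass using POSIX poll(). The function drain_ready() and its 16-descriptor limit are assumptions made for this example; it does not reflect how Socket_getReadySocket() or MQTTAsync_cycle() are actually structured internally.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not part of Paho: polls every descriptor once and
 * reads from each one that is readable, rather than stopping at the first.
 * Returns the number of descriptors that were read, 0 on timeout, or -1 on
 * error (including more than 16 descriptors, a limit chosen for brevity). */
static int drain_ready(const int *fds, size_t nfds, int timeout_ms) {
    struct pollfd pfds[16];
    if (nfds > 16)
        return -1;
    for (size_t i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    int ready = poll(pfds, (nfds_t)nfds, timeout_ms);
    if (ready <= 0)
        return ready;

    int serviced = 0;
    for (size_t i = 0; i < nfds; i++) {
        if (pfds[i].revents & POLLIN) {
            char buf[256];
            /* A real client would hand these bytes to its packet parser
             * and update that client's receive timestamp here. */
            if (read(pfds[i].fd, buf, sizeof buf) > 0)
                serviced++;
        }
    }
    return serviced;
}
```

With this shape, a socket that is ready on every pass cannot prevent the others from being read, because each pass visits every descriptor that poll() reported as readable.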

Related issues:
