WebRTC Web Client Plan Progress

Goal

Implement browser-native WebRTC media for "web client mode" while keeping the existing frontend/backend control-plane behavior (bridge call sequencing, REST patterns, websocket message flow, and session state transitions) as close to the native client's behavior as possible.

Baseline Artifacts

  • Native 2-party references:
    • web/ai/native-client-2p-recording/20260228-210957-seth-native-2p-mute-volume.log
    • web/ai/native-client-2p-recording/20260228-211006-david-native-2p-mute-volume.log
  • High-signal baseline from these logs:
    • join-dedupe.create once and join-dedupe.release once per join.
    • one POST /api/sessions/:id/participants per client join.
    • repeated PUT /api/sessions/:id/tracks during mute/volume operations.
    • heavy jamClient polling/queries (getConnectionDetail, SessionGetAllControlState, SessionSetControlState, FTUEGetChannels, P2PMessageReceived) that inform adapter contract priorities.

Compatibility Contract (for web-client mode)

  • Preserve:
    • action-level call sequence (SessionActions -> SessionStore -> rest/websocket/adapter)
    • REST endpoint/method cadence for session join + track control
    • websocket gateway usage and event routing pattern
    • participant/session state behavior in frontend stores
  • Allow controlled divergence:
    • native-only bridge methods may return null/stub values in web mode, provided their callers are guarded by isWebClient/isNativeClient-aware logic.
    • media transport and signaling payload internals may differ for web-only peers.
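
The controlled-divergence rule for native-only bridge methods could look roughly like the sketch below. It is illustrative only: `createStubBridge`, `describeConnection`, the `env.isWebClient` boolean, and the `{ state }` return shape are hypothetical, and treating getConnectionDetail as a stubbed method is an assumption made for the example, not a decision recorded in this plan.

```javascript
// Build a bridge whose listed methods exist but return null (web-mode stubs).
// Hypothetical helper, not part of the existing codebase.
function createStubBridge(methodNames) {
  const bridge = {};
  for (const name of methodNames) {
    bridge[name] = () => null;
  }
  return bridge;
}

// Guarded caller: never dereferences the native-only result in web mode.
// `env.isWebClient` is assumed to be a plain boolean here.
function describeConnection(jamClient, env) {
  if (env.isWebClient) {
    return "web client: connection detail not available";
  }
  const detail = jamClient.getConnectionDetail();
  return detail ? `native client: ${detail.state}` : "native client: no detail yet";
}

const webBridge = createStubBridge(["getConnectionDetail"]);
const nativeBridge = { getConnectionDetail: () => ({ state: "connected" }) };

console.log(describeConnection(webBridge, { isWebClient: true }));
console.log(describeConnection(nativeBridge, { isWebClient: false }));
```

The point of the guard is that the stub's null never reaches code that expects a native-shaped object, which is what makes the divergence "controlled".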

Execution Plan

1) Contract Matrix (bridge + REST + websocket) [done]

  • Build a method matrix from recordings:
    • Tier A (must implement behavior): session join/leave, participant lifecycle, mute/volume/control-state methods, callback registration methods used in session runtime.
    • Tier B (must exist, can be stubbed): profile/device/vst/native diagnostics that do not affect core web-to-web media.
    • Tier C (deferred): legacy recording/VST branches being deprecated.
  • Output: web/ai/native-client-2p-recording/contract-matrix.md.
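
One way to put the tier matrix to work is to audit a client object against it. The sketch below assumes the matrix can be expressed as a method-to-tier map; the tier assignments shown are placeholders for illustration and do not reproduce contract-matrix.md, and `auditClient` is a hypothetical helper.

```javascript
// Illustrative tier assignments only; the real ones live in contract-matrix.md.
const contractMatrix = {
  SessionGetAllControlState: "A",
  SessionSetControlState: "A",
  P2PMessageReceived: "A",
  FTUEGetChannels: "B",
};

// Report Tier A/B methods the client does not expose as functions.
// Tier C entries are deferred, so they are skipped.
function auditClient(client, matrix) {
  const missing = [];
  for (const [method, tier] of Object.entries(matrix)) {
    if (tier === "C") continue;
    if (typeof client[method] !== "function") {
      missing.push({ method, tier });
    }
  }
  return missing;
}

const partialClient = {
  SessionGetAllControlState: () => [],
  SessionSetControlState: () => true,
  FTUEGetChannels: () => null, // Tier B: present but stubbed
};

const missing = auditClient(partialClient, contractMatrix);
console.log(missing);
```

A check like this only verifies that a method exists; whether a Tier A method actually behaves correctly still has to come from the parity runs.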

2) WebJamClient Core (control plane parity)

  • Expand WebJamClient to provide Tier A method behavior with deterministic return shapes.
  • Maintain adapter-level instrumentation in jamClientAdapter for parity diffing.
  • Keep method names and call sites unchanged wherever possible.
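
Adapter-level instrumentation for parity diffing can be approximated with a Proxy that records each bridge call before forwarding it. This is a minimal sketch under assumptions: `instrument` and the log record shape are hypothetical, and the actual jamClientAdapter instrumentation may work differently.

```javascript
// Wrap a client so every method call is appended to callLog, then forwarded.
// Method names and call sites stay unchanged for callers.
function instrument(client, callLog) {
  return new Proxy(client, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      if (typeof value !== "function") return value;
      return (...args) => {
        callLog.push({ method: String(prop), argCount: args.length });
        return value.apply(target, args);
      };
    },
  });
}

const callLog = [];
const client = instrument(
  {
    SessionGetAllControlState: () => [],
    SessionSetControlState: (state) => Boolean(state),
  },
  callLog
);

client.SessionGetAllControlState();
client.SessionSetControlState({ mute: true });
console.log(callLog.map((c) => c.method).join(" -> "));
```

Because the wrapper only observes calls, the same log format can be captured from native and web runs and compared afterwards.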

3) WebRTC Session Media Engine (data plane)

  • Add web-client media manager:
    • local mic capture lifecycle
    • peer connection lifecycle keyed by participant client_id
    • mute/volume mapping between UI track state and WebRTC tracks/gain nodes
    • VU meter signal path for local/remote tracks
  • Use existing gateway/P2P channel pattern for signaling envelopes in web mode.
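
The peer lifecycle and mute/volume mapping described above might be structured as follows. This is a sketch, not the planned implementation: `PeerRegistry` and `gainFor` are hypothetical names, the connection factory is injected so the code runs outside a browser (in a browser it would typically return a new RTCPeerConnection), and the 0 to 100 volume range is an assumption about the UI track state.

```javascript
// Registry of peer connections keyed by participant client_id.
class PeerRegistry {
  constructor(createConnection) {
    this.createConnection = createConnection; // injected factory (assumption)
    this.peers = new Map();
  }

  // Return the existing connection for clientId, creating it on first use.
  ensure(clientId) {
    if (!this.peers.has(clientId)) {
      this.peers.set(clientId, this.createConnection(clientId));
    }
    return this.peers.get(clientId);
  }

  // Close and forget the connection; returns false if none existed.
  remove(clientId) {
    const pc = this.peers.get(clientId);
    if (!pc) return false;
    pc.close();
    this.peers.delete(clientId);
    return true;
  }
}

// Map UI track state to a linear gain value for a gain node.
// Assumes volume is 0..100; mute always wins.
function gainFor({ mute, volume }) {
  if (mute || !Number.isFinite(volume)) return 0;
  return Math.min(1, Math.max(0, volume / 100));
}

const closed = [];
const registry = new PeerRegistry((clientId) => ({
  clientId,
  close: () => closed.push(clientId),
}));

const first = registry.ensure("client-a");
const again = registry.ensure("client-a");
registry.ensure("client-b");
registry.remove("client-a");

console.log(first === again, registry.peers.size, closed);
console.log(gainFor({ mute: false, volume: 50 }), gainFor({ mute: true, volume: 100 }));
```

Keying by client_id keeps offer/answer/candidate handling idempotent: a repeated signal for a known participant reuses the existing connection instead of creating a second one.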

4) Track-Control Parity

  • Ensure UI interactions (self mute, other mute, self volume, other volume) continue to drive:
    • same store/action flow
    • same PUT /api/sessions/:id/tracks pattern
    • same session refresh behavior (track_changes_counter driven updates)
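
The counter-driven refresh can be illustrated with a small gate that triggers a refresh only when track_changes_counter advances. The sketch assumes the counter is a number that only increases on the session payload; `createTrackRefreshGate` and the callback shape are hypothetical.

```javascript
// Returns a handler that invokes onRefresh only when the session's
// track_changes_counter is newer than the last one seen.
function createTrackRefreshGate(onRefresh) {
  let lastSeen = null;
  return function handleSessionPayload(session) {
    const counter = session.track_changes_counter;
    if (typeof counter !== "number") return false;
    if (lastSeen !== null && counter <= lastSeen) return false; // stale or duplicate
    lastSeen = counter;
    onRefresh(session);
    return true;
  };
}

const refreshed = [];
const gate = createTrackRefreshGate((session) => {
  refreshed.push(session.track_changes_counter);
});

const results = [3, 3, 4, 2].map((n) => gate({ track_changes_counter: n }));
console.log(results, refreshed);
```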

5) Automated Compatibility Assertions

  • Add a test harness that compares captured run artifacts against compatibility rules:
    • join count and participant POST cardinality
    • REST endpoint/method sequence constraints (tolerant matching where needed)
    • required bridge method call presence/order windows for core flows
    • required websocket event classes
  • Keep this as "pattern parity" (not strict payload byte-equality).
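
A pattern-parity check over a captured run could be sketched like this. The event record shape (`kind`, `method`, `path`, `name`) and the required bridge-call order are assumptions for illustration; the real artifacts would need to be normalized into whatever shape the harness settles on.

```javascript
const PARTICIPANTS_PATH = /^\/api\/sessions\/[^/]+\/participants$/;

// True when every item of `required` appears in `observed` in the same relative order.
function isSubsequence(required, observed) {
  let next = 0;
  for (const item of observed) {
    if (next < required.length && item === required[next]) next += 1;
  }
  return next === required.length;
}

// Returns a list of human-readable rule violations (empty means parity holds).
function checkJoinParity(events) {
  const problems = [];
  const count = (pred) => events.filter(pred).length;

  const posts = count(
    (e) => e.kind === "rest" && e.method === "POST" && PARTICIPANTS_PATH.test(e.path)
  );
  if (posts !== 1) problems.push(`expected 1 participant POST, saw ${posts}`);

  for (const name of ["join-dedupe.create", "join-dedupe.release"]) {
    const seen = count((e) => e.kind === "log" && e.name === name);
    if (seen !== 1) problems.push(`expected 1 ${name}, saw ${seen}`);
  }

  // Order window (assumed rule): state is read before it is written.
  const bridgeCalls = events.filter((e) => e.kind === "bridge").map((e) => e.name);
  const requiredOrder = ["SessionGetAllControlState", "SessionSetControlState"];
  if (!isSubsequence(requiredOrder, bridgeCalls)) {
    problems.push("bridge call order window not satisfied");
  }
  return problems;
}

const goodRun = [
  { kind: "log", name: "join-dedupe.create" },
  { kind: "rest", method: "POST", path: "/api/sessions/abc/participants" },
  { kind: "log", name: "join-dedupe.release" },
  { kind: "bridge", name: "SessionGetAllControlState" },
  { kind: "bridge", name: "FTUEGetChannels" },
  { kind: "bridge", name: "SessionSetControlState" },
  { kind: "rest", method: "PUT", path: "/api/sessions/abc/tracks" },
];
const badRun = goodRun.concat([
  { kind: "rest", method: "POST", path: "/api/sessions/abc/participants" },
]);

console.log(checkJoinParity(goodRun));
console.log(checkJoinParity(badRun));
```

Matching on counts and relative order, rather than on full payloads, is what keeps this a tolerant "pattern parity" check.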

6) End-to-End Validation Runs

  • Run 2-party web-client scenario (same script as native baseline):
    • self mute/unmute
    • other mute/unmute
    • self volume min/max
    • other volume min/max
  • Compare artifacts and close gaps before widening feature scope.

Status

  • [x] Baseline native 2-party logs captured and reviewed
  • [x] Contract matrix drafted (Tier A/B/C)
  • [~] WebJamClient Tier A behavior implemented (initial control-state + session/participant + p2p/webrtc signaling scaffolding in place)
  • [ ] WebRTC peer/media manager integrated
  • [ ] Track control parity verified against baseline patterns
  • [ ] Compatibility assertions added
  • [ ] 2-party web-client parity run completed

2026-03-01 Implementation Update

  • Implemented first Tier A WebJamClient pass in web/app/assets/javascripts/webJamClient.js:
    • Added control-state cache for SessionGetAllControlState(master|personal) seeded from delegate then locally updated.
    • Added SessionSetControlState local mutation from trackVolumeObject and broadcast-aware MixerActions.syncTracks() scheduling to preserve REST /tracks flow in web mode.
    • Added session/participant lifecycle state updates in:
      • UpdateSessionInfo
      • ParticipantJoined / ParticipantLeft
      • ClientJoinedSession / ClientLeftSession
    • Added initial WebRTC signaling scaffolding (feature-flagged):
      • peer connection registry
      • offer/answer/candidate handling over existing websocket-gateway P2P path (JK.JamServer.sendP2PMessage)
      • inbound signal handling in P2PMessageReceived
      • local mic stream bootstrap for web-client WebRTC mode.
  • This is intentionally incremental; next step is validating 2-party web-client flow and closing gaps in remote audio rendering/mix handling.
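
The "seeded from delegate, then locally updated" control-state cache might be shaped like the sketch below. It is not the code in webJamClient.js: `ControlStateCache`, the string mode argument, and the `{ id, mute, volume }` track fields are assumptions chosen to keep the example self-contained.

```javascript
// Cache of control state per mode ("master" | "personal").
// The first read for a mode seeds from the delegate; later writes stay local.
class ControlStateCache {
  constructor(delegate) {
    this.delegate = delegate;
    this.byMode = new Map();
  }

  getAll(mode) {
    if (!this.byMode.has(mode)) {
      const seed = this.delegate.SessionGetAllControlState(mode) || [];
      // Copy each entry so local mutation never touches the delegate's objects.
      this.byMode.set(mode, seed.map((track) => ({ ...track })));
    }
    return this.byMode.get(mode);
  }

  // Apply a partial update to one track; returns false if the track is unknown.
  set(mode, trackId, patch) {
    const track = this.getAll(mode).find((t) => t.id === trackId);
    if (!track) return false;
    Object.assign(track, patch);
    return true;
  }
}

let delegateCalls = 0;
const delegate = {
  SessionGetAllControlState() {
    delegateCalls += 1;
    return [{ id: "mic-1", mute: false, volume: 80 }];
  },
};

const cache = new ControlStateCache(delegate);
cache.set("personal", "mic-1", { mute: true });
console.log(cache.getAll("personal"), delegateCalls);
```

In the real client, a successful local mutation would be the natural place to schedule MixerActions.syncTracks() so the REST /tracks flow still fires.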