Video Timestamp Misalignment Between WebRTC Playback and Smart Client

Hi Bo,
I would like to follow up on my previous questions, as this is a blocking issue for our development.
To summarize our core concern: if the same timestamp requested via WebRTC can result in different frames depending on the client, this implies that the timestamps delivered through WebRTC are not accurate representations of the actual recording time. We need to understand whether this is the case.

Specifically:

  1. Are the timestamps in the WebRTC stream guaranteed to correspond accurately to the original recording timestamps? Or is some degree of offset expected due to the API Gateway’s re-timestamping?
  2. If accurate timestamp-to-frame correspondence cannot be guaranteed via WebRTC, is there an alternative method to achieve this? For example, obtaining the original Recording Server timestamp, or any other mechanism that would allow us to reliably synchronize video frames with external data based on timestamps.

This is critical for our use case, as we overlay analysis data onto the video using timestamps. If the timestamps in the WebRTC stream are not reliable, we need to find a way to compensate or use a different approach.
We would appreciate your response.

WebRTC is based on RTP, which means timestamps represent relative time deltas, not absolute wall‑clock timestamps. The original Recording Server timestamps are converted into RTP units using a 90 kHz clock, which is standard RTP behavior.
At some point, the receiving side reconstructs absolute timestamps from the RTP stream. If this reconstruction is based on the current wall‑clock time, the resulting offset will naturally include network and buffering delays. This mainly affects live video.
For playback, the Recording Server returns the nearest available frame to the requested time, so the first frame may not match the requested timestamp exactly. That initial offset is reflected in the RTP timestamp, which allows you to approximate the original recording time using:
request_time + (rtp_timestamp / 90) ms ≈ original_frame_timestamp
This can give you a close approximation, but you will never get the exact original recording timestamp via WebRTC, as RTP does not carry absolute timestamps. Exact frame‑to‑timestamp correspondence cannot be guaranteed.
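As a rough illustration of the approximation above, here is a minimal JavaScript sketch. The helper name approximateFrameTime and the constant RTP_CLOCK_HZ are made up for this example, and it assumes that rtp_timestamp counts 90 kHz ticks from the start of the playback request, as the formula implies. The result is an estimate, not the exact recording timestamp.

```javascript
// Hypothetical helper (not part of any SDK): estimate the original recording
// time of a frame from the playback request time and the frame's RTP timestamp.
// Assumption: rtpTimestamp counts 90 kHz ticks since the start of the request.
const RTP_CLOCK_HZ = 90000;

function approximateFrameTime(requestTimeIso, rtpTimestamp) {
  const requestMs = Date.parse(requestTimeIso);
  if (Number.isNaN(requestMs)) {
    throw new RangeError(`Invalid request time: ${requestTimeIso}`);
  }
  // 90 ticks per millisecond, so ticks * 1000 / 90000 gives milliseconds.
  const offsetMs = Math.round((rtpTimestamp * 1000) / RTP_CLOCK_HZ);
  return new Date(requestMs + offsetMs);
}
```

For example, approximateFrameTime("2024-01-01T00:00:00.000Z", 90000) gives a Date one second after the request time. Nothing in the calculation reads the local clock, so the result does not depend on the machine running it.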
For what it’s worth, if precise timestamp‑to‑frame synchronization is a hard requirement, WebRTC may not be the best fit. An Image Server protocol integration would likely be a more appropriate approach.

Thank you for the detailed explanation. That clarifies the RTP timestamp behavior and the limitation of WebRTC for absolute timestamp recovery.

We have one remaining question that we would like to clarify.

Your explanation covers the general RTP/WebRTC behavior and the recommendation to use Image Server for precise synchronization. However, we are still unclear on the client-to-client discrepancy we reported earlier.

To recap: when we requested playback of the same camera at the same timestamp via WebRTC from three different PCs, the playback start position differed — with a lower-spec PC showing approximately a one-second offset compared to the other two.

Given your explanation that the Recording Server returns the nearest available frame to the requested time during playback, our expectation would be that the same request should return the same starting frame regardless of the client. The starting frame selection should be a server-side decision, not influenced by client performance.

Could you clarify whether this client-to-client discrepancy is expected behavior? If so, what causes it — is it related to the API Gateway’s processing, or something else?

Hello,

I read this thread and agree with your statement that the “expectation would be that the same request should return the same starting frame regardless of the client. The starting frame selection should be a server-side decision, not influenced by client performance.” I therefore can’t explain why you are seeing different behavior.

Just to make sure we’re on the same page: it is important that the client calculates the actual timestamp of each frame correctly, so that the corresponding metadata overlay matches. Here I will refer to the WebRTC_JavaScript sample in the milestonesys/mipsdk-samples-protocol repository on GitHub (at commit 6ffa1fafec50d0ff2e9dfda34290f26503d222a1), especially the section “How to calculate time for current frame”.

The sample code takes playbackTime, the time used for the start of the playback request, and converts it with
frameStartTime = Date.parse(playbackTime);
It then uses that value to calculate the actual timestamp of each frame, as we see in this line:
frameDate = new Date(frameStartTime + metadata.rtpTimestamp);

  • In this case, frameDate should be used to look up the corresponding metadata.

Basically, it is important that these calculations happen independently of any time or clock on the machine (box or laptop) executing the code.
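
To illustrate the lookup step, here is a minimal sketch of matching a computed frame time to overlay data. Everything in it is hypothetical: the function name findNearestRecord, the record shape with a timeMs field (epoch milliseconds), and the 100 ms default tolerance are assumptions for this example and are not taken from the sample. It assumes the records are sorted by time.

```javascript
// Hypothetical overlay lookup: return the analysis record closest in time to
// a frame, or null if none is within the tolerance.
// Assumption: `records` is sorted ascending by `timeMs` (epoch milliseconds).
function findNearestRecord(records, frameTimeMs, toleranceMs = 100) {
  if (records.length === 0) return null;
  let lo = 0;
  let hi = records.length - 1;
  // Binary search for the first record at or after the frame time
  // (ends on the last record if every record is earlier).
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (records[mid].timeMs < frameTimeMs) lo = mid + 1;
    else hi = mid;
  }
  // Compare that record with its predecessor and keep the closer one.
  let best = records[lo];
  if (lo > 0) {
    const prev = records[lo - 1];
    if (Math.abs(prev.timeMs - frameTimeMs) <= Math.abs(best.timeMs - frameTimeMs)) {
      best = prev;
    }
  }
  return Math.abs(best.timeMs - frameTimeMs) <= toleranceMs ? best : null;
}
```

With frameDate computed as in the sample, the call would look like findNearestRecord(records, frameDate.getTime()). The function only compares timestamps passed in as arguments and never reads the local clock.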