Image Tearing/Corruption when Grabbing Single Frame w/ Python App

Hi all,

I’m facing an extremely weird issue and I’m unsure if it’s related to AI Bridge or XProtect itself.

Here’s the situation:

- We’re working with numerous Bosch NDE-3502-AL cameras across a few recording servers connected to XProtect Corporate 2022 R3.

- All cameras are set to run at 1080p.

- AI Bridge v1.6 is being used on the same network and communicates just fine.

- On a remote machine, we’re running a custom Python (3.11) application that uses OpenCV (4.8).

- That Python app queries the AI Bridge for streams. The user then selects one of the streams, at which point the code grabs a frame from the relevant RTSP stream and displays it on screen.

- The app works perfectly with some of the cameras; however, there are other cameras where image corruption/tearing happens 90-95% of the time.

- We’ve tried changing firmware versions with no luck.

- If we change the problematic camera’s profile to use 720p, the corruption/tearing goes away. We find this part most bizarre.
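For context, the grab step described in the list above could look roughly like the following. This is a minimal sketch, not the application's actual code: the function name `grab_single_frame`, the `warmup_reads` parameter, and the `capture_factory` argument are hypothetical and exist only for illustration. It opens the stream, reads a few frames so the decoder has a chance to settle, and returns the last frame that was read successfully.

```python
def grab_single_frame(url, warmup_reads=5, capture_factory=None):
    """Open a stream, read a few frames, and return the last good one (or None).

    capture_factory is a hypothetical hook: any callable that takes the URL and
    returns an object with isOpened(), read() and release(), like cv2.VideoCapture.
    """
    if capture_factory is None:
        import cv2  # imported lazily so the helper can be exercised without OpenCV

        def capture_factory(u):
            return cv2.VideoCapture(u, cv2.CAP_FFMPEG)

    cap = capture_factory(url)
    try:
        if not cap.isOpened():
            return None
        frame = None
        # Read warmup_reads + 1 times and keep the most recent successful frame.
        for _ in range(warmup_reads + 1):
            ok, candidate = cap.read()
            if ok:
                frame = candidate
        return frame
    finally:
        # Always release the capture, even if reading raised an exception.
        cap.release()
```

Discarding the first few reads is only a guess at a reasonable default; it does not help if packets are being reordered or dropped upstream.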

We also ran tests where we pulled a problematic camera off of XProtect (temporarily) and used the camera’s own RTSP stream (i.e. XProtect and AI Bridge were not involved in producing or consuming the stream). Doing this, the corruption/tearing was not present.

The above seems to point at an issue somewhere with either XProtect or AI Bridge in how it’s ultimately presenting the streams for consumption.

I’ve tried googling around for issues concerning OpenCV and RTSP. I’ve even implemented the “frame grabbing” code to run in a separate thread from the main application.
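A threaded grabber of the kind mentioned above is commonly structured as a background reader that keeps only the newest frame. The sketch below is an illustration under assumptions, not the application's actual code: the class name `LatestFrameReader` is made up, and `capture` is assumed to behave like a `cv2.VideoCapture` (it needs `read()` and `release()`).

```python
import threading
import time


class LatestFrameReader:
    """Continuously read from a capture object and keep only the newest frame."""

    def __init__(self, capture):
        self._cap = capture
        self._lock = threading.Lock()
        self._frame = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _run(self):
        while not self._stop.is_set():
            ok, frame = self._cap.read()
            if not ok:
                # Back off briefly instead of spinning when no frame is available.
                time.sleep(0.01)
                continue
            with self._lock:
                self._frame = frame

    def latest(self):
        """Return the most recent frame, or None if nothing has been read yet."""
        with self._lock:
            return self._frame

    def stop(self):
        self._stop.set()
        self._thread.join(timeout=2.0)
        self._cap.release()
```

Keeping the reader draining the stream prevents the UI thread from blocking on `read()`, but it does not repair frames that arrive damaged.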

If it helps, here’s an example of the output we see when image tearing occurs:

[h264 @ 0x7f3310006480] top block unavailable for requested intra mode -1

[h264 @ 0x7f3310006480] error while decoding MB 90 0, bytestream 89025

[rtsp @ 0x7f3310001680] RTP: PT=60: bad cseq 1538 expected=1638

[rtsp @ 0x7f3310001680] RTP: PT=60: bad cseq 157f expected=167f

[rtsp @ 0x7f3310001680] RTP: PT=60: bad cseq 1588 expected=1688
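To make the `bad cseq` lines easier to read, a small parser can compute how far each received sequence number is from the expected one. This is a sketch under one assumption: that the two numbers are printed in hexadecimal, which values such as `157f` suggest. The names `CSEQ_RE` and `cseq_offsets` are made up for illustration.

```python
import re

# Matches lines such as: "RTP: PT=60: bad cseq 1538 expected=1638"
CSEQ_RE = re.compile(r"bad cseq ([0-9a-fA-F]+) expected=([0-9a-fA-F]+)")


def cseq_offsets(log_text):
    """Return received-minus-expected for each 'bad cseq' line, as a signed 16-bit value."""
    offsets = []
    for got_hex, expected_hex in CSEQ_RE.findall(log_text):
        diff = (int(got_hex, 16) - int(expected_hex, 16)) & 0xFFFF
        # RTP sequence numbers are 16 bits wide, so fold large values into negatives.
        if diff >= 0x8000:
            diff -= 0x10000
        offsets.append(diff)
    return offsets


sample = """
[rtsp @ 0x7f3310001680] RTP: PT=60: bad cseq 1538 expected=1638
[rtsp @ 0x7f3310001680] RTP: PT=60: bad cseq 157f expected=167f
[rtsp @ 0x7f3310001680] RTP: PT=60: bad cseq 1588 expected=1688
"""
print(cseq_offsets(sample))  # prints [-256, -256, -256]
```

Under that assumption, every packet in the sample arrives 256 sequence numbers behind what the decoder expected. That looks more like late or reordered delivery than a handful of isolated lost packets.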

If anyone has any ideas or suggestions concerning this issue, your help would be very much appreciated.

Thank you!

Hi,

We have heard of similar cases where this tearing occurs when RTSP runs over UDP, since UDP does not recover from packet loss on the network.

AI Bridge supports both UDP and TCP, and it is up to the client to decide how to connect.

If in doubt, the log file of the streaming container prints, for every connection, whether UDP or TCP is used.
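On the client side, an OpenCV application that uses the FFmpeg backend can request TCP through the `OPENCV_FFMPEG_CAPTURE_OPTIONS` environment variable, which has to be set before the capture is opened. The helper below is a minimal sketch; the function name `force_rtsp_over_tcp` and its `extra_options` parameter are hypothetical.

```python
import os


def force_rtsp_over_tcp(extra_options=None):
    """Set OPENCV_FFMPEG_CAPTURE_OPTIONS so FFmpeg requests RTSP over TCP.

    The variable takes 'key;value' pairs separated by '|'. It must be set
    before cv2.VideoCapture is created for the setting to take effect.
    """
    options = {"rtsp_transport": "tcp"}
    options.update(extra_options or {})
    value = "|".join(f"{key};{val}" for key, val in options.items())
    os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = value
    return value


# Usage: call this once at startup, then open the stream as usual, for example
# cv2.VideoCapture(url, cv2.CAP_FFMPEG).
force_rtsp_over_tcp()
```

The streaming container's log remains the authoritative check of which transport was actually negotiated.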

HTH!

Hi Hans,

Thanks for your response.

We’re actually using TCP (not to mention that the AI Bridge cluster is showing only TCP ports available in the streaming pod/container). We’ve verified TCP is being used through the logs as well as Wireshark.

We also checked the logs for the streaming container. We can’t see any errors or issues there.

In the Wireshark session, we noticed several “out of order” packets when the tearing occurred. The issue also seems to be somewhat correlated with the camera being on an overburdened recording server (though not all the time). That said, even when this is the case, the feed looks fine when viewed in the Smart Client. The problem only seems to appear when the stream is consumed via the AI Bridge.

If there’s any other info I can give you, please let me know and I’ll see about arranging it.

Thanks again!

Cheers,

Duncan

Hi,

Just to confirm your IVA is actually using TCP, you should see “TCP” as highlighted below for each stream connecting (these are log entries from the streaming container):

LogSampleTCP

How do you experience the Tearing/Corruption? Can you provide a sample?

Could there be a bottleneck somewhere in the video chain between the Recording Server and the image rendering in your application, potentially causing some frames to be dropped? Maybe somewhere in the chain there isn’t enough capacity (network bandwidth, buffer size or compute) to process 1080p, whereas 720p is OK.

If you’re only grabbing a single frame, could you use a snapshot instead in your particular use case?

Yes, that’s correct: TCP is being shown in those logs, and is also listed as the only exposed port type for that service in the K8s cluster.

I’ve attached a sample of what the tearing looks like:

It is possible there’s a bottleneck between the recording server and the AI Bridge. It might be that the server in question is overwhelmed or oversubscribed, but how can we tell? What metrics should be used to determine downstream performance? The network this is all on is fast and uses static IPs, and latency is very low.

When you say, “could you then use a snapshot instead,” what do you mean?

This problem has been fixed in AI Bridge 1.7.2.

Thanks for reporting it.