How to Extract and Visualize Embedded AI Metadata from RTSP Stream in XProtect

Our company is currently developing an AI camera. We embed AI metadata into the H.264 SEI NAL units and transmit it via an RTSP stream. Is it possible to use a MIP Plugin to access the H.264 stream, parse the embedded AI metadata, and render bounding boxes or events on top of the video? We’re aiming to implement functionality similar to CVEDIA, like in the picture below. If not, what would be the best way to do this?

The best way depends somewhat on the options available to you. As a starting point, we recommend using the Metadata and Analytics Event features that are already available in XProtect. These enable bounding boxes and searching for identified objects.

The AI-generated metadata should end up in XProtect as ONVIF Profile M style metadata. Often this is done by making the metadata and events directly available to a driver such as the “ONVIF Device Driver”. You can also write your own driver (using the MIP Driver Framework) that extracts the metadata from the H.264 SEI NAL units and formats it as an ONVIF Profile M style metadata stream for XProtect to consume.
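To give a rough idea of what "ONVIF Profile M style" metadata looks like, here is a minimal Python sketch that turns pixel-space detections into a tt:MetadataStream frame. It assumes the default ONVIF normalized coordinate system (x and y from -1 to 1, with y pointing up). The detection dictionary layout (id, label, score, box) and the function names are hypothetical and only for illustration; please check the exact elements XProtect expects against the ONVIF specification and the driver you use.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

TT = "http://www.onvif.org/ver10/schema"  # ONVIF schema namespace
ET.register_namespace("tt", TT)


def tt(name):
    """Return a namespace-qualified tag name for ElementTree."""
    return f"{{{TT}}}{name}"


def to_onvif_bbox(x, y, w, h, frame_w, frame_h):
    """Convert a pixel box (top-left origin) to normalized ONVIF
    coordinates: left/right in [-1, 1], top/bottom in [-1, 1] with y up."""
    left = 2 * x / frame_w - 1
    right = 2 * (x + w) / frame_w - 1
    top = 1 - 2 * y / frame_h
    bottom = 1 - 2 * (y + h) / frame_h
    return left, top, right, bottom


def build_metadata_frame(detections, frame_w, frame_h, utc_time):
    """Build one metadata frame as an XML string.
    `detections` uses a hypothetical layout: id, label, score, box=(x, y, w, h)."""
    stream = ET.Element(tt("MetadataStream"))
    analytics = ET.SubElement(stream, tt("VideoAnalytics"))
    stamp = utc_time.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    frame = ET.SubElement(analytics, tt("Frame"), UtcTime=stamp)
    for det in detections:
        left, top, right, bottom = to_onvif_bbox(*det["box"], frame_w, frame_h)
        obj = ET.SubElement(frame, tt("Object"), ObjectId=str(det["id"]))
        appearance = ET.SubElement(obj, tt("Appearance"))
        shape = ET.SubElement(appearance, tt("Shape"))
        ET.SubElement(shape, tt("BoundingBox"),
                      left=f"{left:.4f}", top=f"{top:.4f}",
                      right=f"{right:.4f}", bottom=f"{bottom:.4f}")
        ET.SubElement(shape, tt("CenterOfGravity"),
                      x=f"{(left + right) / 2:.4f}",
                      y=f"{(top + bottom) / 2:.4f}")
        cls = ET.SubElement(appearance, tt("Class"))
        kind = ET.SubElement(cls, tt("Type"), Likelihood=f"{det['score']:.2f}")
        kind.text = det["label"]
    return ET.tostring(stream, encoding="unicode")


if __name__ == "__main__":
    sample = [{"id": 1, "label": "Human", "score": 0.9,
               "box": (480, 270, 960, 540)}]
    print(build_metadata_frame(sample, 1920, 1080,
                               datetime.now(timezone.utc)))
```

How the resulting XML is delivered (for example over the RTSP metadata track of an ONVIF device, or from a custom driver) depends on which of the options you choose.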

Alternatively, a processing service can be set up that extracts the metadata from the SEI NAL units of the original H.264 stream and formats it as ONVIF Profile M style metadata for XProtect to handle.

  • AI Bridge may be a candidate for this, especially if you prefer developing for Docker containers, though we haven’t tested it with SEI NAL units.
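Whichever option you pick (custom driver or processing service), the extraction step itself could look roughly like the Python sketch below. It splits an H.264 Annex B byte stream into NAL units, removes the emulation prevention bytes, and reads SEI messages of type 5 (user_data_unregistered). The UUID and the JSON payload format are assumptions made for the example, since we don't know how your camera encodes its metadata; with RTSP you would also first need to depacketize the RTP payload into NAL units.

```python
import json
import uuid

# Hypothetical UUID identifying the camera's AI metadata; replace with your own.
AI_METADATA_UUID = uuid.UUID("a1b2c3d4-e5f6-4789-8abc-def012345678")


def split_nal_units(stream: bytes):
    """Yield NAL units (without start codes) from an Annex B byte stream."""
    i = stream.find(b"\x00\x00\x01")
    while i != -1:
        start = i + 3
        nxt = stream.find(b"\x00\x00\x01", start)
        end = len(stream) if nxt == -1 else nxt
        # Trailing zeros belong to the next 4-byte start code, not the NAL unit.
        nal = stream[start:end].rstrip(b"\x00")
        if nal:
            yield nal
        i = nxt


def unescape_rbsp(data: bytes) -> bytes:
    """Remove emulation prevention bytes (00 00 03 -> 00 00)."""
    out = bytearray()
    zeros = 0
    for b in data:
        if zeros >= 2 and b == 3:
            zeros = 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def parse_sei_messages(nal: bytes):
    """Return (payload_type, payload) tuples from one SEI NAL unit."""
    rbsp = unescape_rbsp(nal[1:])  # skip the 1-byte NAL header
    messages, pos = [], 0
    try:
        # The last byte holds the RBSP trailing bits, so stop before it.
        while pos < len(rbsp) - 1:
            ptype = 0
            while rbsp[pos] == 0xFF:  # type and size are coded in 255 steps
                ptype += 255
                pos += 1
            ptype += rbsp[pos]
            pos += 1
            psize = 0
            while rbsp[pos] == 0xFF:
                psize += 255
                pos += 1
            psize += rbsp[pos]
            pos += 1
            messages.append((ptype, rbsp[pos:pos + psize]))
            pos += psize
    except IndexError:
        pass  # truncated SEI: keep what was parsed so far
    return messages


def extract_ai_metadata(stream: bytes):
    """Yield decoded metadata from user_data_unregistered SEI messages.
    Assumes the payload after the 16-byte UUID is UTF-8 JSON."""
    for nal in split_nal_units(stream):
        if nal[0] & 0x1F != 6:  # nal_unit_type 6 = SEI
            continue
        for ptype, payload in parse_sei_messages(nal):
            if ptype == 5 and payload[:16] == AI_METADATA_UUID.bytes:
                yield json.loads(payload[16:].decode("utf-8"))
```

The decoded objects can then be converted to ONVIF Profile M style metadata and handed to XProtect.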