Using the AI Bridge 1.6 Helm chart, the Kafka Broker image refuses to pull. All other pods are coming through just fine. I noticed that the Kafka Broker, Init, and Proxy containers were all modified today (March 13, 2024) in NGC, so maybe that has something to do with it, but the Init and Proxy containers both pulled just fine.
Here’s part of what `helm describe pod` returns:
Normal Pulling 24s (x2 over 50s) kubelet Pulling image "nvcr.io/isv-milestone/partners/aibridge-kafka-broker:v1.6.0"
Warning Failed 23s (x2 over 37s) kubelet Failed to pull image "nvcr.io/isv-milestone/partners/aibridge-kafka-broker:v1.6.0": rpc error: code = NotFound desc = failed to pull and unpack image "nvcr.io/isv-milestone/partners/aibridge-kafka-broker:v1.6.0": failed to resolve reference "nvcr.io/isv-milestone/partners/aibridge-kafka-broker:v1.6.0": nvcr.io/isv-milestone/partners/aibridge-kafka-broker:v1.6.0: not found
Warning Failed 23s (x2 over 37s) kubelet Error: ErrImagePull
Normal BackOff 8s (x2 over 36s) kubelet Back-off pulling image "nvcr.io/isv-milestone/partners/aibridge-kafka-broker:v1.6.0"
Warning Failed 8s (x2 over 36s) kubelet Error: ImagePullBackOff
Of course this is preventing the AI Bridge from coming up.
I don’t know if this is an NGC issue or a Milestone issue, but browsing the corresponding private registry’s containers shows that the Kafka Broker v1.6.0 tag is available (or at least visible).
I’ve also tried rebooting the machine to double check things.
The Helm chart was working just fine yesterday (March 12, 2024).
Any help would be appreciated. Thank you in advance.