Today Azure OpenAI had huge issues, and my AI-backed service kept restarting itself because I had implemented the k8s health endpoint to fail on latency spikes. But the upstream latency was something I didn't control, so a restart couldn't fix it. So instead of failing the health check and restarting the service when latency is high, I'll now show the user a small toast informing them that services have high latency and the interface might respond slowly. Improve your UX, all!
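
Roughly what the new health logic could look like, as a minimal sketch and not my exact code: the threshold value and the names `LATENCY_WARN_MS`, `HealthReport`, and `build_health_report` are made up for illustration. The idea is that the k8s probe still gets a 200 when only the upstream is slow, and the frontend reads a `degraded` flag to decide whether to show the toast.

```python
from dataclasses import dataclass

# Hypothetical threshold (an assumption); tune it to your own service.
LATENCY_WARN_MS = 5000.0


@dataclass
class HealthReport:
    http_status: int  # what the k8s probe sees
    degraded: bool    # what the frontend checks before showing a toast
    message: str      # text the toast could display


def build_health_report(upstream_latency_ms: float, process_ok: bool = True) -> HealthReport:
    """Fail the probe only for problems a restart can fix."""
    if not process_ok:
        # The service itself is broken, so let k8s restart it.
        return HealthReport(503, False, "service unhealthy")
    if upstream_latency_ms > LATENCY_WARN_MS:
        # The upstream is slow: stay "healthy" for k8s, but warn the user.
        return HealthReport(
            200, True,
            "Services have high latency; the interface might respond slowly.",
        )
    return HealthReport(200, False, "ok")
```

With this, a latency spike returns status 200 with `degraded=True`, so the pod keeps running and the UI can show the warning instead.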