NGINX Gateway Fabric Supports the Gateway API Inference Extension

With NGINX Gateway Fabric (NGF) version 2.2, organizations can now tap into the Gateway API Inference Extension to enable smart, inference-aware routing in Kubernetes. Platform and ML teams can publish self-hosted GenAI and LLM inference workloads with smarter routing decisions and finer-grained control over GPU and compute resource usage.

The Gateway API Inference Extension is a community-driven Kubernetes project that standardizes routing logic for inference workloads across the ecosystem. NGF 2.2 integrates with that extension, allowing NGINX to make routing decisions based on AI workload and model characteristics rather than generic traffic heuristics.
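In practice, the extension introduces resources such as `InferencePool`, which a Gateway API `HTTPRoute` can target in place of a plain `Service`. The sketch below is a hypothetical illustration of that pattern, not taken from the announcement: the names (`llm-pool`, `llm-route`, `inference-gateway`), the pod labels, the port, and the API version are all assumptions and would need to match your own cluster and the extension version you install.

```yaml
# Hypothetical InferencePool: groups model-server pods so the gateway can
# make inference-aware endpoint choices among them. All names are illustrative.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: llm-pool
spec:
  selector:
    app: vllm-llama        # assumed label on your model-server pods
  targetPortNumber: 8000   # assumed model-server port
  extensionRef:
    name: llm-pool-epp     # endpoint picker deployment for this pool (assumed name)
---
# Hypothetical HTTPRoute sending completion traffic to the pool instead of
# a standard Service backend.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
spec:
  parentRefs:
  - name: inference-gateway   # assumed Gateway managed by NGF
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1/completions
    backendRefs:
    - group: inference.networking.x-k8s.io
      kind: InferencePool
      name: llm-pool
```

The key difference from ordinary Gateway API routing is the `backendRefs` entry: it points at an `InferencePool` rather than a `Service`, which lets the gateway route on model and workload characteristics instead of generic load-balancing heuristics.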

For more details, check out the blog.

Cheers,

Micheál
