Akamai Deploys NVIDIA AI Grid with Thousands of GPUs Across 4,400 Edge Locations
Akamai has launched Inference Cloud, deploying thousands of NVIDIA RTX PRO 6000 Blackwell GPUs across 4,400 edge locations to implement the first global-scale NVIDIA AI Grid for real-time inference. The grid's orchestrator matches each workload to an appropriate compute tier to cut latency, lower cost per token and improve throughput.
1. Launch of Inference Cloud with NVIDIA AI Grid
Akamai has integrated NVIDIA AI Grid into its Inference Cloud by deploying thousands of NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. The deployment is the first global-scale implementation of AI Grid, and it lets enterprises run real-time AI inference across edge, regional and core environments.
2. Global Edge Infrastructure and Intelligent Orchestration
The AI Grid’s orchestrator acts as a real-time broker, distributing AI workloads across Akamai’s 4,400 edge locations and centralized GPU clusters to balance latency, cost and performance. Semantic caching and workload-aware routing automatically match each request to the right compute tier, cutting cost per token and shortening time to first token.
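To make the idea concrete, the following is a minimal Python sketch of how semantic caching and workload-aware routing could fit together. It is not Akamai's or NVIDIA's implementation. The tier names, latency and cost figures, the `route` and `handle` functions, and the `SemanticCache` class are all hypothetical and chosen only for illustration. The sketch assumes each request arrives with a precomputed embedding vector, a model size and a latency budget.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    latency_ms: float          # assumed round-trip latency to this tier
    cost_per_1k_tokens: float  # assumed price, arbitrary units
    max_model_params_b: float  # largest model (billions of parameters) the tier hosts


# Hypothetical tiers and numbers, for illustration only.
TIERS = [
    Tier("edge", 20.0, 0.40, 8.0),
    Tier("regional", 60.0, 0.25, 70.0),
    Tier("core", 150.0, 0.15, 400.0),
]


def route(model_params_b, latency_budget_ms):
    """Return the cheapest tier that can host the model within the latency budget."""
    candidates = [
        t for t in TIERS
        if t.max_model_params_b >= model_params_b and t.latency_ms <= latency_budget_ms
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.cost_per_1k_tokens)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Reuses a stored response when a new request embedding is similar enough."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def get(self, embedding):
        best, best_sim = None, self.threshold
        for stored, response in self.entries:
            sim = cosine(embedding, stored)
            if sim >= best_sim:
                best, best_sim = response, sim
        return best

    def put(self, embedding, response):
        self.entries.append((list(embedding), response))


def handle(cache, embedding, model_params_b, latency_budget_ms):
    """Serve from the cache if possible; otherwise route to a tier and cache the result."""
    cached = cache.get(embedding)
    if cached is not None:
        return ("cache", cached)
    tier = route(model_params_b, latency_budget_ms)
    if tier is None:
        return ("rejected", None)
    response = f"generated on {tier.name}"  # stand-in for a real inference call
    cache.put(embedding, response)
    return (tier.name, response)
```

In this sketch, a small model with a tight latency budget lands on the edge tier, a relaxed budget lets the same model run on the cheaper core tier, and a near-duplicate request is answered from the cache without using any GPU. A production orchestrator would also weigh live load, data locality and model placement, which are omitted here.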
3. Early Industry Use Cases
Gaming studios are delivering AI-driven NPC interactions in under 50 ms, financial institutions are running instant fraud detection and personalized marketing, and broadcasters are performing real-time content transcoding and dubbing. Enterprises use Akamai Cloud IaaS, with its open-source infrastructure and generous egress allowances, to support data-intensive AI at scale.