Akamai Integrates Nvidia AI Grid, Distributing AI Computing to the Edge
Akamai Technologies has officially integrated Nvidia AI Grid technology into its infrastructure, enabling artificial intelligence (AI) computing to be distributed to thousands of edge points worldwide.
The move positions Akamai as the first company to implement Nvidia’s AI Grid reference design at global scale. The integration is part of the Akamai Inference Cloud platform, which the company introduced earlier.
Through this approach, Akamai deploys specialised AI workloads, particularly inference, across more than 4,400 edge locations as well as regional data centres and core facilities. The goal is to balance latency, cost, and performance in AI processing.
Akamai’s Chief Operating Officer, Adam Karon, stated that the previously centralised AI infrastructure model is no longer sufficient to meet the needs of modern applications.
“Our intelligent AI Grid orchestration provides a way for AI factories to extend inference outwards, leveraging the same distributed architecture that has revolutionised content delivery to direct AI workloads to 4,400 locations, at the right cost and at the right time,” said Karon.
The Nvidia GPUs complement Akamai’s edge network, allowing the company to offer a combination of high-powered centralised computing and processing close to end users.
At the edge, the system handles requests requiring low latency. Meanwhile, heavy workloads such as model training and advanced processing remain in the core data centres.
Akamai emphasises that the core of the AI Grid is an intelligent orchestrator that manages workload distribution in real time. The system optimises what the company terms “tokenomics”: cost efficiency per token, response time, and throughput.
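To illustrate the kind of trade-off such an orchestrator weighs, here is a minimal sketch in Python; the metric names, weights, and numbers are hypothetical placeholders, not Akamai’s actual implementation:

```python
from dataclasses import dataclass

@dataclass
class SiteMetrics:
    """Observed serving metrics for one candidate location (edge or core)."""
    cost_per_1k_tokens: float   # USD per 1,000 generated tokens (hypothetical)
    latency_ms: float           # expected time to first token, in milliseconds
    tokens_per_second: float    # sustained generation throughput

def tokenomics_score(m: SiteMetrics,
                     w_cost: float = 0.4,
                     w_latency: float = 0.4,
                     w_throughput: float = 0.2) -> float:
    """Lower is better: a weighted blend of cost, latency, and inverse throughput.

    The weights are illustrative; a real orchestrator would tune them per
    workload class and service-level objective.
    """
    return (w_cost * m.cost_per_1k_tokens
            + w_latency * m.latency_ms / 100.0
            + w_throughput * 100.0 / max(m.tokens_per_second, 1e-6))

# Example: a nearby edge GPU versus a distant, cheaper core cluster.
edge = SiteMetrics(cost_per_1k_tokens=0.60, latency_ms=20, tokens_per_second=40)
core = SiteMetrics(cost_per_1k_tokens=0.25, latency_ms=180, tokens_per_second=120)

best = min((("edge", edge), ("core", core)), key=lambda kv: tokenomics_score(kv[1]))
print(f"route to: {best[0]}")  # the low-latency edge site wins for this request
```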
With mechanisms like semantic caching and intelligent routing, AI requests are directed to the most suitable resources. Lightweight workloads can run at the edge, while heavy tasks are allocated to high-end GPUs.
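A rough sketch of how semantic caching and routing could fit together follows; the similarity threshold, embedding handling, and routing rule are illustrative assumptions, not Akamai’s published API:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

class SemanticCache:
    """Serves a cached answer when a new prompt's embedding is close enough
    to one already answered, so the request never leaves the edge."""

    def __init__(self, threshold: float = 0.92):
        self.threshold = threshold          # similarity cut-off (illustrative)
        self.entries: list[tuple[list[float], str]] = []  # (embedding, answer)

    def lookup(self, embedding: list[float]) -> str | None:
        for cached_emb, answer in self.entries:
            if cosine(embedding, cached_emb) >= self.threshold:
                return answer               # near-duplicate prompt: cache hit
        return None

    def store(self, embedding: list[float], answer: str) -> None:
        self.entries.append((embedding, answer))

def route(prompt_tokens: int, needs_low_latency: bool) -> str:
    """Toy routing rule: small, latency-sensitive requests run at the edge,
    while large or batch-friendly ones go to high-end core GPUs."""
    if needs_low_latency and prompt_tokens < 2_000:
        return "edge-gpu"
    return "core-gpu"
```

In a deployment along these lines, the cache check and the routing decision would run at the edge itself, so a repeated or near-duplicate prompt never has to travel to a core data centre at all.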
This approach is claimed to reduce operational costs while improving AI service performance, especially for large-scale applications.
Akamai says early adoption of the platform has been strong across sectors that require fast response times.
In the gaming industry, the technology is used to deliver real-time AI-based non-player character (NPC) interactions with latency under 50 milliseconds.
In the financial sector, the network is used for fraud detection and service personalisation from the moment users enter an application.
Meanwhile, in the media industry, broadcasters use the technology for live content transcoding and dubbing.
Usage is also expanding into the retail sector, particularly for in-store AI applications and AI-based checkout systems.
Distributing computing to the edge is seen as a way to overcome latency and scalability challenges while enhancing the user experience.
The Akamai Inference Cloud platform is currently available to select enterprise customers. The company has also announced service contracts worth $200 million over four years to supply GPU clusters for edge AI infrastructure.