Amazon Unveils 10x Faster AI Inference, Expands Air Cargo to Northeast India
Amazon Web Services will integrate Cerebras CS-3 AI inference systems with its Trainium-powered servers on Bedrock, using Elastic Fabric Adapter networking to target an order-of-magnitude speedup for generative AI workloads. Support for leading LLMs is expected later this year. Separately, Amazon Air launched new freight routes connecting Kolkata and Guwahati to Delhi, cutting transit times by up to a factor of five across seven states in Northeast India.
1. AWS and Cerebras Partner on Disaggregated Inference
AWS will pair its Trainium-powered servers with Cerebras CS-3 systems, connected via Elastic Fabric Adapter networking, and split AI inference into separate prefill and decode stages. Under this disaggregation, Trainium handles prompt processing (prefill), which can run in parallel across the prompt, while the CS-3 accelerates token generation (decode), which is inherently serial. The combination is designed to deliver up to 10x faster inference on Amazon Bedrock for generative AI and LLM workloads.
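To make the prefill/decode split concrete, here is a minimal Python sketch. It is a toy illustration only, not AWS or Cerebras code: the names `KVCache`, `prefill`, `decode`, and `generate` are hypothetical, and a trivial arithmetic rule stands in for a real model forward pass. It shows only the shape of the pipeline: one stage consumes the whole prompt at once, and the other produces output tokens one at a time from the state the first stage handed over.

```python
from dataclasses import dataclass


@dataclass
class KVCache:
    """Hypothetical stand-in for the attention state built during prefill."""
    tokens: list[int]


def prefill(prompt: list[int]) -> KVCache:
    # Prefill sees every prompt token up front, so a real system can
    # process them in parallel. Here we simply record them.
    return KVCache(tokens=list(prompt))


def next_token(cache: KVCache) -> int:
    # Toy rule standing in for a model forward pass (an assumption for
    # illustration only): derive the next token from the cached tokens.
    return (sum(cache.tokens) + len(cache.tokens)) % 100


def decode(cache: KVCache, max_new_tokens: int) -> list[int]:
    # Decode is serial: each new token depends on all earlier ones,
    # so the loop cannot be parallelized across output positions.
    generated = []
    for _ in range(max_new_tokens):
        token = next_token(cache)
        cache.tokens.append(token)
        generated.append(token)
    return generated


def generate(prompt: list[int], max_new_tokens: int) -> list[int]:
    cache = prefill(prompt)  # stage 1: would run on the prefill hardware
    # In a disaggregated deployment, the cache would be sent over the
    # network to the decode hardware at this point.
    return decode(cache, max_new_tokens)  # stage 2: decode hardware


if __name__ == "__main__":
    print(generate([1, 2, 3], max_new_tokens=2))
```

Because the two stages communicate only through the cache object, each can in principle be placed on hardware suited to its access pattern, which is the idea behind the arrangement described above.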
2. Amazon Air Expands to Northeast India
Amazon’s in-house cargo airline introduced new routes linking Kolkata and Guwahati to Delhi and other fulfillment hubs, making deliveries as much as five times faster across Assam, Arunachal Pradesh, Manipur, Meghalaya, Mizoram, Nagaland and Tripura. The dedicated air capacity is intended to overcome regional logistics challenges, enabling local sellers, especially those in horticulture and specialty produce, to reach customers nationwide more reliably.