Google’s TurboQuant Cuts AI Memory Sixfold as OpenAI Eyes Asia Expansion
Google introduced TurboQuant, a memory-compression algorithm that shrinks the key-value caches of large language models to three bits, cutting memory needs sixfold and speeding up processing by as much as eight times. Separately, OpenAI named former JioStar CEO Kiran Mani as Asia-Pacific managing director to target India’s market of 1.4 billion people, intensifying regional AI competition.
1. Google unveils TurboQuant memory compression
Google introduced TurboQuant, a memory-compression algorithm that reduces the key-value cache of large language models to three bits without requiring existing models to be retrained. The key-value cache stores intermediate results that a model reuses as it generates text, and it grows with the length of the input, so compressing it lowers the memory each deployment needs.
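To give a sense of what storing a value in three bits means, here is a minimal sketch of plain min-max uniform quantization. This is a generic textbook technique used only for illustration, not TurboQuant's actual method, which the article does not describe. The function names and the sample values are hypothetical.

```python
# Illustrative only: generic min-max uniform quantization to 3 bits.
# This is NOT TurboQuant's algorithm; names and data are hypothetical.

def quantize_3bit(values):
    """Map floats to integer codes 0..7 (the 8 values that fit in 3 bits)."""
    levels = 2 ** 3 - 1  # 7 steps between the 8 representable codes
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize_3bit(codes, lo, scale):
    """Recover approximate floats from the 3-bit codes."""
    return [lo + c * scale for c in codes]


# A made-up slice of cached values standing in for key-value cache entries.
cache_row = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.6]

codes, lo, scale = quantize_3bit(cache_row)
restored = dequantize_3bit(codes, lo, scale)

# Rounding to the nearest code bounds the error by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(cache_row, restored))
print(codes, round(max_error, 3))
```

Because a scheme like this operates on values the model has already computed, it can be applied to an existing model as is, which is the sense in which no retraining is required.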
2. Efficiency gains and technical performance
In Google’s internal tests, TurboQuant achieved a sixfold reduction in memory footprint and up to eight times faster inference on compatible hardware. The algorithm targets memory, a critical bottleneck in AI workloads, to accelerate processing and lower operating costs.
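The scale of the saving can be checked with back-of-the-envelope arithmetic. The sketch below estimates key-value cache size at two bit widths. The model dimensions and the 16-bit baseline are assumptions chosen for illustration; the article specifies neither.

```python
# Rough key-value cache sizing. All model dimensions here are hypothetical,
# and the 16-bit baseline is an assumption not stated in the article.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bits_per_value):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    num_values = 2 * num_layers * num_kv_heads * head_dim * seq_len
    return num_values * bits_per_value // 8


# Hypothetical model shape and context length.
shape = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=32768)

baseline = kv_cache_bytes(**shape, bits_per_value=16)
compressed = kv_cache_bytes(**shape, bits_per_value=3)
ratio = baseline / compressed

print(f"16-bit cache: {baseline / 2**30:.2f} GiB")
print(f" 3-bit cache: {compressed / 2**30:.2f} GiB")
print(f"reduction:    {ratio:.2f}x")
```

Under these assumptions the raw bit-width ratio is 16/3, or roughly 5.3 times, in the same range as the reported sixfold figure. The sketch ignores any per-block metadata a real scheme might store alongside the compressed values.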
3. Memory chip stocks react
Shares of major memory suppliers fell sharply after the announcement, with SanDisk down 6%, Western Digital off about 5%, Seagate sliding 4% and Micron declining 3%, even as the broader Nasdaq 100 index advanced. Investors weighed the potential drop in hardware demand if TurboQuant sees wide adoption.
4. OpenAI names Asia-Pacific managing director
OpenAI appointed Kiran Mani, a former JioStar CEO who scaled platforms to more than 300 million subscribers, as managing director for Asia-Pacific. Mani will relocate to Singapore in June, report to OpenAI’s chief strategy officer and build on initiatives such as the Tata Group data center partnership to pursue India’s market of 1.4 billion people.