Saturday, 13 December 2025

AMD powers Zyphra’s large-scale AI training milestone

Zyphra trains its ZAYA1 foundation model entirely on AMD hardware, marking a major step for large-scale AI development.

AMD has announced that Zyphra has completed training ZAYA1, a new Mixture-of-Experts (MoE) foundation model built entirely on AMD’s GPU and networking platform. The work marks the first time a large-scale MoE model has been trained using AMD Instinct MI300X GPUs together with AMD Pensando networking and the ROCm open software stack.

Zyphra detailed the achievement in a technical report. According to the company, ZAYA1 delivers competitive or superior results across reasoning, mathematics and coding benchmarks when compared with leading open models. The performance demonstrates that AMD’s platform can support production-scale AI workloads that typically rely on rival GPU systems.

Emad Barsoum, corporate vice president of AI and engineering in AMD’s Artificial Intelligence Group, said the milestone highlights how the company’s technology can support modern AI development. “AMD leadership in accelerated computing is empowering innovators like Zyphra to push the boundaries of what’s possible in AI,” he said. “This milestone showcases the power and flexibility of AMD Instinct GPUs and Pensando networking for training complex, large-scale models.”

Zyphra’s chief executive Krithik Puthalath said the model reflects the company’s broader focus on efficiency. “Efficiency has always been a core guiding principle at Zyphra. It shapes how we design model architectures, develop algorithms for training and inference, and choose the hardware with the best price-performance to deliver frontier intelligence to our customers,” he said. He added that the organisation is “thrilled to be the first company to demonstrate large-scale training on an AMD platform” and intends to continue working with AMD and IBM as it develops future multimodal foundation models.

Focus on memory capacity and training throughput

Zyphra reported that the MI300X GPU’s 192 GB of high-bandwidth memory played a central role in the model’s training efficiency. The additional capacity allowed the team to avoid expert or tensor sharding, which can add complexity and slow training. Zyphra added that model saves ran more than ten times faster because of AMD’s optimised distributed I/O, which helped improve reliability during large-scale runs.

ZAYA1-Base contains 8.3 billion total parameters, of which 760 million are active for any given token. Despite the lower active parameter count, the model matches or exceeds the performance of several well-known systems, including Qwen3-4B from Alibaba, Gemma3-12B from Google, Meta’s Llama-3-8B and OLMoE.
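To put the reported figures in proportion, the short sketch below works out what share of the model's parameters is active and roughly how much memory the weights alone would occupy. The parameter counts and the 192 GB capacity come from the figures above; the 2-bytes-per-parameter value is an assumption (16-bit weights), and the helper names are illustrative only. It is a back-of-envelope estimate, not Zyphra's own accounting: real training memory also includes gradients, optimiser state and activations.

```python
# Figures reported in the article.
TOTAL_PARAMS = 8.3e9    # total parameters in ZAYA1-Base
ACTIVE_PARAMS = 760e6   # active parameters
HBM_GB = 192            # MI300X high-bandwidth memory, in GB

# Assumption (not from the article): weights stored in a 16-bit format.
BYTES_PER_PARAM = 2


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's parameters that are active."""
    return active_params / total_params


def weight_footprint_gb(params: float, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
weights_gb = weight_footprint_gb(TOTAL_PARAMS, BYTES_PER_PARAM)
share_of_hbm = weights_gb / HBM_GB

print(f"Active share of parameters: {fraction:.1%}")
print(f"Weights at 16 bits: {weights_gb:.1f} GB")
print(f"Share of one GPU's memory: {share_of_hbm:.1%}")
```

Under these assumptions, roughly one parameter in eleven is active, and the full set of weights takes up well under a tenth of a single GPU's memory, which is consistent with Zyphra's point that the extra capacity reduced the need for sharding.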

Joint work with AMD and IBM on large-scale infrastructure

The development builds on earlier collaboration between Zyphra, AMD and IBM. Together, the companies designed and deployed a large-scale training cluster that combines AMD Instinct GPUs with IBM Cloud’s high-performance fabric and storage architecture. The system, first announced earlier in the quarter, provided the infrastructure required to train ZAYA1 at scale.

The companies said the engineering partnership enabled Zyphra to run complex pretraining workloads more efficiently, supported by AMD’s hardware platform and IBM’s cloud-native performance architecture.

The ZAYA1 report, together with accompanying updates from both companies, outlines the training approach, model design and AMD technologies used during development. AMD said the milestone reflects growing momentum around its GPU platform as an alternative to well-established competitors in large-scale AI training.
