Thursday, 4 December 2025

Red Hat expands AWS collaboration to enhance AI inference performance

Red Hat expands its AWS collaboration to support large-scale generative AI with improved performance and lower costs.

Red Hat has announced an expanded collaboration with Amazon Web Services to strengthen how enterprises run generative AI workloads across hybrid cloud environments. The company aims to give IT leaders more choice and efficiency by enabling Red Hat AI to run on AWS Trainium and Inferentia chips.

Rising demand for scalable AI performance

The rapid adoption of generative AI is prompting organisations to reassess how they build and scale their infrastructure. With inference workloads growing quickly, businesses are looking for ways to optimise performance while managing costs. IDC forecasts that by 2027, 40 per cent of organisations will adopt custom silicon such as ARM processors or chips designed specifically for AI and machine learning. Red Hat said this highlights the need for infrastructure that can support high-performance inference with better efficiency.

Joe Fernandes, vice president and general manager of Red Hat’s AI Business Unit, said the company’s work with AWS reflects these shifts in enterprise requirements. “By enabling our enterprise-grade Red Hat AI Inference Server, built on the innovative vLLM framework, with AWS AI chips, we’re empowering organisations to deploy and scale AI workloads with enhanced efficiency and flexibility,” he said. He added that the partnership is rooted in Red Hat’s open source approach, aiming to make generative AI more accessible and cost-effective across hybrid cloud environments.

Integrating Red Hat AI with AWS AI accelerators

A key part of the collaboration is the enablement of the Red Hat AI Inference Server on AWS Trainium and Inferentia chips. Red Hat said this will create a common inference layer capable of running any generative AI model with lower latency and improved price performance. The company expects customers to see 30 to 40 per cent better price performance than comparable GPU-based Amazon EC2 instances.

Colin Brace, vice president at Annapurna Labs, AWS, said the joint effort is designed to support organisations looking for alternatives that deliver strong performance without high operational costs. “Enterprises demand solutions that deliver exceptional performance, cost efficiency, and operational choice for mission-critical AI workloads,” he said. “AWS designed its Trainium and Inferentia chips to make high-performance AI inference and training more accessible and cost-effective. Our collaboration with Red Hat provides customers with a supported path to deploying generative AI at scale.”

Red Hat and AWS have also developed an AWS Neuron operator for Red Hat OpenShift, Red Hat OpenShift AI and Red Hat OpenShift Service on AWS. This gives customers a more seamless way to run AI workloads on AWS accelerators within OpenShift environments. Red Hat has additionally released the amazon.ai Certified Ansible Collection to simplify orchestration of AI services on AWS.

Another part of the collaboration involves upstream community work. Red Hat and AWS are jointly optimising an AWS AI chip plugin contributed to vLLM. Red Hat is the largest commercial contributor to the project, which also underpins llm-d, an open source tool for large-scale inference that is now commercially supported in Red Hat OpenShift AI 3.

Supporting hybrid cloud strategies across industries

Red Hat said the companies’ long-standing partnership is now evolving to meet demand from organisations adopting AI across datacentres, public cloud and edge environments. Many enterprises are building hybrid cloud strategies that rely on consistent performance and efficient operations.

Jean-François Gamache, chief information officer and vice president of Digital Services at CAE, said Red Hat OpenShift Service on AWS has already played a significant role in CAE’s digital transformation. “This platform supports our developers in focusing on high-value initiatives, driving product innovation and accelerating AI integration across our solutions,” he said. He added that OpenShift’s flexibility and scalability have helped the organisation improve customer-facing outcomes, from real-time insights to faster response times for user-reported issues.

Industry analysts believe the collaboration addresses an important shift in enterprise AI adoption. Anurag Agrawal, founder and chief global analyst at Techaisle, said organisations are increasingly focused on balancing performance with long-term cost sustainability. “As AI inference costs escalate, enterprises are prioritising efficiency alongside performance,” he said. “This collaboration exemplifies Red Hat’s ‘any model, any hardware’ strategy by combining its open hybrid cloud platform with the distinct economic advantages of AWS Trainium and Inferentia. It empowers CIOs to operationalise generative AI at scale, shifting from cost-intensive experimentation to sustainable, governed production.”
