Sunday, 14 September 2025
27.7 C
Singapore
28.6 C
Thailand
20.2 C
Indonesia
28.1 C
Philippines

OVHcloud launches AI Endpoints to simplify access to open-source models

OVHcloud launches AI Endpoints to offer serverless access to over 40 open-source AI models across key global markets.

OVHcloud has launched AI Endpoints, a serverless solution designed to make open-source artificial intelligence models more accessible to developers and businesses. The platform offers over 40 models, including large language models (LLMs) and generative AI tools, supporting applications such as chatbots, speech transcription, and code generation.

With AI Endpoints, developers can integrate advanced AI capabilities into their applications without needing to manage infrastructure or possess deep machine learning expertise. The service is hosted on OVHcloud’s trusted cloud environment, allowing users to experiment with and deploy AI models securely and efficiently.

Support for diverse business use cases

AI Endpoints provides a sandbox environment for developers to test features before rolling them out across applications and business processes. The platform is suited for a wide range of AI applications, including real-time conversational agents, data extraction, speech recognition and synthesis, and coding assistance.

For example, LLMs can be embedded into applications to enhance customer service or user interaction. Text extraction capabilities help businesses process unstructured data, improving operational workflows. Through voice APIs, developers can incorporate both transcription and voice response features, supporting voice-based user interfaces. Additionally, coding tools such as Continue offer in-IDE support with code suggestions and error detection to streamline development.

Privacy, transparency, and environmental responsibility

The platform is built on OVHcloud’s energy-efficient infrastructure, which relies on water-cooled servers housed in environmentally friendly data centres. This approach helps minimise the environmental footprint of AI operations while maintaining performance.

A key differentiator of AI Endpoints is its focus on data sovereignty and transparency. By hosting the solution in Europe, OVHcloud ensures data is protected from non-European regulations. The use of open-weight AI models allows organisations to migrate or replicate these models across different infrastructures, giving them greater control over their data and applications.

“We are excited to launch AI Endpoints and are humbled by the incredible feedback we get from our amazing community. With support for the most diverse and sought after open source LLM models, AI Endpoints helps to democratise AI so developers can add to their apps the most cutting-edge models. Our solution enables them to do this easily in a trusted cloud environment with full confidence in OVHcloud’s sovereign infrastructure,” said Yaniv Fdida, Chief Product and Technology Officer at OVHcloud.

Flexible pricing and regional availability

Following an early preview phase, AI Endpoints is now live in Asia-Pacific, Canada, and Europe, with services deployed from OVHcloud’s Gravelines data centre. Based on user feedback, the service includes enhanced features such as better API key management, increased model stability, and a wider range of supported models.

The offering covers several model categories, including LLMs like Llama 3.3 70B and Mixtral 8x7B; small language models such as Mistral Nemo and Llama 3.1 8B; code models like Qwen 2.5 Coder 32B and Codestral Mamba; reasoning tools such as DeepSeek-R1; multimodal models like Qwen 2.5 VL 72B; image generation with SDXL; and speech-to-text (ASR) and text-to-speech (TTS) capabilities.

Pricing is offered on a pay-as-you-go basis, with costs calculated by the number of tokens processed per minute, depending on the selected model.

Hot this week

The rise of the Fractional CMO is reshaping marketing leadership in modern organisations

Explore how the rise of Fractional CMOs is transforming marketing leadership in Southeast Asia, offering companies flexible, strategic expertise without full-time costs.

NetApp launches StorageGRID 12.0 to accelerate AI workloads and boost data security

NetApp introduces StorageGRID 12.0 with faster AI performance, simplified management, and stronger security for unstructured data.

Maxicare adopts Agentforce to streamline dental authorisations

Maxicare adopts Salesforce’s Agentforce to automate dental authorisations, improving clinic efficiency and member healthcare services.

Reddit tests in-app article reading with new publisher tools

Reddit is testing in-app article reading with new analytics and AI tools for publishers, aiming to boost content sharing and engagement.

Grammarly expands grammar support to Spanish, French and more languages

Grammarly now supports Spanish, French, Portuguese, German, and Italian, expanding its AI grammar tools to six core languages.

Asus unveils US$4,000 ProArt P16 with 4K tandem OLED and RTX 5090

Asus launches its ProArt P16 laptop with a 4K tandem OLED, RTX 5090 GPU, and creator-focused features, priced from US$1,999.

Lenovo unveils Legion Go 2 handheld with OLED display and higher price tag

Lenovo launches the Legion Go 2 handheld with an OLED display, upgraded specs and a higher starting price of €999 at IFA 2025.

Samsung could launch two Galaxy Z Fold8 models in 2026

Samsung may release two Galaxy Z Fold8 models in 2026, including one with a square-like screen, alongside the Galaxy Z Flip8.

Apple brings new health features to older Watch models

Apple adds hypertension notifications and Sleep Score to older Watch models with watchOS 26, expanding health tools beyond its newest devices.

Related Articles

Popular Categories