Wednesday, 27 August 2025

Elon Musk’s AI company, xAI, enhances Grok with multimodal inputs

xAI, Elon Musk's AI company, adds image capabilities to Grok, offering enhanced features for users and closing the gap with competitors.

Public developer documents show that Elon Musk’s artificial intelligence (AI) company, xAI, is working on adding multimodal inputs to its Grok chatbot. This suggests that users will soon be able to upload images to Grok and receive text-based responses.

A recent xAI blog post teased that the upcoming Grok-1.5V version will introduce “multimodal models across various domains.” The latest updates to the developer documents suggest that work on the new model is progressing.

The developer documents include a sample Python script showing how developers can use the xAI software development kit (SDK) library to generate responses from both text and images. The script reads an image file, sets up a text prompt, and passes both to the xAI SDK to generate a response.
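The first two steps described above, reading an image and pairing it with a text prompt, can be sketched roughly as follows. This is a minimal illustration and not the script from xAI's documents: the function name `build_multimodal_request` and the field names `prompt` and `image_base64` are hypothetical, and the call into the xAI SDK itself is deliberately left out because its interface is not shown in this article.

```python
import base64
import json


def build_multimodal_request(image_bytes: bytes, prompt: str) -> dict:
    """Bundle an image and a text prompt into one JSON-serialisable request.

    The field names are hypothetical and are not the xAI SDK's schema.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        # Binary image data is base64-encoded so it can travel inside JSON.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }


if __name__ == "__main__":
    # In practice the bytes would come from a file: open(path, "rb").read().
    placeholder_image = b"\x89PNG\r\n\x1a\n"  # PNG signature standing in for a real image
    request = build_multimodal_request(placeholder_image, "Describe this image.")
    print(json.dumps(request, indent=2))
```

A real client would hand a request like this to whatever sampling call the SDK provides and read back the text response.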

Enhancements for Grok users

Grok, first launched by xAI in November 2023, is available to users subscribed to the X Premium Plus service. The most recent update, Grok 1.5, was announced in March 2024 and introduced enhanced reasoning capabilities.

The model is trained on a variety of text data from publicly available sources up to Q3 2023, along with datasets reviewed by human evaluators. While Grok-1 was not pre-trained on X data, it has real-time knowledge of the world, including information from X posts.

Founded by Elon Musk in March 2023, xAI is a newcomer to the AI industry and lags behind competitors such as OpenAI, the maker of ChatGPT. However, xAI’s blog post says that its Grok 1.5 model is narrowing the gap with GPT-4 on several benchmarks, which cover a broad range of academic problems from grade school to high school level.

Challenges in benchmarking large language models

Benchmarking large language models can be contentious. A model may score well on a benchmark if the benchmark's questions were part of its training set, which is akin to memorising answers rather than understanding the content. Despite these challenges, xAI is making significant strides with Grok’s development.
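One rough way to illustrate the contamination concern is to measure how much of a benchmark question already appears word for word in a body of training text. The sketch below uses a simple word n-gram overlap, which is an illustrative heuristic chosen here and not a method attributed to xAI or any benchmark author. The function names and the default of three-word n-grams are assumptions.

```python
def ngrams(text: str, n: int) -> set:
    """Return the set of lowercase word n-grams in a text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(benchmark_item: str, training_text: str, n: int = 3) -> float:
    """Fraction of the benchmark item's n-grams that also occur in the training text."""
    item_ngrams = ngrams(benchmark_item, n)
    if not item_ngrams:
        # The item is shorter than n words, so there is nothing to compare.
        return 0.0
    shared = item_ngrams & ngrams(training_text, n)
    return len(shared) / len(item_ngrams)


if __name__ == "__main__":
    question = "what is two plus two"
    leaked = "quiz: what is two plus two equals four"
    unrelated = "the weather in singapore is warm today"
    print(overlap_ratio(question, leaked))
    print(overlap_ratio(question, unrelated))
```

A high ratio hints that a question may have leaked into the training data, although exact-match checks like this miss paraphrased or translated copies.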

The AI industry is moving towards multimodal chatbots, with notable advancements such as those announced at Google I/O and OpenAI’s release of GPT-4o. Adding multimodal capabilities to Grok is a step towards keeping pace with these trends and improving the user experience.
