Sunday, 29 June 2025

Elon Musk’s AI company, xAI, enhances Grok with multimodal inputs

xAI, Elon Musk's AI company, is adding image capabilities to Grok, giving users new features and narrowing the gap with competitors.

Public developer documents show that Elon Musk’s artificial intelligence (AI) company, xAI, is working on integrating multimodal inputs into its Grok chatbot. If the feature ships, users will be able to upload images to Grok and receive text-based responses.

In a recent blog post, xAI teased that the upcoming Grok-1.5V version will introduce “multimodal models across various domains.” The latest updates to the developer documents suggest that work on the new model is progressing.

The developer documents include a sample Python script showing how developers can use the xAI software development kit (SDK) library to generate responses based on both text and images. The script reads an image file, sets up a text prompt, and passes both to the xAI SDK to produce a response.
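The flow described in the developer documents (read an image, set up a prompt, hand both to the SDK) can be sketched roughly as follows. This is only an illustration: the actual xAI SDK calls are not reproduced in this article, so the function names `load_image` and `build_multimodal_request` and the payload fields `prompt` and `image_base64` are hypothetical stand-ins, and the final step of sending the request is left out.

```python
import base64
import json


def load_image(path: str) -> bytes:
    """Read an image file from disk as raw bytes."""
    with open(path, "rb") as f:
        return f.read()


def build_multimodal_request(image_bytes: bytes, prompt: str) -> dict:
    """Bundle an image and a text prompt into one JSON-serialisable payload.

    The field names are hypothetical and are not xAI's actual schema.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        # Binary image data is base64-encoded so it can travel inside JSON.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }


if __name__ == "__main__":
    # Stand-in bytes; in practice this would be load_image("photo.jpg").
    placeholder_image = b"\x89PNG\r\n\x1a\n"
    request = build_multimodal_request(placeholder_image, "What is in this picture?")
    # A real client would now pass this payload to the SDK (not shown here).
    print(json.dumps(request, indent=2))
```

Base64 is used here only because it is a common way to embed binary data in a text payload; the real SDK may handle images differently.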

Enhancements for Grok users

Grok, which xAI launched in November 2023, is available to subscribers of the X Premium Plus service. The most recent update, Grok-1.5, was announced in March 2024 and brought improved reasoning capabilities.

The model was trained on a variety of text data from publicly available sources up to Q3 2023, along with datasets reviewed by human evaluators. Although Grok-1 was not trained on X data, it has real-time knowledge of the world, including information from X posts.

Founded by Elon Musk in March 2023, xAI is a newcomer to the AI industry and trails competitors such as OpenAI, the maker of ChatGPT. However, xAI’s blog post says that its Grok-1.5 model is narrowing the gap with GPT-4 across several benchmarks, which cover academic problems ranging from grade school to high school level.

Challenges in benchmarking large language models

Benchmarking large language models can be contentious. A model may score well on a benchmark if the test data was part of its training set, which is akin to memorising answers rather than understanding the content. Despite these challenges, xAI is making significant strides with Grok’s development.
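To make the memorisation concern concrete, one common heuristic is to check how many word n-grams from a benchmark question also appear in the training text. The sketch below is a simplified illustration of that idea, not a method attributed to xAI or any benchmark author; the function names, the whitespace tokenisation, and the default n-gram length of 5 are assumptions chosen for brevity.

```python
def ngrams(text: str, n: int) -> set:
    """Return the set of lowercase word n-grams in a text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(benchmark_item: str, training_text: str, n: int = 5) -> float:
    """Fraction of the benchmark item's n-grams that also occur in the training text.

    A value near 1.0 suggests the item may have leaked into the training data.
    """
    item_grams = ngrams(benchmark_item, n)
    if not item_grams:
        # Item is shorter than n words, so there is nothing to compare.
        return 0.0
    return len(item_grams & ngrams(training_text, n)) / len(item_grams)


if __name__ == "__main__":
    question = "what is the sum of two and three"
    leaked = "quiz: what is the sum of two and three answer five"
    unrelated = "the cat sat on the mat and looked out of the window"
    print(overlap_ratio(question, leaked))
    print(overlap_ratio(question, unrelated))
```

A high ratio does not prove contamination, and a low one does not rule it out (paraphrased test items would slip through), which is part of why benchmark results remain open to dispute.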

The AI landscape is moving towards multimodal conversational chatbots, with notable advances announced at Google I/O and in OpenAI’s release of GPT-4o. Grok’s addition of multimodal capabilities is a step towards keeping pace with these industry trends and improving the user experience.
