Friday, 24 October 2025

Elon Musk’s AI company, xAI, enhances Grok with multimodal inputs

xAI, Elon Musk's AI company, adds image capabilities to Grok, offering enhanced features for users and closing the gap with competitors.

As revealed in public developer documents, Elon Musk’s artificial intelligence (AI) company, xAI, is working on integrating multimodal inputs into its Grok chatbot. This development implies that users will soon be able to upload images to Grok and receive text-based responses.

A recent xAI blog post teased that the upcoming Grok-1.5V version will introduce “multimodal models across various domains.” The latest updates to the developer documents suggest that work on this new model is progressing.

The developer documents include a sample Python script showing how developers can use the xAI software development kit (SDK) library to generate responses based on both text and images. The script reads an image file, sets up a text prompt, and passes both to the xAI SDK to produce a response.
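The flow described above (read an image, pair it with a text prompt, hand both to a client) can be sketched roughly as follows. This is a minimal illustration only: the function names `load_image` and `build_multimodal_request` and the payload fields `prompt` and `image_base64` are hypothetical and are not taken from xAI's documentation or SDK. The sketch stops at building the request and does not call any real service.

```python
import base64
import json


def load_image(path: str) -> bytes:
    """Read an image file from disk as raw bytes."""
    with open(path, "rb") as handle:
        return handle.read()


def build_multimodal_request(image_bytes: bytes, prompt: str) -> dict:
    """Package an image and a text prompt into one JSON-serialisable request.

    The field names are hypothetical, chosen for illustration only.
    """
    if not prompt:
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        # Raw bytes are not valid JSON, so the image travels as base64 text.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }


if __name__ == "__main__":
    # Stand-in bytes so the sketch runs without a real image file;
    # in practice these would come from load_image("photo.png").
    placeholder_image = b"\x89PNG\r\n\x1a\n"
    request = build_multimodal_request(placeholder_image, "Describe this image.")
    print(json.dumps(request, indent=2))
```

A real SDK would likely hide the encoding step behind its own client object, but the overall shape of image bytes plus prompt in and text out would be similar.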

Enhancements for Grok users

Grok, initially launched by xAI in November 2023, is accessible to users subscribed to the X Premium Plus service. The most recent update, Grok-1.5, introduced enhanced reasoning capabilities to the platform in March 2024.

The model is trained on a variety of text data from publicly available sources up to Q3 2023, along with datasets reviewed by human evaluators. While Grok-1 was not trained on X data, it has real-time knowledge of the world, including information from posts on X.

Founded by Elon Musk in March 2023, xAI is a newcomer to the AI industry and lags behind competitors such as OpenAI, the maker of ChatGPT. However, xAI’s blog post highlights that its Grok-1.5 model is narrowing the gap with GPT-4 across different benchmarks, covering a broad spectrum of academic problems from grade school to high school.

Challenges in benchmarking large language models

Benchmarking large language models can be contentious. A model may score well on a benchmark if the benchmark’s questions were part of its training set, which is akin to memorising answers rather than understanding the content. Despite these challenges, xAI is making significant strides with Grok’s development.
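One simple way to picture the contamination problem is to measure how much of a benchmark question appears word for word in a body of training text. The sketch below uses word n-gram overlap as a rough signal. It is a simplified heuristic for illustration, not a method described by xAI or in the developer documents, and the function names and the default n-gram size are assumptions.

```python
def ngrams(text: str, n: int) -> set:
    """Return the set of word n-grams in the text, lower-cased."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(benchmark_item: str, training_text: str, n: int = 3) -> float:
    """Fraction of the benchmark item's n-grams that also occur in the training text.

    A value near 1.0 suggests the item may have been seen during training;
    a value near 0.0 suggests it was not copied verbatim.
    """
    item_grams = ngrams(benchmark_item, n)
    if not item_grams:
        # The item is shorter than n words, so there is nothing to compare.
        return 0.0
    shared = item_grams & ngrams(training_text, n)
    return len(shared) / len(item_grams)


if __name__ == "__main__":
    question = "what is two plus two"
    leaked = "quiz: what is two plus two equals four"
    unrelated = "the cat sat on the mat"
    print(overlap_ratio(question, leaked))
    print(overlap_ratio(question, unrelated))
```

A high overlap does not prove a model memorised the answer, and paraphrased leaks would slip past this check, which is part of why benchmark results remain debated.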

The AI landscape is evolving towards multimodal conversational chatbots, with notable advancements announced at Google I/O and in OpenAI’s release of GPT-4o. Grok’s integration of multimodal capabilities is a step towards keeping pace with industry trends and enhancing the user experience.
