Sunday, 30 November 2025
30.2 C
Singapore
30 C
Thailand
23.4 C
Indonesia
28.3 C
Philippines

Elon Musk’s xAI set to launch Grok-1.5, rivalling GPT-4’s prowess

Elon Musk's xAI is launching Grok-1.5, a cutting-edge AI model with enhanced capabilities, set to rival top models like GPT-4 and Claude 3.

In a groundbreaking announcement, Elon Musk’s xAI is poised to unveil Grok-1.5 next week, an upgrade to its proprietary large language model (LLM), Grok-1. This new version boasts enhanced reasoning and problem-solving abilities, edging closer to the performance benchmarks set by leading LLMs like OpenAI’s GPT-4 and Anthropic’s Claude 3.

Bridging the gap with Grok-1.5

Just weeks after making Grok-1 open-source, xAI is taking significant strides with Grok-1.5, which promises substantial improvements in AI benchmarks across coding, math, and language understanding tasks. According to xAI, Grok-1.5 has demonstrated a 50.6% score on the MATH benchmark and an impressive 90% on the GSM8K benchmark, indicative of its superior math problem-solving skills. Moreover, it has achieved a 74.1% score on the HumanEval benchmark, underscoring its advanced code generation and problem-solving capabilities.

Notably, Grok-1.5’s score of 81.3% on the MMLU benchmark, which assesses language understanding across various tasks, represents a significant leap from Grok-1’s 73%. This achievement signifies Grok-1.5’s enhanced comprehension skills and its capacity to process up to 128,000 tokens, allowing it to handle complex prompts and analyse long documents more effectively than its predecessor.

Nearing the zenith of AI performance

Grok-1.5’s advancements place it in close competition with other significant LLMs, though it still trails behind some on benchmarks like the MMLU and GSM8K. Despite these gaps, Grok-1.5 leads in the HumanEval benchmark, excluding its performance against Claude 3 Opus. With continuous improvements, xAI anticipates that the forthcoming Grok-2 will surpass current AI models on all metrics, according to Elon Musk.

Unveiling to the world

Set for deployment next week, Grok-1.5 will initially be available to early testers and Grok chatbot users on the X platform. This phased rollout aims to enhance the model further and introduce new features, catering to a broader user base over time. Musk’s strategy to integrate Grok into the X platform, coupled with subscription benefits for specific users, underscores his vision for widespread adoption of both the AI model and the platform.

In conclusion, Grok-1.5 represents a significant leap forward in AI, bringing us closer to models that can understand and solve complex problems with near-human proficiency. With its deployment, users can look forward to engaging with an AI that pushes the boundaries of what’s possible, marking a new era in the evolution of artificial intelligence.

Hot this week

Southeast Asia’s Agnes AI partners with Agora to launch real-time AI workspace

Agnes AI and Agora launch a real-time AI workspace that connects human teams and AI agents for collaborative work at scale.

Singapore sees surge in ransomware attacks during holidays, Semperis study finds

A new Semperis study shows 59% of ransomware attacks in Singapore occur during holidays, driven by reduced staffing and major corporate events.

Sumsub reports sharp rise in synthetic personal data fraud in APAC

Sumsub reports a sharp rise in synthetic identity fraud and deepfake attacks across APAC as AI-driven scams become more sophisticated.

Asia’s boards place AI and digital transformation at the top of 2026 priorities

Nearly half of Asia’s governance leaders plan to prioritise AI in 2026 as digital transformation reshapes board agendas.

Sony announces December PS Plus Monthly Games lineup featuring five titles

Sony unveils a five-game PS Plus lineup for December, including Lego Horizon Adventures, Neon White, and several horror titles.

Meta and Google reportedly close to landmark AI chip agreement

Meta is in talks with Google on a major AI chip deal that could reshape the competitive landscape across cloud and hardware markets.

IBM expands Storage Scale System 6000 to support full-rack capacity of 47PB

IBM expands its Storage Scale System 6000 to a full-rack capacity of 47PB, boosting performance for AI, supercomputing, and large-scale data workloads.

DJI Osmo Pocket 4 leak suggests launch may be imminent

DJI’s Osmo Pocket 4 appears in FCC filings, hinting at an imminent launch amid rumours of new features and a possible US product ban.

DeepSeek launches open AI model achieving gold-level scores at the Maths Olympiad

DeepSeek launches Math-V2, the first open AI model to achieve gold-level scores at the International Mathematical Olympiad.

Related Articles

Popular Categories