Elon Musk's xAI has revealed Grok-1.5V, its first multimodal model, which can process a range of visual inputs in addition to text. According to xAI, Grok-1.5V surpasses competitors on the RealWorldQA benchmark, marking a notable step in the technology's development. Despite this progress, concerns have arisen about Grok's integration into X amid criticism that the chatbot generates misinformation.
Grok-1.5V: Pioneering Multimodal Model Surpasses Competitors, Sets New Standard in RealWorldQA
The announcement comes weeks after xAI unveiled its improved Grok-1.5 chatbot model.
“Grok-1.5V is competitive with existing frontier multimodal models in a number of domains, ranging from multi-disciplinary reasoning to understanding documents, science diagrams, charts, screenshots, and photographs,” the company said in a blog post.
According to a recent VentureBeat report, the company provides seven examples that demonstrate Grok-1.5V's capabilities. These include translating a hand-drawn flowchart into Python code, creating a bedtime story from a child's drawing, converting a table into CSV format, and determining whether the wood on a deck needs to be replaced due to rot.
xAI asserts that its multimodal model outperforms contemporaries GPT-4V, Claude 3 Sonnet, Claude 3 Opus, and Gemini Pro 1.5 in testing. It takes particular pride in the fact that Grok-1.5V beats its rivals on RealWorldQA, a new benchmark it developed to assess spatial understanding of the real world.
The initial release of RealWorldQA consists of over 700 images, each paired with a question and an answer. The images include anonymized pictures taken from vehicles as well as other real-world samples. xAI is making RealWorldQA available to the public under a Creative Commons license.
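Because the benchmark pairs each image with a question and an answer, scoring a model against it can be pictured as a simple loop. The following is a minimal sketch, not xAI's evaluation code: the `QAItem` record, the `normalize` helper, and the `accuracy` function are hypothetical names, the exact-match rule is an assumption, and `predict` stands in for whatever multimodal model is being tested.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class QAItem:
    """One benchmark entry (hypothetical layout): an image plus a question and its answer."""
    image_path: str
    question: str
    answer: str


def normalize(text: str) -> str:
    """Lowercase, trim, drop trailing periods, and collapse whitespace before comparing."""
    return " ".join(text.strip().lower().rstrip(".").split())


def accuracy(items: Iterable[QAItem], predict: Callable[[str, str], str]) -> float:
    """Fraction of items where the model's answer matches the reference after normalization.

    `predict(image_path, question)` is a stand-in for a call to the model under test.
    """
    items = list(items)
    if not items:
        raise ValueError("no items to score")
    correct = sum(
        normalize(predict(item.image_path, item.question)) == normalize(item.answer)
        for item in items
    )
    return correct / len(items)
```

Exact matching is only reasonable when answers are short and easy to verify; free-form answers would need a more forgiving comparison.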
Since its chatbot debuted in November 2023, Musk's AI startup has been striving to keep pace with OpenAI and other industry heavyweights. Less than a month has passed since xAI released Grok-1 as open-source software. Its efforts, however, have not been without criticism. Researchers found earlier this month that the Grok chatbot could instruct users on illegal activities.
Nevertheless, xAI's goal is to create "beneficial [artificial general intelligence]" that can understand the universe. The company says it expects to make "significant" improvements to Grok's multimodal understanding and generation capabilities in the coming months.
Grok-1.5V Unveiled: Elon Musk's AI Chatbot Breaks New Ground in Visual Understanding
Elon Musk's AI chatbot can now "understand" visuals, even charts and diagrams that are packed with information, according to a Mashable SEA report.
The firm introduced Grok-1.5V, or Grok 1.5 "Vision," as its "first-generation multimodal model." According to the company, the bot can reason through complex text, science diagrams, charts, screenshots, and photographs, and respond to the images users provide. Grok-1.5V is also meant to have "real-world spatial understanding" to improve its comprehension of the physical world as depicted in the photographs users upload.
"Advancing both our multimodal understanding and generation capabilities are important steps in building beneficial AGI that can understand the universe," the company wrote in its announcement. "In the coming months, we anticipate to make significant improvements in both capabilities, across various modalities such as images, audio, and video."
Example use cases include turning a child's drawing into a bedtime story, identifying the largest object in a group, and telling a driver whether there is enough room to get past an obstruction.
The RealWorldQA image and prompt dataset, created by xAI, is made available with Grok-1.5V to evaluate other GenAI models against Grok's real-world reasoning.
But competition is the least of Grok's concerns. Despite xAI's ongoing investment, Grok has struggled to hold on to early adopters and employees; according to a recent Fortune article, developers find the sluggish xAI API difficult to use.
In the same Fortune piece this week, X employees expressed worries over Musk's suggestion that Grok write posts on behalf of paying users, an idea he has pursued despite cautions from staff and developers. Grok also faced criticism last week for fabricating news headlines describing an alternate reality in which Iran had repeatedly attacked Tel Aviv with a military arsenal.
While it is common for GenAI chatbots to hallucinate and produce fake news, Grok's error points to yet another problem with the platform as a whole. The bot, Musk's answer to ChatGPT, is being integrated into a platform that has gradually weakened its safeguards against malicious AI.
Given X's generally poor moderation record and the CEO's reluctance to correct false material, in deference to the site's "citizen journalists," Grok is in a precarious position in the platform's overrun information ecosystem.





