Visual Large Language Models

Study Shows Today’s Top AI Models Struggle With Visual Reasoning—Raising Concerns for Real-World Use

New study reveals top AI models still struggle with visual reasoning, exposing hidden weaknesses in today’s multimodal ...

12d

Z.ai debuts open source GLM-4.6V, a native tool-calling vision model for multimodal reasoning

Chinese AI startup Zhipu AI, aka Z.ai, has released its GLM-4.6V series, a new generation of open-source vision-language models ...

WinBuzzer

Z.ai Launches GLM-4.6V AI Model to Let AI Agents See Natively

GLM-4.6V, a multimodal model that has introduced native visual function calling to bypass text conversion in agentic workflows.

InfoQ

LLaVA-CoT Shows How to Achieve Structured, Autonomous Reasoning in Vision Language Models


Futurism

Large Language Models Will Never Be Intelligent, Expert Says

Are tech companies on the verge of creating thinking machines with their tremendous AI models, as top executives claim they are? Not according to one expert. We humans tend to associate language with ...

EurekAlert!

New multi-modal AI framework brings human-like reasoning to self-driving vehicles

Autonomous driving systems increasingly rely on data-driven approaches, yet many still struggle with reasoning, handling rare scenarios, and transparently explaining their actions. A new study ...
