Robotics

Google DeepMind Ships Three Physical AI Models For Whole Body Control, Dexterity And Multi Robot Collaboration

Robotics | 17 hours ago | 1 View | 0 Likes | 0 Comments

Google DeepMind has released Gemini Robotics 2, the intelligence layer for its next generation of robots. The release moves the stack past tabletop manipulation into whole-body control, five-finger dexterity and multi-robot teamwork. It ships as three separate models with three different access tiers. Most robots today are pre-programmed or tele-operated for narrow,…

NVIDIA Releases Cosmos 3 Edge: A 4B-Parameter Open World Model That Reasons and Generates Robot Actions On-Device

Robotics | July 21, 2026 | 9 Views | 0 Likes | 0 Comments

NVIDIA has released Cosmos 3 Edge, a 4-billion-parameter open world model built to run on-device. It helps robots and vision AI agents understand their surroundings, reason in real time, and generate robot actions locally. The Cosmos 3 family included Cosmos 3 Nano (16B) and Cosmos 3 Super (64B), which shipped on May 31, 2026 at GTC Taipei.…

Mistral AI Releases Robostral Navigate: An 8B Model Enabling Robots to Navigate Complex Environments Using a Single RGB Camera

Robotics | July 16, 2026 | 14 Views | 0 Likes | 0 Comments

Mistral AI has released Robostral Navigate, its first model built for embodied navigation. The 8B model takes RGB images and a plain-language instruction, then moves a robot. Notably, it reaches 76.6% success on R2R-CE validation unseen using only a single RGB camera. What is Robostral Navigate? Robostral Navigate is an 8B model for robotic navigation…

Ant Group’s Robbyant Unveils LingBot-VA 2.0: A Causal Video-Action Model Built Natively for Physical AI

Robotics | July 11, 2026 | 12 Views | 0 Likes | 0 Comments

Robbyant, the embodied AI unit inside Ant Group, has released LingBot-VA 2.0, billed as the first embodied-native foundation model. It is a video-action foundation model for generalist robot manipulation. The research team pretrains the whole stack for embodiment instead of fine-tuning a video generator. What is LingBot-VA 2.0? Most video-action models reuse two components built for digital…

NVIDIA AI Introduces ASPIRE: A Self-Improving Robotics Framework Reaching 31% Zero-Shot on LIBERO-Pro Long Tasks

Robotics | July 6, 2026 | 13 Views | 0 Likes | 0 Comments

Traditional robot programming is hard to scale. It requires orchestrating multimodal perception, physical contact dynamics, diverse configurations, and execution failures by hand. Code-as-policy systems let language models compose these into executable robot programs. That makes robot behavior inspectable, editable, and debuggable. But existing robotic coding agents run in naive execution environments. They receive only coarse,…

Meet OAT: The New Action Tokenizer Bringing LLM-Style Scaling and Flexible, Anytime Inference to the Robotics World

Robotics | July 1, 2026 | 24 Views | 0 Likes | 0 Comments

Robots are entering their GPT-3 era. For years, researchers have tried to train robots using the same autoregressive (AR)…

NVIDIA Releases DreamDojo: An Open-Source Robot World Model Trained on 44,711 Hours of Real-World Human Video Data

Robotics | June 26, 2026 | 12 Views | 0 Likes | 0 Comments

Building simulators for robots has been a long-term challenge. Traditional engines require manual coding of physics and perfect 3D models. NVIDIA is changing this with DreamDojo, a fully open-source, generalizable robot world model. Instead of using a physics engine, DreamDojo ‘dreams’ the results of robot actions directly in pixels. https://arxiv.org/pdf/2602.06949 Scaling Robotics with 44k+…

Physical Intelligence Team Unveils MEM for Robots: A Multi-Scale Memory System Giving Gemma 3-4B VLAs 15-Minute Context for Complex Tasks

Robotics | June 21, 2026 | 24 Views | 0 Likes | 0 Comments

Current end-to-end robotic policies, specifically Vision-Language-Action (VLA) models, typically operate on a single observation or a very short history. This ‘lack of memory’ makes long-horizon tasks, such as cleaning a kitchen or following a complex recipe, computationally intractable or prone to failure. To address this, researchers from Physical Intelligence, Stanford, UC Berkeley, and MIT have…

Meet Qwen-RobotSuite: Three Embodied AI Models for VLA Manipulation, Video World Modeling, and Navigation

Robotics | June 16, 2026 | 19 Views | 0 Likes | 0 Comments

The Qwen team has released three embodied AI models, grouped as Qwen-RobotSuite. The three are Qwen-RobotManip, Qwen-RobotWorld, and Qwen-RobotNav. Each is built on a Qwen vision-language backbone and targets a different robotics problem. Qwen-RobotManip is a Vision-Language-Action model for manipulation, built on Qwen3.5-4B. Qwen-RobotWorld is a language-conditioned video world model with a 60-layer MMDiT and…

How to Build Advanced Cybersecurity AI Agents with CAI Using Tools, Guardrails, Handoffs, and Multi-Agent Workflows

Robotics | June 11, 2026 | 18 Views | 0 Likes | 0 Comments

In this tutorial, we build and explore the CAI Cybersecurity AI Framework step by step in Colab using an OpenAI-compatible model. We begin by setting up the environment, securely loading the API key, and creating a base agent. We gradually move into more advanced capabilities such as custom function tools, multi-agent handoffs, agent orchestration, input…