
2026-03-11 · AI Daily Briefing

2026-03-11 18:36

robotdaily ai-daily embodied embodied-ai representation representation-learning reinforcement reinforcement-learning llm

Hugo archive edition, sourced from RobotDaily's Markdown briefing for the day.

RobotDaily 2026-03-11: 9 papers in total, including 3 on embodied AI, 3 on representation learning, and 3 on reinforcement learning.

An application-oriented selection, organized by research area into a short-card Markdown archive.

Embodied AI (3 papers)

1. PlayWorld: Learning Robot World Models from Autonomous Play

Keyword hits: real world, deployed, world model, scalable; application signals: real world, deployed, robot; innov…

Authors: Tenny Yin, Zhiting Mei, Zhonghe Zheng, Miyu Yamane, and 7 others
Tags: embodied AI, robotics, real-world deployment, manipulation
Summary: [LLM temporarily unavailable; key points of the English abstract retained for now] Action-conditioned video models offer a promising path to building general-purpose robot simulators that can improve directly from data. Yet, despite training on large-scale robot datasets, current s…
Links: DOI | arXiv | PDF

2. MetaWorld-X: Hierarchical World Modeling via VLM-Orchestrated Experts for Humanoid Loco-Manipulation

Keyword hits: robot, robotic, world model; application signals: robot, robotic, system; innovation signals: world model; domain match…

Authors: Yutong Shen, Hangxu Liu, Penghui Liu, Jiashuo Luo, and 5 others
Tags: embodied AI, robotics, real-world deployment, manipulation
Summary: [LLM temporarily unavailable; key points of the English abstract retained for now] Learning natural, stable, and compositionally generalizable whole-body control policies for humanoid robots performing simultaneous locomotion and manipulation (loco-manipulation) remains a fundament…
Links: DOI | arXiv | PDF

3. Embodied Human Simulation for Quantitative Design and Analysis of Interactive Robotics

Keyword hits: robot, robotic, scalable; application signals: robot, robotic, system; innovation signals: scalable; domain match: embo…

Authors: Chenhui Zuo, Jinhao Xu, Michael Qian Vergnolle, Yanan Sui
Tags: embodied AI, robotics, real-world deployment, manipulation
Summary: [LLM temporarily unavailable; key points of the English abstract retained for now] Physical interactive robotics, ranging from wearable devices to collaborative humanoid robots, require close coordination between mechanical design and control. However, evaluating interactive dynami…
Links: DOI | arXiv | PDF

Representation Learning (3 papers)

1. $M^2$-Occ: Resilient 3D Semantic Occupancy Prediction for Autonomous Driving with Incomplete Camera Inputs

Keyword hits: real-world, deployment, first; application signals: real-world, deployment, system; innovation signals: first; …

Authors: Kaixin Lin, Kunyu Peng, Di Wen, Yufan Chen, and 2 others
Tags: representation learning, latent space, world models, pre-training
Summary: [LLM temporarily unavailable; key points of the English abstract retained for now] Semantic occupancy prediction enables dense 3D geometric and semantic understanding for autonomous driving. However, existing camera-based approaches implicitly assume complete surround-view observat…
Links: DOI | arXiv | PDF

2. Emerging Extrinsic Dexterity in Cluttered Scenes via Dynamics-aware Policy Learning

Keyword hits: real-world, real world, world model; application signals: real-world, real world, deployment; innovation…

Authors: Yixin Zheng, Jiangran Lyu, Yifan Zhang, Jiayi Chen, and 7 others
Tags: representation learning, latent space, world models, pre-training
Summary: [LLM temporarily unavailable; key points of the English abstract retained for now] Extrinsic dexterity leverages environmental contact to overcome the limitations of prehensile manipulation. However, achieving such dexterity in cluttered scenes remains challenging and underexplored…
Links: DOI | arXiv | PDF

3. From Semantics to Pixels: Coarse-to-Fine Masked Autoencoders for Hierarchical Visual Understanding

Keyword hits: dataset, self-supervised, first; application signals: dataset; innovation signals: self-supervised, first; domain match…

Authors: Wenzhao Xiang, Yue Wu, Hongyang Yu, Feng Gao, and 2 others
Tags: representation learning, latent space, world models, pre-training
Summary: [LLM temporarily unavailable; key points of the English abstract retained for now] Self-supervised visual pre-training methods face an inherent tension: contrastive learning (CL) captures global semantics but loses fine-grained detail, while masked image modeling (MIM) preserves lo…
Links: DOI | arXiv | PDF

Reinforcement Learning (3 papers)

1. SPAARS: Safer RL Policy Alignment through Abstract Exploration and Refined Exploitation of Action Space

Keyword hits: robot, robotic; application signals: robot, robotic; domain match: reinforcement learning, policy gradie…

Authors: Swaminathan S K, Aritra Hazra
Tags: reinforcement learning, policy optimization, reward design, offline RL
Summary: [LLM temporarily unavailable; key points of the English abstract retained for now] Offline-to-online reinforcement learning (RL) offers a promising paradigm for robotics by pre-training policies on safe, offline demonstrations and fine-tuning them via online interaction. However, a…
Links: DOI | arXiv | PDF

2. Robust Regularized Policy Iteration under Transition Uncertainty

Keyword hits: benchmark, unified; application signals: benchmark; innovation signals: unified; domain match: reinforcement learning,…

Authors: Hongqiang Lin, Zhenghui Fu, Weihao Tang, Pengfei Wang, and 3 others
Tags: reinforcement learning, policy optimization, reward design, offline RL
Summary: [LLM temporarily unavailable; key points of the English abstract retained for now] Offline reinforcement learning (RL) enables data-efficient and safe policy learning without online exploration, but its performance often degrades under distribution shift. The learned policy may vis…
Links: DOI | arXiv | PDF

3. Towards Batch-to-Streaming Deep Reinforcement Learning for Continuous Control

Keyword hits: benchmark, hardware, novel; application signals: benchmark, hardware, sim2real; innovation signals: novel; domain match…

Authors: Riccardo De Monte, Matteo Cederle, Gian Antonio Susto
Tags: reinforcement learning, policy optimization, reward design, offline RL
Summary: [LLM temporarily unavailable; key points of the English abstract retained for now] State-of-the-art deep reinforcement learning (RL) methods have achieved remarkable performance in continuous control tasks, yet their computational complexity is often incompatible with the constrain…
Links: DOI | arXiv | PDF