Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games Paper • 2606.19338 • Published 5 days ago • 46
JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence Paper • 2606.14777 • Published 12 days ago • 196
OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data Paper • 2606.13432 • Published 11 days ago • 106
CoVEBench: Can Video Editing Models Handle Complex Instructions? Paper • 2606.08415 • Published 15 days ago • 49
Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories Paper • 2606.02060 • Published 21 days ago • 55
SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing Paper • 2604.19587 • Published Apr 21 • 49
LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV Paper • 2605.26244 • Published 28 days ago • 38
DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo Paper • 2605.16257 • Published May 15 • 55
MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents Paper • 2605.09530 • Published May 10 • 149
Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published May 7 • 236
MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published Apr 20 • 47
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation Paper • 2604.18486 • Published Apr 20 • 96
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published Apr 20 • 87
DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation Paper • 2604.14683 • Published Apr 16 • 36
Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization Paper • 2604.12290 • Published Apr 14 • 16
Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization Paper • 2604.12290 • Published Apr 14 • 16