zhumuzhi

Z-MU-Z

AI & ML interests

None yet

Recent Activity

new activity 4 days ago

ReasonMatch/ReasonMatch: Add task category and improve dataset card

upvoted a paper 1 day ago

MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism

Paper • 2606.07512 • Published 7 days ago • 36

upvoted a paper 3 days ago

Latent Spatial Memory for Video World Models

Paper • 2606.09828 • Published 4 days ago • 62

upvoted a paper 7 days ago

Eliciting Complex Spatial Reasoning in MLLMs through Wide-Baseline Matching

Paper • 2606.03577 • Published 10 days ago • 16

upvoted a paper 8 days ago

AGILE: Hand-Object Interaction Reconstruction from Video via Agentic Generation

Paper • 2602.04672 • Published Feb 4 • 1

upvoted a paper 10 days ago

Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration?

Paper • 2606.01247 • Published 12 days ago • 30

upvoted a paper 16 days ago

TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction

Paper • 2605.26115 • Published 18 days ago • 52

upvoted a paper 25 days ago

Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization

Paper • 2605.15980 • Published 28 days ago • 36

upvoted 2 papers about 1 month ago

MARBLE: Multi-Aspect Reward Balance for Diffusion RL

Paper • 2605.06507 • Published May 7 • 40

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

Paper • 2604.24764 • Published Apr 27 • 118

upvoted a paper about 2 months ago

Exploring Spatial Intelligence from a Generative Perspective

Paper • 2604.20570 • Published Apr 22 • 23

upvoted 2 papers 2 months ago

OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering

Paper • 2604.08209 • Published Apr 9 • 26

InCoder-32B-Thinking: Industrial Code World Model for Thinking

Paper • 2604.03144 • Published Apr 3 • 235

upvoted a paper 3 months ago

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published Mar 17 • 110

upvoted 3 papers 4 months ago

OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention

Paper • 2602.05847 • Published Feb 5 • 12

LLaDA2.1: Speeding Up Text Diffusion via Token Editing

Paper • 2602.08676 • Published Feb 9 • 73

Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO

Paper • 2602.06422 • Published Feb 6 • 47

upvoted a paper 5 months ago

Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models

Paper • 2601.07351 • Published Jan 12 • 26

upvoted a paper 6 months ago

Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality

Paper • 2512.07951 • Published Dec 8, 2025 • 51

upvoted a paper 7 months ago

Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published Oct 30, 2025 • 117

upvoted a collection 7 months ago

Emu3.5

Collection

Native Multimodal Models are World Learners 🌍 • 4 items • Updated Feb 4 • 77
