- Santa Clara
- https://dragonlong.github.io/
- @lxiaol9
Stars
Real-time, high-frequency control interfaces for various robot arms, including the bimanual I2RT YAM and Franka Panda, with manual teleoperation or autonomous policy control
Official code for paper: N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
Spirit-v1.5: A Robotic Foundation Model by Spirit AI
Code Release of MVInverse: Feedforward Multi-view Inverse Rendering in Seconds
End-to-end pipeline converting generative videos (Veo, Sora) to humanoid robot motions
Code for [AAAI 2026] AffordDex: Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors
Official implementation of "Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation"
A general physics-based retargeting framework.
Towards Unified Latent VLA for Whole-body Loco-manipulation Control
Minimalistic 4D-parallelism distributed training framework for educational purposes
Team Comet's 2025 BEHAVIOR Challenge Codebase
Cosmos-Reason2 models understand physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.
Unified high-performance Python client for object and file stores.
InternVLA-A1: Unifying Understanding, Generation, and Action for Robotic Manipulation
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
VITRA: Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance…
DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images
InternRobotics' open platform for building generalized navigation foundation models.
GigaWorld-0: World Models as Data Engine to Empower Embodied AI
[RSS 2025] TactAR teleoperation app in "Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation"
[RSS 2025] Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation
Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
robomimic: A Modular Framework for Robot Learning from Demonstration
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning

