ARX Robot Control System
This skill provides expert guidance for working with the ARX robotic manipulation system, built on the mobile-aloha and act-plus-plus frameworks. The system implements Action Chunking with Transformers (ACT) for imitation learning on a dual-arm robotic platform.
What This Skill Does
Assists developers working with the ARX dual-arm robot platform by:
- Understanding the three-phase workflow (data collection → training → inference)
- Navigating the codebase structure and key components
- Running data collection, training, and inference pipelines
- Configuring cameras, CAN bus communication, and ROS2 integration
- Debugging hardware setup and multi-process coordination
- Managing conda environments and dependencies

Instructions
1. Repository Structure Understanding
When users ask about code organization:
**act/**: Core ACT implementation
- `collect.py`: Data collection from robot sensors/cameras
- `train.py`: Neural network training for imitation learning
- `inference.py`: Real-time robot control with trained models
- `robomimic/`: Robotics learning framework integration
- `detr/`: DETR transformer architecture for vision
- `utils/`: Policy utilities, ROS operations, data handling
**realsense/**: Intel RealSense camera integration (3 cameras: head, left, right)
- Configured for 640x480@90fps color and depth streams
**tools/**: Automation scripts
- `01_collect.sh`: Automated data collection
- `02_train.sh`: Training pipeline
- `03_inference.sh`: Inference deployment
**ARX_CAN/**: CAN bus communication
**ROS2/**: ROS2 workspace for robot control
**arx_joy/**: Joystick controller integration

2. Environment Setup
When users need to set up the development environment:
```bash
# Activate the conda environment
conda activate act

# Install dependencies
pip install -r tools/IL/requirements.txt
```
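After installation, a quick sanity check can confirm that the key packages resolve. This is a sketch using only the standard library; the package list mirrors the dependencies listed below and should be adjusted to match your `requirements.txt`:

```python
"""Sanity-check that key dependencies for the act environment import."""
import importlib.util


def missing_packages(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    required = ["torch", "torchvision", "cv2", "rclpy", "h5py", "numpy"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All key dependencies found.")
```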
Key dependencies:
- PyTorch with CUDA support
- ROS2 (rclpy)
- OpenCV, torchvision
- mujoco, dm_control
- h5py, numpy==1.26

3. Data Collection Workflow
When users want to collect training data:
**Automated approach:**
```bash
./tools/01_collect.sh
```
**Manual approach:**
```bash
cd act
python collect.py --episode_idx -1 --num_episodes 20
```
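The `--episode_idx -1` flag suggests auto-indexing of episodes. Assuming episodes are saved as `episode_<n>.hdf5` (the mobile-aloha naming convention — verify against your `collect.py`), the next free index can be computed like this:

```python
"""Sketch: pick the next free episode index, assuming mobile-aloha-style
file names episode_<n>.hdf5 in the data directory."""
import re
from pathlib import Path


def next_episode_idx(data_dir):
    """Return max existing episode index + 1, or 0 if the directory is empty."""
    pattern = re.compile(r"episode_(\d+)\.hdf5$")
    indices = [
        int(m.group(1))
        for p in Path(data_dir).glob("episode_*.hdf5")
        if (m := pattern.search(p.name))
    ]
    return max(indices) + 1 if indices else 0
```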
Explain that collection requires:
- CAN bus communication running
- ROS2 controllers active
- All 3 RealSense cameras streaming
- Joystick controller connected
- Camera topics: `/camera/camera_{h,l,r}/color/image_rect_raw/compressed`
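Before launching collection, a minimal preflight check can verify the hardware is visible. The device paths below are typical Linux locations (CAN interfaces under `/sys/class/net/`, joysticks at `/dev/input/js0`) and are assumptions — adjust them for your setup:

```python
"""Sketch: preflight check for CAN interface and joystick visibility.
Paths are typical Linux locations; adjust for your hardware."""
from pathlib import Path


def preflight(can_iface="can0", joystick="/dev/input/js0"):
    """Return a dict mapping each prerequisite to whether it is present."""
    return {
        "can_interface": Path(f"/sys/class/net/{can_iface}").exists(),
        "joystick": Path(joystick).exists(),
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```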
4. Training Workflow
When users want to train models:
**Automated approach:**
```bash
./tools/02_train.sh
```
**Manual approach:**
```bash
cd act
python train.py --num_episodes -1
```
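Before training, ACT-style pipelines compute per-dimension mean/std over the demonstration data to normalize joint positions and actions (act-plus-plus does this in its dataset utilities). A minimal sketch, assuming episodes are already loaded as NumPy arrays:

```python
"""Sketch: per-dimension normalization statistics over demonstration episodes,
in the style of act-plus-plus dataset utilities."""
import numpy as np


def norm_stats(actions, eps=1e-2):
    """Mean/std across all timesteps of all episodes.

    `actions` is a list of (T_i, action_dim) arrays; std is clipped away
    from zero so constant dimensions don't blow up normalization.
    """
    flat = np.concatenate(actions, axis=0)
    mean = flat.mean(axis=0)
    std = np.clip(flat.std(axis=0), eps, np.inf)
    return mean, std

# Usage: normalized_episode = (episode - mean) / std
```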
Training requirements:
- GPU with CUDA support
- Collected demonstration data in HDF5 format
- Sufficient disk space for checkpoints

5. Inference/Deployment
When users want to run inference:
**Automated approach:**
```bash
./tools/03_inference.sh
```
**Manual approach:**
```bash
cd act
python inference.py
```
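Under the hood, ACT predicts chunks of future actions at each step, and act-plus-plus optionally blends overlapping predictions with temporal ensembling, weighting each prediction by `exp(-m * i)` where `i` is its age. A sketch of that aggregation step (function name and list layout are illustrative):

```python
"""Sketch: temporal ensembling of overlapping action-chunk predictions,
as described for ACT in act-plus-plus."""
import numpy as np


def ensemble_action(predictions, m=0.01):
    """Blend all chunk predictions covering the current timestep.

    `predictions` is a list of action vectors for the current step,
    ordered oldest first; weight exp(-m*i) gives the oldest prediction
    the largest weight.
    """
    preds = np.stack(predictions)
    weights = np.exp(-m * np.arange(len(preds)))
    weights /= weights.sum()
    return (weights[:, None] * preds).sum(axis=0)
```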
Inference requires:
- Trained model checkpoint
- Camera streams active
- Robot controllers running
- Real-time performance (GPU recommended)

6. Camera Management
When users need to manage RealSense cameras:
```bash
cd realsense
./realsense.sh
```
Cameras are hardcoded with specific serial numbers for:
- Head camera
- Left camera
- Right camera

All cameras stream at 640x480@90fps.
7. Configuration Files
Point users to key configuration locations:
**Main config**: `act/data/config.yaml`
- Camera topics
- Arm controller topics (`/arm_master_{l,r}_status`, `/arm_slave_{l,r}_status`)
- Robot base configuration
**Code style**: `.flake8`
- Max line length: 120 characters
- Run checks: `flake8 <file>`
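After editing `config.yaml`, a quick pattern check can catch typos in the camera topic names. This assumes the config yields a flat list of topic strings (a hypothetical structure — check the actual keys in your `config.yaml`):

```python
"""Sketch: validate camera topic names against the expected pattern."""
import re

TOPIC_RE = re.compile(r"^/camera/camera_[hlr]/color/image_rect_raw/compressed$")


def invalid_camera_topics(topics):
    """Return the topics that do not match the expected naming pattern."""
    return [t for t in topics if not TOPIC_RE.match(t)]
```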
8. Multi-Process Coordination
Explain that the system uses gnome-terminal for coordinating multiple processes:
- CAN bus communication
- ROS2 controllers
- Camera nodes
- Data collection/inference processes
- Joystick input handling

9. Hardware Requirements
When discussing system requirements:
- ARX dual-arm robot platform
- 3x Intel RealSense cameras (specific serial numbers)
- CAN bus interface
- GPU with CUDA support for training/inference
- Joystick controller for teleoperation

10. Debugging Guidance
Common issues to check:
- Camera serial numbers match hardware in `realsense.sh`
- All processes launched successfully in the multi-terminal setup
- CUDA/GPU availability for training and inference
- ROS2 topics publishing correctly
- CAN bus communication established

11. Code Conventions
- Codebase primarily in Chinese with some English comments
- Follow flake8 style guidelines (max line length 120)
- Neural network architecture based on DETR transformers
- Data format: HDF5 for episode storage

Examples
**Example 1: Starting data collection**
User asks: "How do I collect training data?"

Response:
```bash
./tools/01_collect.sh
```
This automated script will:
1. Start CAN bus communication
2. Launch ROS2 controllers
3. Start RealSense cameras
4. Begin data collection with teleoperation
**Example 2: Training a model**
User asks: "I have 50 episodes collected, how do I train?"

Response:
```bash
cd act
python train.py --num_episodes 50
```
Or use the automated script:
```bash
./tools/02_train.sh
```
**Example 3: Configuration check**
User asks: "Where do I configure camera topics?"

Response: Edit `act/data/config.yaml`. Camera topics follow the pattern:
```
/camera/camera_h/color/image_rect_raw/compressed  (head)
/camera/camera_l/color/image_rect_raw/compressed  (left)
/camera/camera_r/color/image_rect_raw/compressed  (right)
```
Important Constraints
- **Hardware-specific**: System expects specific ARX robot hardware and RealSense camera serial numbers
- **Multi-terminal requirement**: Scripts use gnome-terminal for process coordination
- **GPU required**: Training and inference need CUDA support
- **ROS2 dependency**: All robot control goes through ROS2 topics
- **Real-time constraints**: Inference must run at a sufficient frame rate for robot control

When to Use This Skill
Use this skill when:
- Working with ARX dual-arm robot platforms
- Implementing imitation learning for robotics
- Setting up data collection pipelines for robot manipulation
- Training ACT models for robot control
- Deploying learned policies for real-time inference
- Debugging ROS2/camera/CAN bus integration issues
- Understanding mobile-aloha/act-plus-plus architectures