A lightweight command interface for executing Kinova Gen 3 7-DoF arm skills using typed natural-language-style commands, Cartesian motion primitives, gripper control, and ArUco-based object pose detection.
This project demonstrates a simple control pipeline where a user can type commands such as:
open
close
move left 10 cm
move right 10 cm
move up 5 cm
pick up the box
quit
The command server parses each instruction and executes the corresponding robot behavior.
The demo video shows an interactive command session where typed commands are issued from a laptop and executed on the real robot.
demo2.5x.mp4
The system demonstrates:
- opening and closing the gripper
- relative Cartesian motion commands
- camera-based object pose detection using an ArUco marker
- a high-level pick sequence
- real robot execution on a Kinova arm
The full demo video is available in media.
The project combines several basic robotics components into a single command-driven system:
- Command parsing — the main command server receives typed instructions and maps them to robot actions.
- Gripper control — the robot gripper can be opened or closed through simple commands.
- Cartesian motion — the robot can move in relative directions such as left, right, forward, backward, up, and down.
- Vision-based object localization — a RealSense camera and ArUco marker are used to estimate the object pose in the robot base frame.
- Pick skill execution — the system computes approach, grasp, and lift poses and executes a simple pick-and-lift sequence.
This is not intended to be a full natural-language robot reasoning system. Instead, it is a practical rule-based command interface for controlling predefined robot skills.
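As a rough illustration of the rule-based approach, the sketch below maps a typed command to an action name using keyword checks and one regular expression. The action names (`open_gripper`, `move_absolute`, and so on) and the `classify` function are hypothetical and are not taken from `command_server.py`; the real parser may be organized differently.

```python
import re
from typing import Optional, Tuple

# Matches the "x=0.4 y=0.1 z=0.3" part of an absolute move command.
ABS_TARGET = re.compile(
    r"x\s*=\s*(-?\d+(?:\.\d+)?)\s+"
    r"y\s*=\s*(-?\d+(?:\.\d+)?)\s+"
    r"z\s*=\s*(-?\d+(?:\.\d+)?)"
)


def classify(command: str) -> Tuple[str, Optional[tuple]]:
    """Map a typed command to (action, args). Action names are hypothetical."""
    text = command.strip().lower()
    if text in {"quit", "exit", "stop"}:
        return "quit", None
    if text.startswith("open"):
        return "open_gripper", None
    if text.startswith("close"):
        return "close_gripper", None
    if text.startswith("pick"):
        return "pick", None
    if text.startswith("move to"):
        match = ABS_TARGET.search(text)
        if match:
            return "move_absolute", tuple(float(v) for v in match.groups())
        return "unknown", None
    if text.startswith("move"):
        return "move_relative", None
    return "unknown", None


print(classify("move to x=0.4 y=0.1 z=0.3"))
```

Anything that does not match a known pattern is reported as `unknown` rather than guessed at, which is a reasonable default when the output drives a physical robot.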
- Interactive command-line robot control
- Natural-language-style typed commands
- Rule-based command parser using keywords and regular expressions
- Kinova gripper open/close control
- Relative Cartesian movement commands
- Absolute Cartesian target commands
- RealSense camera integration
- ArUco marker pose detection
- Object pose transformation into the robot base frame
- Simple approach-grasp-lift pick sequence
- Workspace bounds for safer relative motion
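The workspace-bounds idea can be sketched as a check applied to the target of a relative move before it is sent to the robot. The bound values, the `WORKSPACE_BOUNDS` name, and both helper functions below are assumptions for illustration; the limits used in the repository may differ and must be chosen for the actual setup.

```python
from typing import Dict, Tuple

# Hypothetical bounds in meters, expressed in the robot base frame.
# Real limits depend on how the arm is mounted and what surrounds it.
WORKSPACE_BOUNDS: Dict[str, Tuple[float, float]] = {
    "x": (0.20, 0.70),
    "y": (-0.40, 0.40),
    "z": (0.05, 0.60),
}


def within_bounds(position, bounds=WORKSPACE_BOUNDS) -> bool:
    """Return True if every coordinate lies inside its [low, high] range."""
    limits = (bounds["x"], bounds["y"], bounds["z"])
    return all(low <= value <= high for value, (low, high) in zip(position, limits))


def apply_relative_move(current, delta, bounds=WORKSPACE_BOUNDS):
    """Return the target position, or raise ValueError if it leaves the workspace."""
    target = tuple(c + d for c, d in zip(current, delta))
    if not within_bounds(target, bounds):
        raise ValueError(f"target {target} is outside the workspace bounds")
    return target


print(apply_relative_move((0.4, 0.0, 0.3), (0.0, 0.1, 0.0)))
```

Rejecting an out-of-bounds target is one option; clamping it to the nearest bound is another. Neither replaces collision checking.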
open
open gripper
close
close gripper
move left 10 cm
move right 10 cm
move forward 5 cm
move backward 5 cm
move up 5 cm
move down 5 cm
If no distance is provided, the default relative movement is 5 cm.
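A minimal sketch of how a relative move command could be parsed with a regular expression, including the 5 cm default. The `parse_relative_move` function and the pattern are illustrative assumptions, not the code in `command_server.py`. The sketch returns a direction name and leaves the mapping from direction to base-frame axis to the caller, since that mapping depends on the robot setup.

```python
import re
from typing import Optional, Tuple

DEFAULT_DISTANCE_M = 0.05  # 5 cm, used when the command gives no distance

# Direction is required; "<number> cm" is optional.
RELATIVE_MOVE = re.compile(
    r"^move\s+(left|right|forward|backward|up|down)"
    r"(?:\s+(\d+(?:\.\d+)?)\s*cm)?$"
)


def parse_relative_move(command: str) -> Optional[Tuple[str, float]]:
    """Return (direction, distance in meters), or None if the command does not match."""
    match = RELATIVE_MOVE.match(command.strip().lower())
    if match is None:
        return None
    direction, centimeters = match.groups()
    distance_m = float(centimeters) / 100.0 if centimeters else DEFAULT_DISTANCE_M
    return direction, distance_m


print(parse_relative_move("move left 10 cm"))
print(parse_relative_move("move up"))
```

Converting to meters at parse time keeps the rest of the pipeline in a single unit.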
move to x=0.4 y=0.1 z=0.3
pick up the box
pick box
pick up box
The current pick behavior uses the ArUco-based vision pipeline to latch an object pose and then executes a predefined approach, grasp, and lift sequence.
quit
exit
stop
language-commanded-kinova/
│
├── README.md
├── requirements.txt
│
├── src/
│ ├── command_server.py
│ ├── gripper_control.py
│ ├── move_cartesian.py
│ ├── pick.py
│ ├── grasp_utils.py
│ ├── vision_aruco.py
│ └── utilities.py
│
├── docs/
│ ├── command_reference.md
│ └── system_overview.md
│
├── configs/
│ └── camera_calibration.example.json
- Kinova Gen 3 7-DoF robot arm
- Intel RealSense camera
- Printed ArUco marker attached to or placed on the target object
- A clear robot workspace
- Emergency stop access
- Python 3.9+
- Kinova Kortex Python API
- OpenCV
- NumPy
- Intel RealSense SDK / pyrealsense2
Clone the repository:
git clone https://github.com/muslim-adedamola/language-commanded-kinova.git
cd language-commanded-kinova
Create and activate a virtual environment:
python3 -m venv venv
source venv/bin/activate
Install Python dependencies:
pip install -r requirements.txt
Install the Kinova Kortex API following the official Kinova documentation for your robot model.
From the repository root:
python src/command_server.py --ip ROBOT_IP --username ROBOT_USERNAME --password ROBOT_PASSWORD
Example:
python src/command_server.py --ip 192.168.1.10 --username admin --password admin
Once the server starts, it shows a prompt:
Command:
You can then type commands such as:
open
move left 10 cm
move up 5 cm
pick up the box
quit
The pick sequence follows this high-level process:
- Start the RealSense camera stream.
- Detect the ArUco marker in the scene.
- Estimate the object pose in the camera frame.
- Transform the object pose into the robot base frame.
- Wait for the user to press SPACE in the vision window to latch the pose.
- Compute approach, grasp, and lift poses.
- Open the gripper.
- Move to the approach pose.
- Close the gripper.
- Lift the object.
Depending on the do_grasp_move flag in the pick sequence, the current implementation can skip the explicit move to the grasp pose.
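The motion part of the sequence can be sketched as a list of steps built from the latched object position. Everything here is an assumption for illustration: the `plan_pick` function, the step names, and the approach and lift offsets are hypothetical, and only positions are handled (orientation is omitted). The `do_grasp_move` parameter mirrors the flag named above, but its real default is not stated here.

```python
from typing import List, Optional, Tuple

Position = Tuple[float, float, float]
Step = Tuple[str, Optional[Position]]


def plan_pick(
    object_position: Position,
    approach_offset: float = 0.10,  # hypothetical clearance above the object, meters
    lift_offset: float = 0.15,      # hypothetical lift height above the object, meters
    do_grasp_move: bool = True,
) -> List[Step]:
    """Build the ordered steps of a simple approach-grasp-lift pick."""
    x, y, z = object_position
    approach = (x, y, z + approach_offset)
    grasp = (x, y, z)
    lift = (x, y, z + lift_offset)

    steps: List[Step] = [("open_gripper", None), ("move", approach)]
    if do_grasp_move:
        # Descend to the object before closing; skipped when the flag is off.
        steps.append(("move", grasp))
    steps.append(("close_gripper", None))
    steps.append(("move", lift))
    return steps


for action, target in plan_pick((0.5, 0.0, 0.1)):
    print(action, target)
```

Separating planning from execution like this makes the sequence easy to inspect or log before any motion is commanded.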
The vision script contains example calibration values for:
- camera intrinsics
- distortion coefficients
- ArUco marker size
- camera-to-robot-base transform
These values are specific to the original experimental setup and should be recalibrated before using the code on a different robot, camera, or workspace.
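To show why the camera-to-robot-base transform matters, the sketch below applies a 4x4 homogeneous transform to a point measured in the camera frame. The matrix is a made-up placeholder (a camera 0.5 m along the base x-axis and 0.8 m above the base, looking straight down), and the function names are hypothetical. It is not the calibration from the original setup.

```python
import numpy as np


def to_base_frame(T_base_camera: np.ndarray, p_camera: np.ndarray) -> np.ndarray:
    """Transform a 3D point from the camera frame into the robot base frame."""
    p_h = np.append(p_camera, 1.0)  # homogeneous coordinates [x, y, z, 1]
    return (T_base_camera @ p_h)[:3]


def compose(T_base_camera: np.ndarray, T_camera_object: np.ndarray) -> np.ndarray:
    """Chain transforms to get the full object pose in the base frame."""
    return T_base_camera @ T_camera_object


# Placeholder calibration: 180 degree rotation about x plus a translation.
T_base_camera = np.array([
    [1.0,  0.0,  0.0, 0.5],
    [0.0, -1.0,  0.0, 0.0],
    [0.0,  0.0, -1.0, 0.8],
    [0.0,  0.0,  0.0, 1.0],
])

p_camera = np.array([0.1, 0.2, 0.6])  # marker position seen by the camera, meters
p_base = to_base_frame(T_base_camera, p_camera)
print(p_base)  # approximately [0.6, -0.2, 0.2]
```

An error in this matrix shifts every computed grasp pose by the same amount, which is why recalibration is needed whenever the camera or robot is moved.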
For a cleaner setup, you can move these values into:
configs/camera_calibration.example.json
and load them from the vision script.
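One way to load such a file is sketched below. The key names (`camera_matrix`, `dist_coeffs`, `marker_size_m`, `T_base_camera`) and both functions are assumptions; the layout of `camera_calibration.example.json` in the repository may differ.

```python
import json
from pathlib import Path

# Hypothetical key names for the four calibration items listed above.
REQUIRED_KEYS = ("camera_matrix", "dist_coeffs", "marker_size_m", "T_base_camera")


def parse_calibration(text: str) -> dict:
    """Parse calibration JSON and fail early if an expected key is missing."""
    data = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise KeyError(f"calibration file is missing keys: {missing}")
    return data


def load_calibration(path) -> dict:
    """Read a calibration file from disk and validate its keys."""
    return parse_calibration(Path(path).read_text(encoding="utf-8"))
```

Usage would look like `calibration = load_calibration("configs/camera_calibration.json")`, where the file is a copy of the example filled in with values measured for the actual setup.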
This code controls a physical robot. Use it carefully.
Before running the system:
- Keep the emergency stop within reach.
- Use a clear workspace.
- Start with small motion distances.
- Keep the robot speed low during testing.
- Verify the robot coordinate frame before issuing movement commands.
- Confirm that the camera-to-base calibration is correct.
- Avoid placing hands or objects near the robot during execution.
- Test gripper commands and small Cartesian movements before running the full pick sequence.
This project is provided for educational and experimental development.
- The command parser is rule-based and does not use an LLM.
- The pick skill currently depends on ArUco marker detection.
- Object selection is limited; the object name in the pick command is currently parsed but not used for multi-object detection.
- The system assumes a calibrated camera-to-base transform.
- The pick behavior is a simple predefined skill, not a learned policy.
- Workspace bounds are basic and do not replace full collision checking.
- The code should be adapted carefully before use in a new environment.
Possible extensions include:
- Add speech-to-text input for voice commands.
- Add LLM-based command interpretation while keeping safety-checked skill execution.
- Support multiple named objects.
- Replace ArUco detection with object detection or segmentation.
- Add collision checking and obstacle-aware motion planning.
- Add a graphical user interface.
- Add configurable robot and camera calibration files.
- Add logging of executed commands and robot outcomes.
- Add confirmation prompts before large robot motions.
Parts of the robot connection, Cartesian waypoint, and gripper command logic are adapted from Kinova Kortex Python API examples.