Foxtech Provides Industrial Drone Solutions & UAV Payload Systems.
As embodied AI continues to move from simulation to real-world robotic tasks, researchers and developers need more than a robotic arm. They need a complete platform that can collect high-quality data, support different VLA models, and help validate robot learning workflows in real environments.
The Agility A2 VR Teleoperation VLA Suite is designed for this purpose. It combines dual-arm robotic manipulation, immersive VR teleoperation, multi-camera 3D vision, synchronized data acquisition, and a VLA-ready software pipeline. More importantly, it supports multiple VLA model options, including ACT, SmoLVLA, Pi0, Pi0.5, and XVLA, giving researchers and developers more flexibility for embodied AI experimentation.
Many robotics teams face the same challenge: they have AI models, simulation tools, or research ideas, but they lack a practical robotic platform for real-world data collection and model validation.
A VLA model needs visual input, robot state data, action data, and real task execution. If the hardware, cameras, control system, and software pipeline are not well integrated, researchers may spend too much time building infrastructure instead of testing ideas.
The Agility A2 VR Teleoperation VLA Suite helps solve this problem by providing an integrated dual-arm platform for teleoperation, data collection, imitation learning, VLA training, and real-time inference. It allows teams to move faster from human demonstration to robot learning and from model testing to task execution.
One of the key advantages of the Agility A2 VLA Suite is its multi-model VLA support. The system can work with the following models:
ACT — Default
SmoLVLA — Optional
Pi0 — Optional
Pi0.5 — Optional
XVLA — Optional
This means users can start with ACT as the default inference model, while also having the flexibility to explore other VLA models depending on their research direction, computing configuration, and task requirements.
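The default-plus-optional arrangement can be pictured as a small lookup with a fallback. The sketch below is illustrative only: the model names and their default or optional status come from the list above, while `SUPPORTED_MODELS` and `select_model` are hypothetical names and not part of the Agility A2 software.

```python
# Model names and default/optional status are taken from the article's list.
# SUPPORTED_MODELS and select_model are hypothetical names for illustration.
SUPPORTED_MODELS = {
    "ACT": "default",
    "SmoLVLA": "optional",
    "Pi0": "optional",
    "Pi0.5": "optional",
    "XVLA": "optional",
}


def select_model(requested=None):
    """Return a supported model name, falling back to the default."""
    if requested is None:
        # No preference given: use the single entry marked "default".
        return next(n for n, s in SUPPORTED_MODELS.items() if s == "default")
    # Match case-insensitively so "act" and "ACT" are treated the same.
    by_lower = {name.lower(): name for name in SUPPORTED_MODELS}
    try:
        return by_lower[requested.lower()]
    except KeyError:
        raise ValueError(f"Unsupported model: {requested!r}") from None
```

Calling `select_model()` with no argument returns the default, while a name outside the list raises an error instead of silently falling back, so a typo in an experiment configuration is caught early.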
For research labs, universities, and robotics startups, this multi-model support is important. Different models may perform better in different tasks, such as grasping, picking, placing, folding, tool use, or long-horizon manipulation. Instead of being limited to one model pipeline, users can compare, validate, and develop different VLA-based approaches on the same robotic platform.

ACT is supported as the default model, making it suitable for users who want to quickly start VLA inference and task validation.
For teams that are new to embodied AI or VLA-based robotics, a default model lowers the initial setup barrier and provides a practical starting point for collecting demonstrations, training behavior policies, and testing robot actions.
With ACT, users can focus on basic manipulation tasks such as grasping, object transfer, pick-and-place, folding, or simple assembly workflows. This makes it suitable for teaching labs, AI robotics courses, and early-stage proof-of-concept validation.
While ACT provides a strong starting point, advanced users may need more model flexibility. This is where optional support for SmoLVLA, Pi0, Pi0.5, and XVLA becomes valuable.
SmoLVLA can be used by teams exploring lightweight or efficient VLA workflows. Pi0 and Pi0.5 provide additional options for experimenting with newer VLA model structures and task execution strategies. XVLA offers another optional direction for users who want to test different vision-language-action capabilities in robotic manipulation scenarios.
The value of these optional models goes beyond variety: they give users a way to compare performance across different tasks, datasets, and control strategies. A university lab may use the platform for model comparison. A startup may use it to test which model is best suited to a commercial application. An industrial R&D team may use it to validate whether a VLA model can handle specific repetitive tasks in a controlled environment.
Embodied AI is still developing quickly. No single model can solve every robotic task perfectly. A model that performs well in one task may not be the best choice for another.
For example, one model may be better for short-horizon pick-and-place tasks, while another may be more suitable for more complex sequences involving visual understanding, object interaction, and continuous action planning.
By supporting ACT, SmoLVLA, Pi0, Pi0.5, and XVLA, the Agility A2 VLA Suite gives researchers more room to experiment. Users can collect data once, test different model pipelines, and evaluate how each model performs under real robotic conditions.
This makes the platform especially suitable for embodied AI research, imitation learning, VLA training, robot learning experiments, and simulation-to-real validation.
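One simple way to organize such a comparison is to log every real-robot trial and compute a success rate per model and task. The following is a minimal sketch under stated assumptions: the log format, the function names, and the placeholder entries are all hypothetical, and the values shown are not measurements from the Agility A2 or from any of the models named above.

```python
def success_rates(trial_log):
    """Compute the success rate for each (model, task) pair.

    trial_log is an iterable of (model, task, succeeded) tuples.
    """
    counts = {}
    for model, task, succeeded in trial_log:
        wins, total = counts.get((model, task), (0, 0))
        counts[(model, task)] = (wins + int(succeeded), total + 1)
    return {key: wins / total for key, (wins, total) in counts.items()}


def best_model_per_task(rates):
    """For each task, return the (model, rate) with the highest success rate."""
    best = {}
    for (model, task), rate in rates.items():
        if task not in best or rate > best[task][1]:
            best[task] = (model, rate)
    return best


# Placeholder log with made-up entries, purely to show the call pattern.
example_log = [
    ("model_a", "pick_and_place", True),
    ("model_a", "pick_and_place", False),
    ("model_b", "pick_and_place", True),
    ("model_b", "pick_and_place", True),
    ("model_a", "folding", True),
    ("model_b", "folding", False),
]
example_best = best_model_per_task(success_rates(example_log))
```

Keeping the log keyed by task as well as by model reflects the point made above: a model that leads on one task may trail on another, so a single aggregate score can hide the differences that matter.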
High-quality robot learning starts with high-quality demonstration data. The Agility A2 VLA Suite uses VR teleoperation to let users control the dual-arm robot in a more natural and immersive way.
With a VR headset, operators can guide the robot through tasks while the system records synchronized visual data, robot state data, and action data. This workflow is especially useful for imitation learning, where the robot learns from human demonstrations.
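The article does not describe how the suite synchronizes its streams internally. As a hedged illustration of the general idea, the sketch below pairs each camera frame with the nearest robot-state and action samples by timestamp, assuming all streams are stamped against a shared clock. The function names and the `max_skew` tolerance are assumptions made for this example.

```python
from bisect import bisect_left


def nearest_index(timestamps, t):
    """Return the index of the entry in sorted `timestamps` closest to t."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour is closer in time (ties go to the earlier one).
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def align(frame_ts, state_ts, action_ts, max_skew=0.02):
    """Pair each frame with its nearest state and action sample.

    Returns (frame_index, state_index, action_index) tuples. A frame is
    dropped when either nearest sample is more than max_skew seconds away.
    All timestamp lists must be sorted and non-empty.
    """
    samples = []
    for fi, t in enumerate(frame_ts):
        si = nearest_index(state_ts, t)
        ai = nearest_index(action_ts, t)
        if abs(state_ts[si] - t) <= max_skew and abs(action_ts[ai] - t) <= max_skew:
            samples.append((fi, si, ai))
    return samples
```

Dropping frames that have no close match, instead of pairing them with stale data, keeps the training set smaller but avoids teaching the model from observations and actions that did not actually occur together.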
Compared with traditional control methods, VR teleoperation can make demonstration collection more intuitive. The operator can guide both arms, observe the task environment, and generate task data for later training and inference.
For VLA research, this creates a practical bridge between human operation and autonomous robotic learning.
VLA models rely heavily on visual information. The Agility A2 VLA Suite integrates a multi-camera 3D vision setup, including a head camera and wrist cameras, to capture the workspace from different perspectives.
The head camera provides a broader view of the environment, while wrist cameras provide close-up visual information near the grippers. This combination is useful for manipulation tasks where both global scene understanding and local object interaction are important.
For example, when the robot is picking up an object, the head camera can help understand the overall scene, while the wrist camera can capture detailed visual feedback near the target object. This kind of synchronized RGB-D data can support more accurate VLA training and more reliable real-time inference.
The Agility A2 VLA Suite can support a complete embodied AI workflow:
First, the user controls the robot through VR teleoperation and performs real manipulation tasks. Then, the system captures synchronized data, including camera input, robot motion, and action information. After that, the collected data can be processed through the VLA pipeline for imitation learning, model training, and inference testing. Finally, the trained model can be deployed back to the robot for autonomous task execution.
This workflow helps users test whether a VLA model can move from demonstration data to real robotic behavior.
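The step from captured demonstrations to training input can be sketched as a small data structure. The example below is an illustration, not the suite's actual data format: the `Step` and `Episode` classes, their field names, and the camera keys are hypothetical. It shows one recorded demonstration being flattened into the (observation, action) pairs that imitation learning methods typically train on.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    """One synchronized sample from a demonstration (hypothetical layout)."""
    images: Dict[str, object]      # camera name -> image, e.g. "head", "left_wrist"
    joint_positions: List[float]   # robot state, e.g. joints of both 7-DOF arms
    action: List[float]            # command recorded from the teleoperator


@dataclass
class Episode:
    """One full demonstration of a task."""
    task: str
    steps: List[Step] = field(default_factory=list)


def to_training_pairs(episodes):
    """Flatten demonstrations into (observation, action) pairs."""
    pairs = []
    for episode in episodes:
        for step in episode.steps:
            observation = {
                "task": episode.task,
                "images": step.images,
                "state": step.joint_positions,
            }
            pairs.append((observation, step.action))
    return pairs
```

Storing the task description with each episode keeps the language part of a vision-language-action sample attached to its visual and action data, so the same recordings can be reused across different model pipelines.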
Typical research tasks may include object grasping, item sorting, folding, assembly, insertion, transportation, shelf placement, and other dual-arm manipulation experiments.
Many real-world tasks require two hands. Folding fabric, transferring objects, stabilizing items, opening packages, arranging objects, or assembling parts often need coordinated bimanual control.
The Agility A2 platform is built around dual 7-DOF robotic arms, which allow human-like two-arm manipulation. When the target task requires cooperation between both arms, this makes it better suited to embodied AI research than a single-arm system.
For VLA training, dual-arm data is also valuable because it allows the model to learn more complex interaction patterns. Instead of only learning simple single-arm reaching or grasping, the system can support more advanced manipulation workflows involving both arms.
The Agility A2 VR Teleoperation VLA Suite is suitable for universities, AI robotics research labs, teaching laboratories, industrial R&D teams, and advanced robotics startups.
For universities, it can be used in embodied AI courses, robot learning experiments, and student research projects. For research labs, it provides a practical platform for VLA model testing, imitation learning, and real-world manipulation studies. For startups and R&D teams, it can support fast proof-of-concept validation before building a more specialized robot system.
Because the platform supports development with ROS 2, NVIDIA Isaac, MoveIt, MuJoCo, Python, and C++, along with URDF models, VLA pipelines, and inverse kinematics (IK), it also gives developers more flexibility for integration and custom development.
Building a VLA robot system from scratch can be time-consuming. Teams need to integrate robotic arms, cameras, VR control, data acquisition, communication, simulation tools, motion planning, and AI model pipelines.
The Agility A2 VLA Suite reduces this development burden by offering a more complete platform for data collection, teleoperation, training, and inference. Users can spend less time on basic hardware-software integration and more time on model performance, task design, dataset quality, and application validation.
For teams working on embodied AI, this can shorten the time between having an idea and testing it on a real robot.
The Agility A2 VR Teleoperation VLA Suite is more than a dual-arm robotic platform. It is a complete research-ready system for embodied AI, VLA training, imitation learning, and real-world robot manipulation.
With ACT as the default model and optional support for SmoLVLA, Pi0, Pi0.5, and XVLA, the platform gives researchers and developers the flexibility to explore different VLA workflows on one integrated robotic system.
By combining VR teleoperation, multi-camera 3D vision, synchronized data acquisition, dual-arm manipulation, and ROS2/NVIDIA Isaac-compatible development tools, Agility A2 helps accelerate the path from human demonstration to robot learning and from AI model testing to real-world task execution.
For universities, research labs, and robotics teams looking for a practical VLA-ready dual-arm platform, the Agility A2 VR Teleoperation VLA Suite provides a powerful foundation for the next stage of embodied AI research.