Training AI models requires vast amounts of data. Models like ChatGPT have been trained on enormous quantities of text, images and videos from the Internet – and even then, the question of whether AI is running out of training data has been asked before on Electronic Specifier.
For the robotics industry, the situation is very different: acquiring robot-specific data is costly, and little of it is readily available because so few robots actively operate in diverse, real-world environments such as homes.
In response to these challenges, researchers from the University of Washington (UW) have turned to simulation to train robots. Building such simulations, however, often requires input from graphic designers or engineers, which makes the approach time-consuming and expensive.
Two recent studies from UW researchers introduce AI systems that use videos and photos to generate simulations capable of training robots for real-world applications. This advance could dramatically reduce the cost of training robots for complex tasks.
In the first study, a user scans an area with a smartphone to capture its geometry. The system, known as RialTo, then creates a “digital twin” of the space, allowing users to specify how various elements function, such as the opening of a drawer. A robot can then practise these actions within the simulated environment, adjusting its movements slightly to optimise performance. The second study introduces URDFormer, a system that uses internet-sourced images of real environments to rapidly generate physically realistic simulation environments, enabling robots to train in a variety of settings.
These studies were presented at the Robotics: Science and Systems conference in Delft, the Netherlands, on 16th and 19th July.
“We’re trying to enable systems that cheaply go from the real world to simulation,” said Abhishek Gupta, a UW assistant professor in the Paul G. Allen School of Computer Science & Engineering and co-senior author on both papers. “The systems can then train robots in those simulation scenes, so the robot can function more effectively in a physical space. That’s useful for safety — you can’t have poorly trained robots breaking things and hurting people — and it potentially widens access. If you can get a robot to work in your house just by scanning it with your phone, that democratises the technology.”
While robots are currently well-suited for structured environments such as assembly lines, training them to interact with people in less predictable settings, such as homes, remains a significant challenge.
“In a factory, for example, there’s a ton of repetition,” said Zoey Chen, lead author of the URDFormer study and a UW doctoral student in the Allen School. “The tasks might be hard to do, but once you program a robot, it can keep doing the task over and over and over. Whereas homes are unique and constantly changing. There’s a diversity of objects, of tasks, of floor plans and of people moving through them. This is where AI becomes really useful to roboticists.”
The two systems tackle these challenges differently.
RialTo, developed by Gupta in collaboration with a team from the Massachusetts Institute of Technology, involves recording the geometry and moving parts of an environment, such as a kitchen, on video. The system employs existing AI models, with some quick input from a human through a graphical user interface, to create a simulated version of the recorded environment.
A virtual robot then learns through trial and error, refining its ability to perform tasks, such as opening a toaster oven, within the simulated space. The experience gained in the simulation is then transferred to the physical environment, where the robot’s performance is almost as accurate as if it had been trained in the real world.
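To make the trial-and-error stage more concrete, here is a minimal Python sketch of that kind of refinement loop. Everything in it – the toy environment, the reward and the perturbation search – is a hypothetical illustration of the idea, not the actual RialTo code.

```python
# Minimal sketch of trial-and-error training in a simulated "digital twin".
# DigitalTwinEnv, its reward, and the perturbation search are all invented
# stand-ins for illustration; they are not the RialTo system's code.
import numpy as np

class DigitalTwinEnv:
    """Toy stand-in for a simulated scene with one articulated object."""
    def __init__(self, target=np.array([0.4, 0.1, 0.3])):
        self.target = target  # e.g. handle position of a toaster-oven door

    def rollout(self, waypoints):
        # Reward is higher the closer the final waypoint lands to the handle;
        # a real simulator would run physics and check the door actually opens.
        return -np.linalg.norm(waypoints[-1] - self.target)

def refine_policy(env, waypoints, iters=200, noise=0.02, seed=0):
    """Hill-climb: nudge the motion slightly, keep changes that score better."""
    rng = np.random.default_rng(seed)
    best_score = env.rollout(waypoints)
    for _ in range(iters):
        candidate = waypoints + rng.normal(0.0, noise, size=waypoints.shape)
        score = env.rollout(candidate)
        if score > best_score:  # keep the perturbation only if it helps
            waypoints, best_score = candidate, score
    return waypoints, best_score

env = DigitalTwinEnv()
initial = np.zeros((5, 3))   # crude initial end-effector trajectory
refined, score = refine_policy(env, initial)
print(f"final distance to handle: {-score:.4f} m")
```

A real system would use reinforcement learning in a physics simulator rather than this toy hill-climb, but the shape of the loop – try a small variation, keep it if it scores better – is the same idea the article describes.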
URDFormer, on the other hand, focuses on generating large numbers of generic simulations quickly and cheaply. It takes images sourced from the internet and combines them with existing models to predict how elements in these environments, such as kitchen drawers and cabinets, might move, allowing robots to be trained rapidly across a broad range of settings. These simulations, however, are generally less accurate than those created by RialTo.
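To give a flavour of what such a system outputs, the snippet below assembles a minimal URDF description – the standard robot and scene format that gives URDFormer its name – from a hand-written list of part predictions. The hard-coded parts stand in for what a model like URDFormer would infer from a photo; none of this is the study's actual code.

```python
# Illustrative only: the hard-coded "part predictions" stand in for what a
# model such as URDFormer would infer from a photo. The goal is to show the
# target output: a URDF scene description listing parts and how they move.
predicted_parts = [
    # (part, parent, joint type, motion axis): a drawer slides, a door swings
    ("drawer", "cabinet_body", "prismatic", "1 0 0"),
    ("door",   "cabinet_body", "revolute",  "0 0 1"),
]

def make_urdf(parts):
    links = ['  <link name="cabinet_body"/>']
    joints = []
    for name, parent, jtype, axis in parts:
        links.append(f'  <link name="{name}"/>')
        joints.append(
            f'  <joint name="{name}_joint" type="{jtype}">\n'
            f'    <parent link="{parent}"/>\n'
            f'    <child link="{name}"/>\n'
            f'    <axis xyz="{axis}"/>\n'
            # Placeholder joint limits; real values would come from the model.
            f'    <limit lower="0" upper="0.4" effort="10" velocity="1"/>\n'
            f'  </joint>'
        )
    return '<robot name="kitchen_cabinet">\n' + "\n".join(links + joints) + '\n</robot>'

print(make_urdf(predicted_parts))
```

In a full pipeline, these link and joint entries would also carry meshes and physical properties so that a simulator can load and articulate the scene.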
“The two approaches can complement each other,” Gupta noted. “URDFormer is really useful for pre-training on hundreds of scenarios. RialTo is particularly useful if you’ve already pre-trained a robot, and now you want to deploy it in someone’s home and have it be maybe 95% successful.”
Looking ahead, the RialTo team plans to test its system in real homes, having primarily worked in laboratory settings thus far. Gupta also intends to explore how integrating small amounts of real-world data with simulation data can improve outcomes.
“Hopefully, just a tiny amount of real-world data can fix the failures,” Gupta said. “But we still have to figure out how best to combine data collected directly in the real world, which is expensive, with data collected in simulations, which is cheap, but slightly wrong.”
FAQ
- What is the primary challenge UW researchers are addressing?
– Answer: The primary challenge is the scarcity of high-quality, diverse training data needed for developing and improving robotic systems. Robotics often requires large amounts of specific data to train AI models, particularly for tasks involving manipulation, navigation, and interaction with humans and environments.
- Why is training data important for robotics?
– Answer: Training data is crucial for teaching robots how to perform tasks, recognise objects, and interact with their surroundings. The quality, diversity, and quantity of data directly affect the robot’s ability to learn effectively and perform tasks accurately in real-world scenarios.
- How are UW researchers addressing the lack of training data?
– Answer: UW researchers are developing new methods to generate and collect more diverse and extensive datasets for training robots. This includes using techniques such as synthetic data generation, crowdsourced data collection, and leveraging simulations to create large-scale, realistic training environments.
- What role does synthetic data play in this research?
– Answer: Synthetic data involves generating artificial datasets that mimic real-world scenarios. UW researchers use synthetic data to supplement real-world data, allowing for more extensive and varied training without the need for labour-intensive data collection processes.
- How does simulation contribute to this effort?
– Answer: Simulation allows researchers to create virtual environments where robots can be trained in a controlled and scalable manner. These simulated environments can replicate a wide range of scenarios, providing robots with the experience needed to handle diverse tasks in the real world.
- What are the advantages of using simulated data over real-world data?
– Answer: Simulated data can be generated faster, more cheaply, and with greater diversity than real-world data. It also allows researchers to control variables and create specific scenarios that may be difficult or impossible to replicate in real life. Moreover, it enables large-scale data generation without physical limitations, as the short sketch after this answer illustrates.
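As a small illustration of that controllability, the sketch below generates a thousand randomised training scenes almost instantly by sampling simulation parameters, a common technique known as domain randomisation. The parameter names and ranges are invented for illustration.

```python
# Sketch of domain randomisation: cheaply generating many varied training
# scenes by sampling simulation parameters. All parameter names and ranges
# are invented for illustration; real simulators expose far richer controls.
import random

def sample_scene(rng):
    return {
        "drawer_friction": rng.uniform(0.1, 1.0),   # physics variation
        "handle_height_m": rng.uniform(0.6, 1.1),   # geometry variation
        "lighting_lux":    rng.uniform(100, 1000),  # appearance variation
        "clutter_objects": rng.randint(0, 8),       # task difficulty
    }

rng = random.Random(42)
scenes = [sample_scene(rng) for _ in range(1000)]  # 1,000 scenes in milliseconds
print(scenes[0])
```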
- What challenges are associated with using synthetic or simulated data?
– Answer: One challenge is the “reality gap” – the difference between simulated environments and the real world. Robots trained exclusively on synthetic data may not perform as well in real-world situations due to unmodelled variables or unforeseen complexities, as the toy example below illustrates. Bridging this gap is a key area of ongoing research.
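The toy example below makes the reality gap concrete: a release speed tuned against simulated friction misses its target when the real friction differs. The physics and the numbers are invented purely to make the effect visible.

```python
# Toy illustration of the "reality gap": a policy tuned against simulated
# physics misses its target when the real physics differ. The kinematics and
# friction numbers here are invented purely for demonstration.

def slide_distance(v0, friction_decel):
    """Distance a block slides when released at speed v0 (uniform deceleration)."""
    return v0 ** 2 / (2 * friction_decel)

SIM_DECEL, REAL_DECEL = 2.0, 3.5   # m/s^2; the simulator underestimates friction
TARGET = 0.5                       # slide the drawer 0.5 m

# "Train" in simulation: solve for the release speed that hits the target there.
v0 = (2 * SIM_DECEL * TARGET) ** 0.5

print(f"in simulation:     {slide_distance(v0, SIM_DECEL):.3f} m")   # 0.500 m
print(f"in the real world: {slide_distance(v0, REAL_DECEL):.3f} m")  # undershoots
```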