MIT researchers at the Computer Science and Artificial Intelligence Laboratory (CSAIL) have created a predictive artificial intelligence (AI) system that lets robots link multiple senses in much the same way humans do.
“While our sense of touch gives us a channel to feel the physical world, our eyes help us immediately understand the full picture of these tactile signals,” writes Rachel Gordon of MIT CSAIL. In robots, that connection does not exist. To bridge the gap, the researchers developed a predictive AI that can learn to “see by touching” and “feel by seeing,” a way of linking the senses of sight and touch in future robots.
Using a KUKA robot arm fitted with a tactile sensor called GelSight (another MIT creation), the team recorded nearly 200 objects with a web camera. These included tools, fabrics, household products, and other materials that humans regularly handle.
The team used the robot arm to touch the items more than 12,000 times, recording video for later analysis. In all, the researchers compiled more than three million visual/tactile images in their dataset.
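The core of such a dataset is pairing: each camera frame is matched with the tactile reading captured at the same moment of contact. A minimal sketch of that pairing step is below; all names (`TouchSample`, `build_dataset`, the recording layout) are hypothetical illustrations, not the actual structure of the MIT dataset.

```python
# Hypothetical sketch of pairing visual and tactile frames per object.
# Frames are placeholder lists here; in practice they would be images.
from dataclasses import dataclass


@dataclass
class TouchSample:
    object_id: str        # e.g. "mug_017" (illustrative label)
    video_frame: list     # camera view at the moment of contact
    tactile_frame: list   # GelSight reading at the same moment


def build_dataset(recordings):
    """Flatten per-object touch recordings into (visual, tactile) pairs."""
    pairs = []
    for object_id, touches in recordings.items():
        for video_frame, tactile_frame in touches:
            pairs.append(TouchSample(object_id, video_frame, tactile_frame))
    return pairs


recordings = {
    "mug_017": [([0.1, 0.2], [0.9, 0.8]), ([0.3, 0.1], [0.7, 0.6])],
    "cloth_002": [([0.5, 0.5], [0.2, 0.3])],
}
pairs = build_dataset(recordings)
print(len(pairs))  # 3
```

Keeping each sample as an explicit (visual, tactile) pair is what allows a model to be trained in either direction: predicting touch from sight or sight from touch.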
“By looking at the scene, our model can imagine the feeling of touching a flat surface or a sharp edge,” said Yunzhu Li, a doctoral student at CSAIL and the lead author of a new paper on the system. “By blindly touching around, our model can predict the interaction with the environment purely from tactile feelings.”
We can touch an item once, even years before, and retain a sense of how it will feel when we handle it later. For robots, this capability could reduce the human input needed for mechanical tasks such as flipping a switch or deciding where the safest place to set down a package is.
According to Li:
Gathering these two senses could empower the robot and reduce the data we might need for tasks involving manipulating and grasping objects.
By referencing images from a dataset, future robotic arms, such as those used to assemble cars or mobile phones, could make immediate predictions by comparing the object in front of them with those in the dataset. Once a match is found, the arm can identify the best place to lift, fold, or otherwise manipulate the object.
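The comparison step described above can be thought of as a nearest-neighbor lookup: find the stored example most similar to what the camera currently sees, then reuse what was recorded for it. The sketch below illustrates that idea only; the feature vectors, the `grasp_point` field, and the use of plain Euclidean distance are assumptions for illustration, not the researchers' actual method.

```python
# Illustrative nearest-neighbor retrieval: compare the observed object's
# feature vector against stored references and reuse the closest match's
# recorded grasp point. All data and field names are hypothetical.
import math


def nearest_reference(query_feature, references):
    """Return the stored example whose feature vector is closest (L2 distance)."""
    return min(references, key=lambda ref: math.dist(query_feature, ref["feature"]))


references = [
    {"feature": [0.9, 0.1], "grasp_point": (12, 40)},
    {"feature": [0.2, 0.8], "grasp_point": (55, 10)},
]

best = nearest_reference([0.85, 0.15], references)
print(best["grasp_point"])  # (12, 40)
```

In practice a learned model would replace the raw distance comparison, but the retrieval framing captures why a larger, more varied dataset directly improves the arm's predictions.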
The current MIT dataset was built from interactions within a controlled environment. The team hopes to improve on this by collecting data in less structured settings to broaden the dataset.