Challenge #1:
Seemingly simple tasks such as peeling a fruit or crafting a cocktail require dexterity, spatial awareness, and fine motor control. An environment as dynamic as a kitchen also carries several risks, such as hot liquids, sharp tools, and fragile items. Typical Vision-Language Models (VLMs) don't guarantee safe and precise operation when working alongside humans in real time: they often lack physical grounding or memory, or suffer from latency. In some cases, vague language commands also lead to hazardous situations.
Solution:
We will integrate safety directly into the control stack by introducing workspace boundaries and hazard rules specific to different objects. We will also enhance the perception modules so that misinterpretations leading to unsafe actions, such as placing metal in a microwave, can be avoided. Because different learning methods suit different tasks, we will use logic-infused VLMs for high-level task planning, and reinforcement learning and imitation learning for fine-grained motion. All training methods will be validated in realistic simulation before they are used in the real world.
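As a rough illustration of the first point, the Python sketch below shows how per-object hazard rules and workspace boundaries could gate an action before the controller executes it. The object tags, zones, and workspace limits are hypothetical placeholders, not values from our actual system.

    # Minimal sketch (hypothetical names and limits): check a proposed action
    # against workspace boundaries and per-object hazard rules before execution.
    from dataclasses import dataclass

    @dataclass
    class ObjectInfo:
        name: str
        tags: frozenset  # e.g. {"metal", "sharp", "hot", "fragile"}

    # Hazard rules: (object tag, target zone) pairs that are never allowed.
    FORBIDDEN = {
        ("metal", "microwave"),
        ("fragile", "high_shelf"),
    }

    # Reachable workspace of the arm, in metres.
    WORKSPACE = {"x": (-0.6, 0.6), "y": (-0.4, 0.4), "z": (0.0, 0.8)}

    def within_workspace(pos):
        return all(lo <= pos[axis] <= hi for axis, (lo, hi) in WORKSPACE.items())

    def action_is_safe(obj: ObjectInfo, target_zone: str, target_pos: dict) -> bool:
        """Reject actions that leave the workspace or violate a hazard rule."""
        if not within_workspace(target_pos):
            return False
        return not any((tag, target_zone) in FORBIDDEN for tag in obj.tags)

    # Example: a metal bowl should never be placed in the microwave.
    bowl = ObjectInfo("bowl", frozenset({"metal"}))
    assert not action_is_safe(bowl, "microwave", {"x": 0.2, "y": 0.1, "z": 0.4})

In practice, a check like this would sit between the high-level planner and the low-level controller, so unsafe plans are rejected before any motion begins.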
Challenge #2:
Robots typically struggle to operate outside the situations they were trained on. Changes to a robot's environment or layout can leave it unable to function as it normally would.
Solution:
Our plan is to introduce methods for better environmental awareness and safety. We also intend to integrate diverse training scenarios into our testing approach to prepare our robots for unforeseen events. When such events occur, the robots will trigger an alert or request human intervention, and to make that intervention seamless, we will use models that improve communication and interpretability.
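A minimal sketch of such a fallback, assuming illustrative confidence and anomaly thresholds rather than tuned values, could look like this in Python:

    # Minimal sketch (hypothetical thresholds): flag low-confidence or anomalous
    # observations and escalate to a human operator instead of acting blindly.
    import logging

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("fallback")

    CONFIDENCE_MIN = 0.75   # below this, the perception output is not trusted
    ANOMALY_MAX = 3.0       # e.g. distance from the training distribution

    def decide(perception_confidence: float, anomaly_score: float) -> str:
        """Return 'act', 'alert', or 'ask_human' depending on how unusual the scene is."""
        if anomaly_score > ANOMALY_MAX:
            log.warning("Scene far outside training distribution; pausing.")
            return "ask_human"
        if perception_confidence < CONFIDENCE_MIN:
            log.info("Low perception confidence; raising an alert.")
            return "alert"
        return "act"

    print(decide(0.9, 1.2))   # act
    print(decide(0.6, 1.2))   # alert
    print(decide(0.9, 5.0))   # ask_human

The point of the tiered response is that not every surprise needs a full stop: mild uncertainty raises an alert, while a genuinely unfamiliar scene hands control to a human.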
Challenge #3:
Tasks become more complex when multiple robots are involved. The synchronization this requires is vulnerable to timing drift, contention for shared space, and communication failures, any of which can make multi-robot operation unsafe.
Solution:
To ensure safe synchronization, we will develop custom simulation environments in which our robots are trained to avoid collisions and maintain consistent timing. We will also verify that behavior learned in simulation carries over when the robots are deployed in real-world applications.
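As one illustration, a simulation-side check like the Python sketch below (with an assumed 2D layout and an arbitrary safety margin) could confirm that two robots' time-aligned plans never violate a minimum separation before those plans are executed:

    # Minimal sketch (assumed geometry): verify in simulation that two robots'
    # planned trajectories never bring them closer than a safety margin.
    import math

    SAFETY_MARGIN = 0.30  # metres

    def min_separation(traj_a, traj_b):
        """Trajectories are time-aligned lists of (x, y) waypoints."""
        return min(math.dist(a, b) for a, b in zip(traj_a, traj_b))

    def trajectories_are_safe(traj_a, traj_b) -> bool:
        return min_separation(traj_a, traj_b) >= SAFETY_MARGIN

    # Example: two robots crossing the same counter area.
    robot_a = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
    robot_b = [(1.0, 0.5), (0.5, 0.4), (0.0, 0.5)]
    print(trajectories_are_safe(robot_a, robot_b))  # True: closest approach is 0.4 m

Running this kind of check over many randomized scenarios in simulation gives us a way to measure timing and spacing consistency before any multi-robot deployment.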