Many recent works have explored sim-to-real transferable visual model predictive control (MPC).
However, such works are limited to one-shot transfer, where real-world data must be collected once to
perform the sim-to-real transfer. Transferring models learned in simulation to new real-world domains
therefore still requires significant human effort.
To alleviate this problem, we first propose a novel model-learning framework called
Kalman Randomized-to-Canonical Model (KRC-model). This framework is capable of extracting
task-relevant intrinsic features and their dynamics from randomized images. We then propose
Kalman Randomized-to-Canonical Model Predictive Control (KRC-MPC) as a zero-shot sim-to-real
transferable visual MPC using the KRC-model. The effectiveness of our method is evaluated through valve rotation
tasks performed by a robot hand in both simulation and the real world, and block mating tasks in simulation. The
experimental results show that KRC-MPC can be applied to various real domains and tasks in a zero-shot
manner.
Overview
RCAN Network Architecture
Simulation Experiments
Valve Rotation Task
[Success] and [Failure] (one panel per method): KRC w/ z, KRC w/o z, KR2 w/ z, KR2 w/o z, KC2 w/ z, Random Policy
Block Mating Task
[3 Lower Error Domains] and [3 Upper Error Domains] (one panel per method): KRC w/ z, KRC w/o z, KR2 w/ z, KR2 w/o z, KC2 w/ z, Random Policy
Real Experiments (Valve Rotation Task)
Test Domains
[Success] and [Failure], each shown for Domain1 through Domain6 (one panel per method in every domain): KRC w/ z, KRC w/o z, KR2 w/ z, KR2 w/o z, KC2 w/ z, Random Policy
Parameters
Additional Tasks (Valve Rotation)
Supplementary Material
The figure shows the control performance in the additional real-world tasks and corresponds to the results in TABLE V of the paper.
It plots the target and actual valve trajectories during execution of the additional tasks for KRC w/ z and KC2 w/ z.
To exclude outliers from failure cases, the means and confidence intervals are computed over the three results with the lowest values of Eq. (17) among the five models.
The figure shows that KRC w/ z tracks each target valve trajectory much better than KC2 w/ z.
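The outlier-filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the example data, the use of a 95% level, and the Student-t interval are all assumptions, and the scores stand in for the values of Eq. (17), which is defined in the paper and not reproduced here.

```python
import math
import statistics

# Two-sided 95% Student-t critical values indexed by degrees of freedom.
# The 95% level is an assumption; the page does not state which level is used.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}


def select_best_runs(trajectories, scores, k=3):
    """Return the k trajectories with the lowest scores (lower is better)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return [trajectories[i] for i in order[:k]]


def mean_and_ci(trajectories):
    """Per-time-step mean and confidence half-width across the given runs."""
    n = len(trajectories)
    t_crit = T_CRIT_95[n - 1]
    means, half_widths = [], []
    for values in zip(*trajectories):  # iterate over time steps
        means.append(statistics.mean(values))
        sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
        half_widths.append(t_crit * sem)
    return means, half_widths


# Hypothetical data: valve angle over three time steps for five trained models,
# with one scalar score per model standing in for Eq. (17).
trajectories = [
    [0.0, 1.0, 2.0],
    [0.0, 1.2, 2.2],
    [0.0, 0.8, 1.8],
    [0.0, 3.0, 5.0],    # failure case
    [0.0, -1.0, -2.0],  # failure case
]
scores = [0.10, 0.20, 0.15, 2.50, 3.00]

best = select_best_runs(trajectories, scores, k=3)
means, half_widths = mean_and_ci(best)
```

Keeping only the three lowest-scoring models out of five means the two failure cases in this toy data do not widen the interval or shift the mean, which matches the stated purpose of the filtering.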