Abstract

Many recent works have explored sim-to-real transferable visual model predictive control (MPC). However, such works are limited to one-shot transfer, in which real-world data must be collected once to perform the sim-to-real transfer; transferring models learned in simulation to new real-world domains therefore still requires significant human effort. To alleviate this problem, we first propose a novel model-learning framework called the Kalman Randomized-to-Canonical Model (KRC-model). This framework can extract task-relevant intrinsic features and their dynamics from randomized images. We then propose Kalman Randomized-to-Canonical Model Predictive Control (KRC-MPC), a zero-shot sim-to-real transferable visual MPC that uses the KRC-model. The effectiveness of our method is evaluated on valve rotation tasks with a robot hand, in both simulation and the real world, and on block mating tasks in simulation. The experimental results show that KRC-MPC can be applied to various real domains and tasks in a zero-shot manner.

Overview

RCAN Network Architecture

net arch.

Simulation Experiments

Valve Rotation Task

[Success]

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

[Failure]

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

Block Mating Task

[3 Lower Error Domains]

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

[3 Upper Error Domains]

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

Real Experiments (Valve Rotation Task)

Test Domains

[Success]

--Domain1--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain2--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain3--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain4--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain5--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain6--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

[Failure]

--Domain1--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain2--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain3--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain4--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain5--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

--Domain6--

KRC w/ z
KRC w/o z
KR2 w/ z
KR2 w/o z
KC2 w/ z
Random Policy

Parameters

Additional Tasks (Valve Rotation)

Supplementary Material

This figure shows the evaluation of control performance in the additional real-world tasks and corresponds to the results in TABLE V of the paper. It plots the target and actual valve trajectories during execution of the additional tasks for KRC w/ z and KC2 w/ z. To exclude outliers caused by failure cases, the means and confidence intervals are computed over the three results with the lowest values of Eq. (17) among the five models. The figure shows that KRC w/ z tracks each target valve trajectory much better than KC2 w/ z.
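
The selection step described above (keep the three lowest-error results out of five, then summarize them) can be sketched as follows. This is only an illustration under stated assumptions: the function name `best_k_mean_ci` and the example numbers are hypothetical, the per-model score is assumed to be the scalar value of Eq. (17), and the interval is assumed to be a normal-approximation 95% interval, since this page does not state how the confidence intervals were computed.

```python
import numpy as np

def best_k_mean_ci(trajectories, errors, k=3, z=1.96):
    """Mean and confidence band over the k lowest-error trajectories.

    trajectories: array of shape (n_models, n_steps), one valve trajectory per model.
    errors: array of shape (n_models,), assumed here to be the Eq. (17) value per model.
    z: normal quantile for the interval (1.96 for ~95%); an assumption, not from the paper.
    """
    trajectories = np.asarray(trajectories, dtype=float)
    errors = np.asarray(errors, dtype=float)

    # Indices of the k models with the smallest error, best first.
    keep = np.argsort(errors)[:k]
    selected = trajectories[keep]

    # Per-time-step mean and standard error over the selected models.
    mean = selected.mean(axis=0)
    sem = selected.std(axis=0, ddof=1) / np.sqrt(len(keep))
    return keep, mean, mean - z * sem, mean + z * sem

# Hypothetical data: five models, three time steps each.
errors = [0.2, 0.9, 0.1, 0.3, 1.5]
trajectories = [
    [0.0, 1.0, 2.0],
    [5.0, 5.0, 5.0],   # high-error run, excluded
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
    [9.0, 9.0, 9.0],   # high-error run, excluded
]
keep, mean, lower, upper = best_k_mean_ci(trajectories, errors)
```

Dropping the two highest-error models before averaging keeps a single failed run from dominating the mean and widening the band, which matches the stated purpose of eliminating outliers.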