Unity Machine Learning For Project Butterfly by avivelor


An exploration of Unity ML Agents for deep reinforcement learning on a "robo arm" to protect butterflies with bubble shields in an immersive virtual reality exercise game.

MIT License. Created on February 13th, 2020.

UnityMachineLearningForProjectButterfly

Aviv Elor - aelor@ucsc.edu - avivelor1@gmail.com

What if we could train a virtual robot arm to guide us through our physical exercises, compete with us, and test out various double-jointed movements? This project is an exploration of Unity ML-Agents, training a double-jointed "robot arm" to protect butterflies with bubble shields in an immersive virtual environment. The arm is trained using Generative Adversarial Imitation Learning (GAIL) and reinforcement learning through Proximal Policy Optimization (PPO) to play an immersive virtual reality physical exercise game. Overall, this was a fun, deep dive into machine learning with ML-Agents. Feel free to try out the standalone build if you want to see whether you can predict torque values for two joints better than a neural network, or use the VR build to compete head to head with a neural-network-driven arm in the same VR exercise game!

If you have any questions, email aelor@ucsc.edu or message Aviv Elor.

Imitation Learning and Virtual Reality Gameplay

To explore imitation learning with PPO, refer to the Unity ML-Agents documentation at https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Readme.md and https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Imitation-Learning.md. This section explores the application of Generative Adversarial Imitation Learning (GAIL) combined with deep reinforcement learning through Proximal Policy Optimization (PPO). Demonstrations were recorded with an HTC Vive 2018 VR system, using two 2018 Vive Trackers on a human demonstrator's shoulder and elbow joints to capture torque and angular momentum through Unity's FixedJoint API.
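As a rough illustration of that capture step, here is a minimal sketch (not code from this repository) of how a FixedJoint bound to a tracked shoulder or elbow object can expose the joint's applied torque and angular motion through Unity's standard Joint.currentTorque and Rigidbody.angularVelocity properties. The component and field names are assumptions made for illustration only.

    using UnityEngine;

    // Hypothetical helper (not from the source project): exposes the torque and
    // angular motion of a Vive-tracked joint through a FixedJoint, in the spirit
    // of the demonstration capture described above.
    public class TrackedJointReader : MonoBehaviour
    {
        [SerializeField] private FixedJoint joint; // FixedJoint binding this rigidbody to the tracked object
        [SerializeField] private Rigidbody body;   // rigidbody driven by the tracker pose

        // Torque the joint is currently applying to satisfy its constraints.
        public Vector3 CurrentTorque => joint.currentTorque;

        // Angular velocity of the tracked joint's rigidbody.
        public Vector3 AngularVelocity => body.angularVelocity;
    }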

To get started, download Anaconda, set up a virtual environment, and activate it with conda activate mlagents (see this guide to configure your environment: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Readme.md). Enter the training scene at the following:

~\UnitySDK\Assets\ML-Agents\Examples\CM202-ExoArm\Scenes\

To capture demonstrations in VR, place the SteamVR trackers at the elbow and shoulder joints of the human user. Set the agent's Behavior Type to "Heuristic Only." Check the Record box on the Demonstration Recorder script and have the human user perform ideal, concise movements. After recording the demonstration, update the config yaml files to point to the demonstration for GAIL.
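For reference, the sketch below shows roughly how such a heuristic could be wired so that the Demonstration Recorder captures the human's motion as expert actions. It assumes the current ML-Agents C# API (Agent.Heuristic writing into ActionBuffers; the 2020-era releases used a different signature) and reuses the hypothetical TrackedJointReader from the earlier sketch; the class name, action layout, and torque scale are illustrative assumptions rather than the project's actual code.

    using Unity.MLAgents;
    using Unity.MLAgents.Actuators;
    using UnityEngine;

    // Hypothetical sketch: with Behavior Type set to "Heuristic Only", the tracked
    // joint motion is written into the action buffer each decision step, so the
    // Demonstration Recorder saves it as expert demonstrations for GAIL.
    public class ExoArmHeuristicAgent : Agent
    {
        [SerializeField] private TrackedJointReader shoulder; // tracker on the shoulder joint
        [SerializeField] private TrackedJointReader elbow;    // tracker on the elbow joint
        [SerializeField] private float maxTorque = 150f;      // assumed normalization constant

        public override void Heuristic(in ActionBuffers actionsOut)
        {
            ActionSegment<float> actions = actionsOut.ContinuousActions;
            // Two joints with two torque axes each -> four continuous actions in [-1, 1].
            actions[0] = Mathf.Clamp(shoulder.CurrentTorque.x / maxTorque, -1f, 1f);
            actions[1] = Mathf.Clamp(shoulder.CurrentTorque.z / maxTorque, -1f, 1f);
            actions[2] = Mathf.Clamp(elbow.CurrentTorque.x / maxTorque, -1f, 1f);
            actions[3] = Mathf.Clamp(elbow.CurrentTorque.z / maxTorque, -1f, 1f);
        }
    }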

Pre-recorded demonstrations for Project Butterfly can be found at:

~\UnitySDK\Assets\Demonstrations\

Training configuration with GAIL for Project Butterfly can be found at:

~\config\trainer_config_exoarm.yaml

After demonstrations are recorded, proceed back to the training scene to begin agent learning. With the Anaconda terminal, start training using the following command:

mlagents-learn config/gail_config_exoimitationarm.yaml --run-id=<run-identifier> --train --time-scale=100

Now sit back and let the model train. After checkpoints are saved, you can use tensorboard to examine the model’s performance:

tensorboard --logdir=summaries

The trained model for this section can be found at:

~\models\ImitationButterfly-0\ExoReacher.nn
or
~\UnitySDK\Assets\ML-Agents\Examples\CM202-ExoArm\TFModels\ImitationReacher\

A demo video of this section can be found at: https://youtu.be/ckMaDXHUGrw


Reinforcement Learning and Non-VR Based Gameplay

To experiment with deep reinforcement learning and training, refer to the Unity ML-Agents documentation at https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Readme.md. To get started, download Anaconda, set up a virtual environment, and activate it with conda activate mlagents (see this guide to configure your environment: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Readme.md). Enter the training scene at the following:

~\UnitySDK\Assets\ML-Agents\Examples\CM202-ExoArm\Scenes\

With the Anaconda terminal, start training using the following command:

mlagents-learn config/trainer_config_exoarm.yaml --run-id=<run-identifier> --train --time-scale=100

Now sit back and let the model train. After checkpoints are saved, you can use tensorboard to examine the model’s performance:

tensorboard --logdir=summaries

With 16 agents training in parallel via deep reinforcement learning with Proximal Policy Optimization (PPO), the robot arm learned the task well. After four hours of training, the cumulative reward rose steadily from 0.1 to 40 (a reward of +0.01 was given for every frame the arm successfully protected the butterfly). See the demo video below for a discussion of the training, results, and demo experience.
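As a concrete illustration of that reward structure, here is a minimal sketch of how a Reacher-style agent can apply its continuous actions as joint torques and accrue the +0.01-per-frame shield reward described above. It again assumes the current ML-Agents C# API; apart from the +0.01 value taken from this write-up, the class, fields, torque scale, and overlap test are assumptions for illustration, not the project's actual code.

    using Unity.MLAgents;
    using Unity.MLAgents.Actuators;
    using UnityEngine;

    // Hypothetical sketch: the policy's four continuous actions drive the two arm
    // joints as torques, and the agent earns +0.01 for every step the bubble
    // shield keeps the butterfly covered.
    public class ExoArmRewardSketch : Agent
    {
        [SerializeField] private Rigidbody upperArm;     // shoulder-driven segment
        [SerializeField] private Rigidbody forearm;      // elbow-driven segment
        [SerializeField] private Collider bubbleShield;  // shield volume on the arm
        [SerializeField] private Collider butterfly;     // butterfly to protect
        [SerializeField] private float maxTorque = 150f; // assumed torque scale

        public override void OnActionReceived(ActionBuffers actions)
        {
            ActionSegment<float> a = actions.ContinuousActions;
            upperArm.AddTorque(new Vector3(a[0], 0f, a[1]) * maxTorque);
            forearm.AddTorque(new Vector3(a[2], 0f, a[3]) * maxTorque);

            // Dense shaping reward from the write-up: +0.01 per step of protection.
            if (bubbleShield.bounds.Intersects(butterfly.bounds))
            {
                AddReward(0.01f);
            }
        }
    }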

The trained model for this section can be found at:

~\models\ExoReacherPBF-0\ExoReacher.nn
or
~\UnitySDK\Assets\ML-Agents\Examples\CM202-ExoArm\TFModels\ExoReacher\

A demo video of this section can be found at: https://youtu.be/5J7xes28bZA


Materials and References

Materials:

External Tools Used and Modified for this Project:

Reading References:
