Training a bot using reinforcement learning to aim and fire at a target. This project was made on Unity 2019.2.11f1.
First, create a new 3D project in Unity 2018+. After that, clone ML-Agents from Unity’s official repo and install the necessary dependencies, such as Python and TensorFlow. Next, copy the ML-Agents SDK (the /ml-agents/UnitySDK/ML-Agents folder) into the project you just created and install Barracuda. Then import the unitypackage from the release of this repo and you should be good to go.
This process is the same as creating a new environment, as described here.
Training is done on an executable build using Python. More information can be found here. These graphs are for the two different reward functions tried on the observation space described below.
Currently the bot is a camera which can rotate on the x and y axes. It is placed in a room containing a wall, on which a target appears at a random position.
The bot is fed its rotation, the displacement vector between itself and the target, and the angle between its forward vector and the displacement vector.
The bot can take 3 actions: move up-down, move left-right, and fire.
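To make the observation space concrete, here is a minimal Python sketch of how such an observation could be assembled. It is not the project's code (the agent itself runs inside Unity). The function names, the two-value rotation, and the ordering of the values are assumptions made for illustration.

```python
import math

def angle_between(forward, displacement):
    """Angle in degrees between two 3D vectors (0.0 if either has zero length)."""
    dot = sum(f * d for f, d in zip(forward, displacement))
    norm_f = math.sqrt(sum(f * f for f in forward))
    norm_d = math.sqrt(sum(d * d for d in displacement))
    if norm_f == 0.0 or norm_d == 0.0:
        return 0.0
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_f * norm_d)))
    return math.degrees(math.acos(cos_theta))

def build_observation(rotation_xy, bot_pos, target_pos, forward):
    """Hypothetical layout: [rot_x, rot_y, dx, dy, dz, angle]."""
    displacement = tuple(t - b for t, b in zip(target_pos, bot_pos))
    angle = angle_between(forward, displacement)
    return [*rotation_xy, *displacement, angle]

# Example: bot at the origin looking down +z, target straight ahead.
obs = build_observation((0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 5.0), (0.0, 0.0, 1.0))
```

With this layout the angle entry shrinks to zero as the crosshair lines up with the target, which gives the policy a direct signal of how far off its aim is.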
The bot is rewarded for the following tasks with these values:
- +100 : Correctly aiming at the target.
- +0.1 : Moving the crosshair towards the target.
- -0.001 : Moving the crosshair away from the target.
- -10 : Moving the crosshair out of the wall.
- -0.001 : Every frame, to encourage quicker aiming.
- -0.5 : Every time the environment is reset due to time limit.
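The reward values above can be summarised in a short Python sketch. The numbers come from the list. How the terms combine in a single step, and the use of the aim angle to decide whether the crosshair moved towards or away from the target, are assumptions for illustration and not necessarily what the Unity agent does.

```python
# Reward values taken from the list above.
REWARD_ON_TARGET = 100.0
REWARD_TOWARDS = 0.1
PENALTY_AWAY = -0.001
PENALTY_OFF_WALL = -10.0
PENALTY_PER_FRAME = -0.001
PENALTY_TIMEOUT = -0.5

def step_reward(prev_angle, angle, on_target, off_wall, timed_out):
    """Reward for one step; the combination logic is an assumed sketch."""
    reward = PENALTY_PER_FRAME  # applied every frame
    if on_target:
        return reward + REWARD_ON_TARGET
    if off_wall:
        return reward + PENALTY_OFF_WALL
    if timed_out:
        reward += PENALTY_TIMEOUT
    # Assumption: a smaller angle to the target means the crosshair moved towards it.
    if angle < prev_angle:
        reward += REWARD_TOWARDS
    elif angle > prev_angle:
        reward += PENALTY_AWAY
    return reward
```

Note how small the per-frame and moving-away penalties are next to the +100 for aiming correctly, so the shaping terms nudge the bot without dominating the main objective.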
When resetting the environment, the target is moved to a new random location on the wall and the bot’s rotation is set back to its initial rotation. The environment is reset if:
- bot’s aim/crosshair leaves the wall.
- target has been hit.
- fixed amount of time has passed without the above two happening.
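The reset logic can be sketched in Python as follows. The step-count time limit, the wall dimensions, and the wall-local 2D coordinates are hypothetical stand-ins, since the actual values live in the Unity scene.

```python
import random

def should_reset(off_wall, target_hit, elapsed_steps, max_steps):
    """True if any of the three reset conditions listed above holds."""
    return off_wall or target_hit or elapsed_steps >= max_steps

def reset_episode(wall_width, wall_height, initial_rotation, rng=random):
    """Place the target at a random point on the wall and restore the bot's rotation.

    Coordinates are wall-local (x across, y up); this layout is an assumption.
    """
    target_x = rng.uniform(-wall_width / 2.0, wall_width / 2.0)
    target_y = rng.uniform(0.0, wall_height)
    return {"target": (target_x, target_y), "rotation": initial_rotation}

# Example with a seeded generator so runs are repeatable.
state = reset_episode(10.0, 5.0, (0.0, 0.0), rng=random.Random(0))
```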
When using heuristics to play around and manually test the environment, the controls to interact with the environment are as follows:
- Movement : WASD keys to aim the bot.
- Fire : F key is used to fire. Currently not usable, as the target moves as soon as the crosshair lines up with it.
- Camera :
  - C - Change view to individual rooms and cycle between them. Rooms are numbered from 1 to 9.
  - V - Switch back to aerial view showing all the training rooms at once.
There is also a “perfect aimer” script attached to the gun gameObject (the camera that the bot controls), which uses Quaternion rotation to aim perfectly at the target. It can be turned on or off by ticking a boolean in the inspector, in the PerfectAim component under the Gun gameObject.
Currently the Unity environment and Python training functionality are working correctly. The bot is taking aim, but with quite a bit of inaccuracy. Different reward parameters and values are being tried to teach it to aim accurately. Training does not include actually firing at the moment; only aiming is being learned.
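As a rough illustration of what a perfect aimer computes, the Python sketch below derives the pitch and yaw needed to face a target. It uses Euler angles rather than the Quaternion rotation the actual script relies on, and it assumes Unity's conventions (y up, z forward, positive x rotation tilting the view down). The function name is hypothetical.

```python
import math

def look_at_angles(bot_pos, target_pos):
    """Return (pitch, yaw) in degrees that point the bot at the target.

    Assumes y is up, z is forward, and positive pitch tilts the view down.
    """
    dx, dy, dz = (t - b for t, b in zip(target_pos, bot_pos))
    horizontal = math.hypot(dx, dz)            # distance in the ground plane
    yaw = math.degrees(math.atan2(dx, dz))     # rotation about the y axis
    pitch = -math.degrees(math.atan2(dy, horizontal))  # rotation about the x axis
    return pitch, yaw

# Example: a target up and to the right of a bot at the origin.
pitch, yaw = look_at_angles((0.0, 0.0, 0.0), (5.0, 5.0, 5.0))
```

Because the bot only rotates on the x and y axes, these two angles are enough to describe its aim, and such a scripted aimer gives a useful baseline to compare the trained policy against.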
Nine instances of the environment are run at the same time to speed up training.