Supervisors: Fabien Péan, Prof. Orcun Goksel
Muscle activations that generate movements in the human body are highly redundant, nonlinear, and time-dependent. This thesis proposes a reinforcement learning (RL)-based controller that drives a musculoskeletal model of the human shoulder directly through these activations. Two customized reward functions are introduced that enable position and trajectory control of the shoulder model; they are constructed so that they adapt easily to multiple degrees of freedom and various muscles. Furthermore, a generalized learning process is presented that enables end-to-end learning, i.e., mapping the current position and velocity, together with the desired position and velocity, directly to muscle activations. One main bottleneck of the numerical model at hand is its slow running time, which is overcome by running multiple environments in parallel using network sockets. The proposed RL-based controller is applied to a simplified version of the shoulder with up to five muscles acting simultaneously. Convergence was reached after 1.5 million time steps, i.e., in about three hours with 100 environments running in parallel. Moreover, the trained controller was applied to follow randomly generated trajectories in real time, producing muscle activations in which four out of five muscles showed physiological behavior for the movement at hand. The main goal of this project is to explore the use of RL for controlling complex biomechanical models and to move toward reproducing muscle recruitment for complex movements of the human body.
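The parallel-environment strategy mentioned above can be illustrated with a minimal, hypothetical sketch. The real thesis environment talks to the numerical shoulder model over network sockets; here a toy `DummyShoulderEnv` with made-up dynamics and a stand-in position-tracking reward (both assumptions, not the thesis' actual model or reward) keeps the example self-contained, while the concurrent stepping of 100 environments mirrors the parallelization idea.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

class DummyShoulderEnv:
    """Hypothetical stand-in for the socket-backed shoulder simulation.

    The real environment forwards activations to a slow numerical model
    over a network socket; the toy dynamics here only keep the sketch
    runnable and are not the thesis' actual equations.
    """
    def __init__(self, n_muscles=5, seed=0):
        self.n_muscles = n_muscles
        self.rng = np.random.default_rng(seed)
        self.state = self.rng.normal(size=2)  # [position, velocity]

    def step(self, activations, target=0.0):
        # Toy dynamics: summed activations nudge the position.
        force = float(np.clip(activations, 0.0, 1.0).sum()) - 0.5 * self.n_muscles
        self.state = self.state + 0.01 * np.array([self.state[1], force])
        # Stand-in position-tracking reward (the thesis' customized
        # reward functions are more elaborate).
        reward = -abs(self.state[0] - target)
        return self.state.copy(), reward

def step_all(envs, actions):
    """Step every environment concurrently, analogous to running
    many simulation instances in parallel to hide slow step times."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda ea: ea[0].step(ea[1]), zip(envs, actions)))

# 100 parallel environments, matching the count reported in the abstract.
envs = [DummyShoulderEnv(seed=i) for i in range(100)]
actions = [np.full(5, 0.5) for _ in envs]
results = step_all(envs, actions)
print(len(results))  # 100 (observation, reward) pairs, one per environment
```

In the actual setup each `step` call would block on socket I/O to a remote simulation process, which is exactly the situation where overlapping many environments pays off.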