Abstract: Model-free control approaches require advanced exploration-exploitation policies to accomplish practical tasks such as a bipedal robot learning to walk in unstructured environments. In this ...