Publications
REValueD: Regularised Ensemble Value-Decomposition for Factorisable Markov Decision Processes
Accepted for a short talk at ICLR 2024. This work looks at how the well-known overestimation bias inherent to Q-learning with function approximation is affected when performing Q-learning with value-decomposition for high-dimensional discrete action spaces. We find that the bias of the learning target is decreased, whilst its variance is increased. We then demonstrate that an ensemble can effectively mitigate this increase in variance.
LeNSE: Learning To Navigate Subgraph Embeddings for Large-Scale Combinatorial Optimisation
Accepted for a short talk at ICML 2022. This work investigates how to effectively prune combinatorial optimisation problems by efficiently searching for an optimal subgraph, starting from any random subgraph.