Machine Learning Seminar
Thursday, 1 June, 2pm, ELG10
Jakob Foerster (University of Oxford)
Title: Counterfactual Multi-Agent Policy Gradients
Abstract: Cooperative multi-agent systems can be naturally used to model many real-world problems, such as network packet routing or the coordination of autonomous vehicles. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this end, we propose a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients. COMA uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents’ policies. In addition, to address the challenges of multi-agent credit assignment, it uses a counterfactual baseline that marginalises out a single agent’s action, while keeping the other agents’ actions fixed. COMA also uses a critic representation that allows the counterfactual baseline to be computed efficiently in a single forward pass. We evaluate COMA in the testbed of StarCraft unit micromanagement, using a decentralised variant with significant partial observability. COMA significantly improves average performance over other multi-agent actor-critic methods in this setting, and the best performing agents are competitive with state-of-the-art centralised controllers that get access to the full state.
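As a rough illustration of the counterfactual baseline mentioned in the abstract, the sketch below computes one agent's advantage from a vector of critic outputs. It is not the speaker's implementation: the function name, argument names, and numbers are hypothetical, and it assumes the critic has already produced one Q-value per candidate action of the agent, with the other agents' actions held fixed.

```python
def counterfactual_advantage(q_values, policy, chosen_action):
    """Advantage of one agent's chosen action against a counterfactual baseline.

    q_values[i]   : assumed critic estimate of the joint Q-value when this agent
                    takes action i and all other agents' actions stay fixed.
    policy[i]     : probability this agent's policy assigns to action i.
    chosen_action : index of the action the agent actually took.
    """
    if len(q_values) != len(policy):
        raise ValueError("q_values and policy must have the same length")
    # Baseline: marginalise out this agent's action under its own policy.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    # Advantage: how much better the chosen action was than the baseline.
    return q_values[chosen_action] - baseline


# Hypothetical numbers for a single agent with three actions.
q_values = [1.0, 3.0, 2.0]
policy = [0.5, 0.25, 0.25]
advantage = counterfactual_advantage(q_values, policy, chosen_action=1)
print(advantage)  # 3.0 - (0.5 + 0.75 + 0.5) = 1.25
```

Because the baseline depends only on the other agents' actions and the agent's own policy, a positive advantage indicates that this agent's choice helped relative to what it would have done on average.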
When: 15 May 2017, 1pm
Abstract: Appropriate feature engineering, the process of combining and transforming raw features, can contribute significantly to improving the performance of a supervised learning task. Because of the combinatorial explosion of possible engineered features, successful feature engineering in a corporate environment very often requires domain knowledge. However, as data grows more complex and the number of data sources increases, applying human insight to the feature engineering process becomes difficult.
In this presentation, I introduce a novel, fast, and hopefully elegant algorithm in which the complexity of feature engineering depends only on the number of features relevant to modelling the target variable, rather than on all the features in a dataset. When the number of relevant features is much smaller than the total number of features, the running time becomes almost independent of the total number of features. This method draws on insights from previous work linking machine learning and information theory. I then present the results of tests of the algorithm, which show that the engineering method I have developed is indeed effective in creating a feature that improves the performance of both classification and regression algorithms when the engineered feature is included along with the rest of the features. In order to reach statistically valid conclusions, it is necessary to test the algorithm on large numbers of appropriate datasets. Therefore, I also introduce a method for generating synthetic datasets with desirable characteristics on which to test machine learning algorithms.
At this stage, the work is a proof of concept of the method created. Future work would include creating more generalised methods and coding these into Python packages for use by the data science community.