Non-Stochastic Control is a new framework for robust control that permits control in adversarial environments. It is enabled by a new type of algorithm, the Gradient Perturbation Controller, which also gives rise to the first logarithmic-regret algorithms in online control.
For a survey, see these lecture notes.
Combining time series and control algorithms via the new technique of Boosting for Dynamical Systems.
Learning auto-regressive moving-average time series with adversarial noise.
Maximum-entropy exploration in partially observed and/or approximated Markov Decision Processes.
Machine learning moves us from custom-designed algorithms to generic models, such as neural networks, that are trained by optimization algorithms. For a survey, see these lecture notes. Some of the most useful and efficient methods we have worked on for training convex as well as non-convex models include:
The AdaGrad algorithm, and the technique of adaptive preconditioning.
Projection-free algorithms for online learning in the context of recommender systems, and the first linearly convergent projection-free algorithm.
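To make adaptive preconditioning concrete, here is a minimal sketch of diagonal AdaGrad in NumPy; the function names, hyperparameters, and toy objective are our own choices for illustration, not code from the papers above.

```python
import numpy as np

def adagrad(grad_fn, x0, lr=1.0, eps=1e-8, steps=500):
    """Diagonal AdaGrad: each coordinate's step size shrinks with the
    squared gradients accumulated so far (adaptive preconditioning)."""
    x = np.asarray(x0, dtype=float)
    g_sq = np.zeros_like(x)                     # running sum of squared gradients
    for _ in range(steps):
        g = grad_fn(x)
        g_sq += g ** 2
        x = x - lr * g / (np.sqrt(g_sq) + eps)  # coordinate-wise scaling
    return x

# Toy example: minimize f(x) = ||x - target||^2, whose gradient is 2 (x - target).
target = np.array([1.0, -2.0])
x_star = adagrad(lambda x: 2 * (x - target), np.zeros(2))
```

Coordinates with a history of large gradients take smaller steps, which is what makes the method well suited to sparse or badly scaled problems.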
In recent years, convex optimization and the notion of regret minimization in games have been combined and applied to machine learning in a general framework called online convex optimization. For more information, see the graduate textbook on online convex optimization in machine learning, or the survey on the convex optimization approach to regret minimization. Our research spans efficient online algorithms, matrix prediction algorithms, decision making under uncertainty, and continuous multi-armed bandits.
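As a concrete instance of regret minimization, the sketch below implements online gradient descent over a Euclidean ball, the basic algorithm of online convex optimization; the adversarial loss sequence and all parameters are illustrative assumptions, not taken from our papers.

```python
import numpy as np

def online_gradient_descent(grads, x0, radius=1.0):
    """Online gradient descent over the Euclidean ball of the given radius;
    against arbitrary convex losses it guarantees O(sqrt(T)) regret versus
    the best fixed decision in hindsight."""
    x = np.asarray(x0, dtype=float)
    plays = []
    for t, grad in enumerate(grads, start=1):
        plays.append(x.copy())
        x = x - grad(x) / np.sqrt(t)      # step size eta_t = 1 / sqrt(t)
        norm = np.linalg.norm(x)
        if norm > radius:                 # Euclidean projection onto the ball
            x *= radius / norm
    return plays

# An "adversary" plays quadratic losses f_t(x) = ||x - z_t||^2 with shifting centers.
rng = np.random.default_rng(0)
centers = rng.uniform(-0.5, 0.5, size=(200, 2))
plays = online_gradient_descent([lambda x, z=z: 2 * (x - z) for z in centers],
                                np.zeros(2))

# Regret: cumulative loss of the algorithm minus that of the best fixed point
# in hindsight, which for these quadratics is the mean of the centers.
loss = lambda x, z: float(np.sum((x - z) ** 2))
regret = (sum(loss(x, z) for x, z in zip(plays, centers))
          - sum(loss(centers.mean(axis=0), z) for z in centers))
```

The key point is that no distributional assumption is made on the loss sequence: the sublinear regret guarantee holds even when the centers are chosen adversarially.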