Recent Papers

Here are some recently published papers, grouped roughly into three directions:

Approximation Theory for deep learning

Machine learning + scientific computing + control

  • Principled Acceleration of Iterative Numerical Methods Using Machine Learning: In this paper, we investigate meta-learning-type approaches to speed up iterative algorithms in scientific computing. An example is guessing the initial condition for a Jacobi solver (say, for the Poisson equation arising as a sub-step in a Navier-Stokes solver); a minimal sketch of this setting appears after this list. We show that a naive application of meta-learning (the MAML algorithm) does not necessarily lead to performance gains, contrary to what many recent empirical works suggest. We investigate this phenomenon concretely through analytical examples and propose principled remedies. This work is published in ICML 2023.
  • Fairness In a Non-Stationary Environment From an Optimal Control Perspective: In this work, we connect fairness in machine learning with control theory: ensuring that a machine learning model remains fair to all demographic groups in a changing environment can be understood as an optimal control problem, in particular a kind of stabilisation problem. This viewpoint allows us to design control strategies that promote fairness dynamically. This work was presented as a workshop paper at ICML 2023.
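
As a concrete illustration of the setting in the first bullet above, the following minimal sketch warm-starts a Jacobi solver for a Poisson problem with an initial guess supplied by a hypothetical learned predictor. It is purely illustrative and is not the algorithm proposed in the paper; the "predictor" is a hard-coded stand-in for a trained model.

```python
# Minimal illustration (not the paper's algorithm): a Jacobi solver for the
# 2D Poisson equation -Δu = f with zero Dirichlet data, warm-started by a
# hypothetical learned predictor instead of the usual zero initial guess.
import numpy as np

def jacobi_poisson(f, u0, h, tol=1e-6, max_iter=20_000):
    """Jacobi iteration; returns the approximate solution and the iteration count."""
    u = u0.copy()
    for k in range(max_iter):
        u_new = u.copy()
        u_new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1]
                                    + u[1:-1, :-2] + u[1:-1, 2:]
                                    + h**2 * f[1:-1, 1:-1])
        if np.max(np.abs(u_new - u)) < tol:
            return u_new, k + 1
        u = u_new
    return u, max_iter

def learned_initial_guess(f):
    # Stand-in for a model trained on previously solved, similar instances.
    # Here we simply hard-code a guess close to the true solution of this
    # particular right-hand side, to mimic a well-trained predictor.
    return f / (2.0 * np.pi**2)

n = 65
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
f = np.sin(np.pi * X) * np.sin(np.pi * Y)

_, iters_cold = jacobi_poisson(f, np.zeros_like(f), h)
_, iters_warm = jacobi_poisson(f, learned_initial_guess(f), h)
print(f"zero initial guess: {iters_cold} iterations; "
      f"learned initial guess: {iters_warm} iterations")
```

The point of the paper is that obtaining such warm starts by naively applying meta-learning does not automatically pay off, which motivates the principled remedies proposed there.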

Machine learning + materials science

Recent Papers

Here are some recently published papers.

  • On stability and regularisation for data-driven solution of parabolic inverse source problems: In this paper, we investigate the stability properties of using neural networks to solve inverse source problems for parabolic equations, and propose regularisation strategies, such as gradient penalisation, that improve the solution of these inverse problems. This work is published in the Journal of Computational Physics.
  • Computing high-dimensional invariant distributions from noisy data: In this paper, we further develop methods to compute invariant distributions of SDEs from noisy sampled data. The key idea is to use a quasi-potential-type decomposition of the force field, with neural networks serving as surrogates for the decomposed components. This allows us to compute invariant distributions of high-dimensional systems, especially when the noise is small (so that the distribution becomes quite singular). This work is published in the Journal of Computational Physics.
  • Accelerating the Discovery of Rare Materials with Bounded Optimization Techniques: In this paper, we propose a modified Bayesian optimisation method that is shown to be effective for a class of optimisation problems arising in inverse design for materials discovery. This work was presented as a spotlight at the NeurIPS 2022 Workshop on AI for Accelerated Materials Design.
  • Dynamic Modeling of Intrinsic Self-Healing Polymers Using Deep Learning: In this paper, we develop a methodology to learn interpretable dynamics from experimental data on self-healing polymers. We show that even with a small, noisy dataset, an appropriate physics-based model ansatz combined with machine learning enables prediction of the healing dynamics and the study of toughness evolution as a function of time. We envision this as a first step towards principled analysis and design of self-healing polymers using deep learning. This work is published in ACS Applied Materials and Interfaces.
  • From Optimization Dynamics to Generalization Bounds via Łojasiewicz Gradient Inequality: In this work, we investigate the connection between optimisation and generalisation in machine learning. The main idea is that the length of an optimisation path can be related to generalisation estimates, and a Łojasiewicz-gradient-type inequality can be used to control the length of the optimisation trajectory, thereby yielding generalisation bounds (a short derivation sketch follows this list). We show that this method recovers a number of known generalisation estimates for machine learning models and establishes new ones. This work is published in Transactions on Machine Learning Research (TMLR).
  • Adaptive sampling methods for learning dynamical systems: In this work, we develop numerical methods for sampling the trajectories used to train dynamical models with machine learning. Unlike static supervised learning problems, a distribution shift arises here from the disparity between the sampling probability measure and the target probability measure. The sampling method proposed in this work exploits this disparity to design efficient sampling strategies. This work is published in the proceedings of Mathematical and Scientific Machine Learning (MSML).
  • Personalized Algorithm Generation: A Case Study in Learning ODE Integrators: In this work, we develop a machine learning method for accelerating Runge-Kutta (RK)-type integrators using ideas from multi-task learning. In particular, we show that if one repeatedly solves problems of a similar type, it is advantageous to adapt the integrator structure (here, the RK coefficients) to the problem class to obtain speed-ups; a toy code sketch of this idea follows this list. This work is published in the SIAM Journal on Scientific Computing.
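
For the Łojasiewicz-gradient bullet above, here is a worked sketch of how such an inequality controls the length of an optimisation trajectory, written for continuous-time gradient flow; the exponent and constant are generic placeholders, and the assumptions and the generalisation argument in the paper are more elaborate.

```latex
% Gradient flow: \dot{\theta}(t) = -\nabla L(\theta(t)).
% Assumed Łojasiewicz gradient inequality: \|\nabla L(\theta)\| \ge c\,(L(\theta)-L^*)^{\beta},
% for some c > 0 and \beta \in (0,1).
\begin{align*}
\frac{d}{dt}\,\bigl(L(\theta(t))-L^*\bigr)^{1-\beta}
  &= -(1-\beta)\,\bigl(L(\theta(t))-L^*\bigr)^{-\beta}\,\|\nabla L(\theta(t))\|^{2} \\
  &\le -(1-\beta)\,c\,\|\nabla L(\theta(t))\| \;=\; -(1-\beta)\,c\,\|\dot{\theta}(t)\| .
\end{align*}
% Integrating from 0 to T bounds the trajectory length:
\begin{equation*}
\int_0^T \|\dot{\theta}(t)\|\,dt \;\le\; \frac{\bigl(L(\theta(0))-L^*\bigr)^{1-\beta}}{c\,(1-\beta)} .
\end{equation*}
```

A bound of this type on the path length is then converted into generalisation estimates within the paper's framework.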
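
And for the last bullet, the following toy sketch illustrates, in the simplest possible terms, what adapting RK coefficients to a problem class can mean: a two-stage explicit RK method has one free coefficient, which can be tuned on a small family of similar ODEs. This is a caricature for illustration only, not the method or the setting of the paper.

```python
# Toy illustration (not the paper's method): tune the free coefficient of a
# two-stage explicit Runge-Kutta scheme on a family of similar ODEs.
import numpy as np

def rk2_step(f, t, y, h, alpha):
    # Generic second-order two-stage explicit RK method with free parameter alpha:
    # alpha = 1/2 gives the midpoint method, alpha = 1 gives Heun's method.
    k1 = f(t, y)
    k2 = f(t + alpha * h, y + alpha * h * k1)
    return y + h * ((1.0 - 1.0 / (2.0 * alpha)) * k1 + (1.0 / (2.0 * alpha)) * k2)

def solve(f, y0, T, n_steps, alpha):
    y, t, h = y0, 0.0, T / n_steps
    for _ in range(n_steps):
        y = rk2_step(f, t, y, h, alpha)
        t += h
    return y

# Problem family: logistic equations y' = lam * y * (1 - y) with lam in a narrow
# range, mimicking "repeatedly solving problems of a similar type".
rng = np.random.default_rng(0)
lams = rng.uniform(2.0, 3.0, size=20)
y0, T, n_steps = 0.1, 1.0, 8

def exact(lam, t):
    return y0 / (y0 + (1.0 - y0) * np.exp(-lam * t))

def mean_error(alpha):
    errs = [abs(solve(lambda t, y, lam=lam: lam * y * (1.0 - y), y0, T, n_steps, alpha)
                - exact(lam, T)) for lam in lams]
    return float(np.mean(errs))

# Crude grid search over the coefficient, standing in for the learning procedure.
alphas = np.linspace(0.3, 1.0, 71)
best = min(alphas, key=mean_error)
print(f"midpoint (alpha=0.5): error {mean_error(0.5):.2e}; "
      f"adapted alpha={best:.2f}: error {mean_error(best):.2e}")
```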

Recent Papers

Here are some recently published papers.

  • Approximation and Optimization Theory for Linear Continuous-Time Recurrent Neural Networks: In this paper, we extend our previous results on the approximation and optimization properties of recurrent neural networks, paying particular attention to the effect of memory on the RNN hypothesis space and training performance. In particular, we prove a further inverse approximation theorem, which, together with prior results, shows that RNNs are effective if and only if the target relationship possesses exponentially decaying memory (a schematic formulation appears after this list). This paper is published in the Journal of Machine Learning Research (JMLR).
  • On the approximation properties of recurrent encoder-decoder architectures: In this paper, we extend the approximation results for sequence modelling in deep learning to encoder-decoder-type structures (an important component of the transformer, for example). We show that encoder-decoder structures are especially effective at modelling sequence relationships with a “low rank” structure in the form of temporal products. This paper is published in the International Conference on Learning Representations (ICLR 2022).
  • Unraveling Model-Agnostic Meta-Learning via The Adaptation Learning Rate: In this paper, we analyze the popular MAML algorithm for meta-learning, paying particular attention to the role (and the choice) of the inner-loop adaptation learning rate. This contributes to the theoretical understanding of MAML. This paper is published in the International Conference on Learning Representations (ICLR 2022).
  • Deep learning via dynamical systems: An approximation perspective: Our paper on sufficient conditions for approximation via deep learning is now online at the Journal of the European Mathematical Society.
  • An Annealing Mechanism for Adversarial Training Acceleration: In this paper, we extend our previous work on accelerating adversarial training using optimal control. This is published in IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
  • Computing the Invariant Distribution of Randomly Perturbed Dynamical Systems Using Deep Learning: In this paper, we propose a method to compute invariant distributions of high-dimensional stochastic dynamics via a decomposition of the force field combined with deep neural network representations of the component forces (a sketch of the decomposition idea follows this list). This can be used to analyze the potential landscape of stochastic dynamical systems, such as gene regulatory networks. This work is published in the Journal of Scientific Computing.
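
To make the memory statement in the first bullet above concrete, here is a schematic of the linear continuous-time RNN setting; the notation is generic and chosen for illustration rather than copied from the paper.

```latex
% Linear continuous-time RNN with hidden state h(t), input x(t) and scalar output y(t):
\begin{equation*}
\dot{h}(t) = W h(t) + U x(t), \qquad y(t) = c^{\top} h(t),
\end{equation*}
% which, with h(-\infty) = 0, defines the linear functional
\begin{equation*}
y(t) = \int_0^{\infty} \rho(s)\, x(t-s)\, ds, \qquad \rho(s) = c^{\top} e^{sW} U .
\end{equation*}
% When W is stable (Hurwitz), the memory kernel \rho decays exponentially, so the
% hypothesis space consists of input-output relationships with exponentially decaying
% memory; the direct and inverse approximation theorems say that a target is efficiently
% approximable by such RNNs essentially if and only if its memory decays in this way.
```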
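
For the last bullet, the following is one common form of the decomposition idea behind such methods, included only to make the description concrete; the precise decomposition and conditions used in the paper may differ.

```latex
% Randomly perturbed dynamics with noise strength \varepsilon, and a decomposition of the drift:
\begin{equation*}
dX_t = f(X_t)\,dt + \sqrt{2\varepsilon}\, dW_t, \qquad f = -\nabla V + g .
\end{equation*}
% Candidate invariant density and the stationary Fokker--Planck equation it must satisfy:
\begin{equation*}
p_\varepsilon(x) \propto e^{-V(x)/\varepsilon}, \qquad
\nabla\cdot\bigl(f\,p_\varepsilon\bigr) = \varepsilon\,\Delta p_\varepsilon .
\end{equation*}
% Direct substitution shows that p_\varepsilon is invariant provided
% \nabla\cdot\bigl(g\, e^{-V/\varepsilon}\bigr) = 0, which holds in particular when
% g\cdot\nabla V = 0 and \nabla\cdot g = 0.
```

Representing the components (such as V) with neural networks then gives the invariant distribution directly through the decomposition, even in the small-noise regime where the density is highly concentrated.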

Recent papers

An update on some recent papers on machine learning applications to the sciences and on algorithm development.

Recent papers

Here are a number of recently accepted or published papers in machine learning and computer vision:

  • Approximation Theory of Convolutional Architectures for Time Series Modelling: In this paper, we develop approximation theory for convolution-based architectures for time series analysis, with WaveNet as a prime example. This can be seen as a parallel to the approach taken in our previous paper, but for convolutional networks instead of recurrent networks. Our key finding is that convolutional structures exploit certain “effective low rank” structures for efficient approximation, which can be very different from the exponentially decaying memory structures that RNNs capture; a minimal sketch of the dilated convolutional architecture appears after this list. This paper will appear at ICML 2021.
  • Adversarial Invariant Learning: In this paper, we develop methods that use adversarially chosen data splits to tackle out-of-distribution generalization problems. This paper is published at CVPR 2021.
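
To make the architecture class in the first bullet above concrete, here is a minimal numpy sketch of a stack of dilated causal convolutions (the WaveNet building block), showing how the receptive field grows exponentially with depth. It is illustrative only and not code from the paper.

```python
# Minimal sketch of stacked dilated causal 1D convolutions (WaveNet-style),
# probing the receptive field with a unit impulse. Purely illustrative.
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """y[t] = sum_k w[k] * x[t - k*dilation], with zero padding for t < 0."""
    y = np.zeros_like(x)
    for k in range(len(w)):
        shift = k * dilation
        y[shift:] += w[k] * x[:len(x) - shift]
    return y

T, depth, kernel_size = 64, 4, 2
x = np.zeros(T)
x[0] = 1.0  # unit impulse: nonzero outputs reveal the receptive field

h = x
for layer in range(depth):
    w = np.ones(kernel_size)              # placeholder weights
    h = causal_dilated_conv(h, w, dilation=2 ** layer)

receptive_field = int(np.max(np.nonzero(h))) + 1
print(f"depth {depth}, kernel {kernel_size}: receptive field = {receptive_field} steps")
# With dilations 1, 2, 4, ..., the receptive field is (kernel_size - 1)*(2^depth - 1) + 1,
# so it grows exponentially in depth while the parameter count grows only linearly.
```

How architectures of this kind exploit “effective low rank” structure in the target relationship is the subject of the paper.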

We also have a number of recent papers on the application of machine learning to science and engineering.

Paper accepted at JEMS

Our paper Deep Learning via Dynamical Systems: An Approximation Perspective has been accepted at the Journal of the European Mathematical Society (JEMS).

In this paper, we set up the mathematical foundations of the approximation theory of deep learning, idealized as continuous dynamical systems. Our main result is a set of general sufficient conditions on the flow field that imply universal approximation by such networks (a schematic formulation appears below). This is a first step towards uncovering the power of composition (which, in continuous time, is just dynamics) for the approximation of functions.
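
Schematically, the continuous-time idealization studied in the paper can be written as follows; the notation here is generic and for illustration.

```latex
% A deep network idealized as a controlled flow: the input x \in \mathbb{R}^d is evolved by
\begin{equation*}
\dot{z}(t) = f_{\theta(t)}\bigl(z(t)\bigr), \qquad z(0) = x, \qquad t \in [0, T],
\end{equation*}
% and the prediction is a readout of the terminal state, \hat{F}(x) = g\bigl(z(T)\bigr).
% Universal approximation asks when functions of the form g \circ \varphi_T, with \varphi_T
% the time-T flow map, can approximate a given target F; the main result gives sufficient
% conditions on the flow fields f_\theta for this to hold.
```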

Accepted Papers at ICLR and AAAI 2021

Two co-authored papers have been accepted for publication at ICLR 2021.

A co-authored paper has been accepted at AAAI 2021.