Recent Papers

Here are some recently published papers.

  • Approximation and Optimization Theory for Linear Continuous-Time Recurrent Neural Networks: In this paper, we extend our previous results on the approximation and optimization properties of recurrent neural networks, paying particular attention to the effect of memory on the RNN hypothesis space and on training performance. In particular, we prove a further inverse approximation theorem, which together with our prior results shows that RNNs are effective if and only if the target relationship possesses exponentially decaying memory. This paper is published in the Journal of Machine Learning Research (JMLR).
  • On the approximation properties of recurrent encoder-decoder architectures: In this paper, we extend the approximation results for sequence modelling in deep learning to encoder-decoder structures (an important component of the transformer, for example). We show that encoder-decoder structures are especially effective at modelling sequence relationships with a “low rank” structure in the form of temporal products. This paper is published at the International Conference on Learning Representations (ICLR 2022).
  • Unraveling Model-Agnostic Meta-Learning via The Adaptation Learning Rate: In this paper, we analyze the popular MAML algorithm for meta-learning, paying particular attention to the role (and the choice) of the inner-loop adaptation learning rate (a toy sketch of this inner-loop step appears after this list). This contributes to the theoretical understanding of MAML. This paper is published at the International Conference on Learning Representations (ICLR 2022).
  • Deep learning via dynamical systems: An approximation perspective: Our paper on sufficient conditions for approximation via deep learning is now online at the Journal of the European Mathematical Society.
  • An Annealing Mechanism for Adversarial Training Acceleration: In this paper, we extend our previous work on accelerating adversarial training using optimal control. This is published in IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
  • Computing the Invariant Distribution of Randomly Perturbed Dynamical Systems Using Deep Learning: In this paper, we propose a method to compute invariant distributions of high-dimensional stochastic dynamics via a decomposition method combined with a deep neural network representation of the component forces. This can be used to analyze the potential landscape of stochastic dynamical systems, such as gene regulation networks. This work is published in the Journal of Scientific Computing.
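
For readers less familiar with the MAML setup mentioned above, here is a minimal first-order sketch of the inner-loop adaptation step and its learning rate alpha, written for a toy one-dimensional regression problem. The toy task, variable names, and outer loop below are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

# Minimal first-order MAML sketch on a 1-D least-squares task.
# The inner-loop adaptation learning rate `alpha` is the quantity whose
# role and choice the paper analyzes; the toy task and outer loop here
# are illustrative assumptions, not the paper's actual setup.

rng = np.random.default_rng(0)

def task_batch(w_true, n=20):
    """Sample a regression task y = w_true * x + noise."""
    x = rng.normal(size=n)
    y = w_true * x + 0.1 * rng.normal(size=n)
    return x, y

def loss_grad(w, x, y):
    """Gradient of the mean squared error 0.5 * mean((w*x - y)^2) w.r.t. w."""
    return np.mean((w * x - y) * x)

def inner_adapt(w_meta, x, y, alpha, steps=1):
    """Inner loop: adapt the meta-parameter with a few gradient steps."""
    w = w_meta
    for _ in range(steps):
        w = w - alpha * loss_grad(w, x, y)
    return w

# First-order outer loop: update the meta-parameter using the
# post-adaptation gradient on a held-out (query) batch of each task.
w_meta, alpha, beta = 0.0, 0.5, 0.05   # alpha: inner rate, beta: outer rate
for it in range(500):
    w_task = rng.normal(loc=1.0, scale=0.5)          # sample a task
    x_s, y_s = task_batch(w_task)                    # support set
    x_q, y_q = task_batch(w_task)                    # query set
    w_adapted = inner_adapt(w_meta, x_s, y_s, alpha)
    w_meta -= beta * loss_grad(w_adapted, x_q, y_q)  # first-order update

print(f"meta-parameter after training: {w_meta:.3f}")
```

In this sketch, `alpha` controls how aggressively each task adapts the shared meta-parameter in the inner loop before the outer update is taken, which is the quantity whose role the paper studies.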
