Podcast/Video
Neural Network Growth for Frugal AI: a functional analysis viewpoint
Edited by:
Guillaume Charpiat, Erwin Schrödinger International Institute for Mathematics and Physics (ESI) in Vienna
Summary
This talk was part of the Thematic Programme on "Infinite-dimensional Geometry: Theory and Applications" held at the ESI at the beginning of 2025. Summary of the talk:
“Machine learning tasks are generally formulated as optimization problems, where one searches for an optimal function within a certain functional space. In practice, parameterized functional spaces are considered, in order to be able to perform gradient descent. Typically, a neural network architecture is chosen and fixed, and its parameters (connection weights) are optimized, yielding an architecture-dependent result. This way of proceeding however forces the evolution of the function during training to lie within the realm of what is expressible with the chosen architecture, and prevents any optimization across architectures. Costly architectural hyper-parameter optimization is often performed to compensate for this.”
Instead, Guillaume Charpiat proposes to adapt the architecture on the fly during training. He shows that the information about desirable architectural changes, arising from expressivity bottlenecks encountered when attempting to follow the functional gradient, can be extracted from backpropagation. To do this, he proposes a mathematical definition of expressivity bottlenecks, which enables him to detect, quantify and resolve them during training by adding suitable neurons. Thus, while the standard approach requires networks that are large in terms of neurons per layer, for both expressivity and optimization reasons, Guillaume Charpiat provides a framework for growing an architecture from a very small number of neurons.
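For readers who want a concrete picture of the idea, here is a minimal sketch of network growth during training. It is not the speaker's method: his expressivity-bottleneck measure is derived from the functional gradient via backpropagation, whereas this toy script substitutes a crude stall-based trigger (training loss no longer decreasing), and the layer sizes, threshold and toy regression task are illustrative assumptions. What it does show is the mechanics of widening a hidden layer mid-training while leaving the computed function unchanged.

```python
# Minimal sketch of growing a network during training (PyTorch).
# NOT the speaker's method: his expressivity-bottleneck measure comes from
# the functional gradient via backpropagation; here a crude stall-based
# trigger stands in for it. Sizes, threshold and data are illustrative.
import torch
import torch.nn as nn

def widen(fc_in: nn.Linear, fc_out: nn.Linear):
    """Add one neuron to the hidden layer between two Linear maps,
    initialised so that the network's function is unchanged."""
    h = fc_in.out_features
    new_in = nn.Linear(fc_in.in_features, h + 1)
    new_out = nn.Linear(h + 1, fc_out.out_features)
    with torch.no_grad():
        new_in.weight[:h] = fc_in.weight
        new_in.bias[:h] = fc_in.bias
        new_in.weight[h].normal_(0.0, 1e-2)   # fresh neuron: small fan-in
        new_in.bias[h] = 0.0
        new_out.weight[:, :h] = fc_out.weight
        new_out.weight[:, h] = 0.0            # zero fan-out: output preserved
        new_out.bias.copy_(fc_out.bias)
    return new_in, new_out

torch.manual_seed(0)
x = torch.randn(256, 2)
y = torch.sin(3 * x[:, :1]) * torch.cos(2 * x[:, 1:])  # toy regression target

fc1, fc2 = nn.Linear(2, 2), nn.Linear(2, 1)            # start very small
opt = torch.optim.SGD(list(fc1.parameters()) + list(fc2.parameters()), lr=0.05)
prev_loss = float("inf")
for step in range(4000):
    loss = ((fc2(torch.tanh(fc1(x))) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 200 == 199:                     # periodic growth check
        # stand-in bottleneck test: loss is high but has stopped improving
        if prev_loss - loss.item() < 1e-3 and loss.item() > 1e-3:
            fc1, fc2 = widen(fc1, fc2)
            opt = torch.optim.SGD(list(fc1.parameters()) + list(fc2.parameters()), lr=0.05)
            print(f"step {step}: widened hidden layer to {fc1.out_features} neurons")
        prev_loss = loss.item()
```

The zero fan-out initialisation is what makes the growth safe: the new neuron changes nothing at first, and gradient descent then recruits it only where the smaller architecture was the limiting factor.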
The editorial perspective
The talk goes back over how neural networks work. It is particularly technical and clearly aimed at the more advanced among us. Non-scientific minds, beware!
That said, even though the content is very technical, the speaker is especially good at explaining things, which makes the concepts easier to grasp. However, the talk moves fast, very fast, so multiple viewings are necessary to fully absorb the content if you are not already familiar with the technical concepts of machine learning and deep learning.
The editorial perspective in brief
Pros
- In-depth technical insights
- Frugality approaches explained in detail
- Concepts illustrated with numerous diagrams
Cons
- The pace of the talk is very fast, sometimes a bit too quick to fully grasp the concepts… but the upside of an online video is that you can rewatch it endlessly!
Publication date
February 2025
Available in
- English
License
Intellectual property of the author