When are solutions connected in deep networks?

Nguyen Q, Bréchet P, Mondelli M. When are solutions connected in deep networks? 35th Conference on Neural Information Processing Systems.

Conference Paper | Accepted | English
Author
Nguyen, Quynh; Bréchet, Pierre; Mondelli, Marco (ISTA)
Abstract
The question of how and why the phenomenon of mode connectivity occurs in training deep neural networks has gained remarkable attention in the research community. From a theoretical perspective, two possible explanations have been proposed: (i) the loss function has connected sublevel sets, and (ii) the solutions found by stochastic gradient descent are dropout stable. While these explanations provide insights into the phenomenon, their assumptions are not always satisfied in practice. In particular, the first approach requires the network to have one layer with order of N neurons (N being the number of training samples), while the second one requires the loss to be almost invariant after removing half of the neurons at each layer (up to some rescaling of the remaining ones). In this work, we improve both conditions by exploiting the quality of the features at every intermediate layer together with a milder over-parameterization condition. More specifically, we show that: (i) under generic assumptions on the features of intermediate layers, it suffices that the last two hidden layers have order of √N neurons, and (ii) if subsets of features at each layer are linearly separable, then no over-parameterization is needed to show the connectivity. Our experiments confirm that the proposed condition ensures the connectivity of solutions found by stochastic gradient descent, even in settings where the previous requirements do not hold.
Publishing Year
2021
Date Published
2021-12-01
Proceedings Title
35th Conference on Neural Information Processing Systems
Acknowledgement
MM was partially supported by the 2019 Lopez-Loreta Prize. QN and PB acknowledge support from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no 757983).
Conference
35th Conference on Neural Information Processing Systems
Conference Location
Virtual
Conference Date
2021-12-06 – 2021-12-14

Cite this

Nguyen Q, Bréchet P, Mondelli M. When are solutions connected in deep networks? In: 35th Conference on Neural Information Processing Systems. NeurIPS.
Nguyen, Q., Bréchet, P., & Mondelli, M. (n.d.). When are solutions connected in deep networks? In 35th Conference on Neural Information Processing Systems. Virtual: NeurIPS.
Nguyen, Quynh, Pierre Bréchet, and Marco Mondelli. “When Are Solutions Connected in Deep Networks?” In 35th Conference on Neural Information Processing Systems. NeurIPS, n.d.
Q. Nguyen, P. Bréchet, and M. Mondelli, “When are solutions connected in deep networks?,” in 35th Conference on Neural Information Processing Systems, Virtual.
Nguyen Q, Bréchet P, Mondelli M. When are solutions connected in deep networks? 35th Conference on Neural Information Processing Systems.
Nguyen, Quynh, et al. “When Are Solutions Connected in Deep Networks?” 35th Conference on Neural Information Processing Systems, NeurIPS.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]

Link(s) to Main File(s)
Access Level
OA Open Access

Sources

arXiv 2102.09671
