{"publication":"34th Conference on Neural Information Processing Systems","quality_controlled":"1","external_id":{"arxiv":["2002.07867"]},"date_published":"2020-07-07T00:00:00Z","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"main_file_link":[{"url":"https://arxiv.org/abs/2002.07867","open_access":"1"}],"day":"07","status":"public","conference":{"start_date":"2020-12-06","name":"NeurIPS: Neural Information Processing Systems","end_date":"2020-12-12","location":"Vancouver, Canada"},"citation":{"chicago":"Nguyen, Quynh, and Marco Mondelli. “Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology.” In 34th Conference on Neural Information Processing Systems, 33:11961–11972. Curran Associates, 2020.","ama":"Nguyen Q, Mondelli M. Global convergence of deep networks with one wide layer followed by pyramidal topology. In: 34th Conference on Neural Information Processing Systems. Vol 33. Curran Associates; 2020:11961–11972.","ieee":"Q. Nguyen and M. Mondelli, “Global convergence of deep networks with one wide layer followed by pyramidal topology,” in 34th Conference on Neural Information Processing Systems, Vancouver, Canada, 2020, vol. 33, pp. 11961–11972.","short":"Q. Nguyen, M. Mondelli, in:, 34th Conference on Neural Information Processing Systems, Curran Associates, 2020, pp. 11961–11972.","apa":"Nguyen, Q., & Mondelli, M. (2020). Global convergence of deep networks with one wide layer followed by pyramidal topology. In 34th Conference on Neural Information Processing Systems (Vol. 33, pp. 11961–11972). Vancouver, Canada: Curran Associates.","ista":"Nguyen Q, Mondelli M. 2020. Global convergence of deep networks with one wide layer followed by pyramidal topology. 34th Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems vol. 33, 11961–11972.","mla":"Nguyen, Quynh, and Marco Mondelli. “Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology.” 34th Conference on Neural Information Processing Systems, vol. 33, Curran Associates, 2020, pp. 11961–11972."},"user_id":"8b945eb4-e2f2-11eb-945a-df72226e66a9","abstract":[{"text":"Recent works have shown that gradient descent can find a global minimum for over-parameterized neural networks where the widths of all the hidden layers scale polynomially with N (N being the number of training samples). In this paper, we prove that, for deep networks, a single layer of width N following the input layer suffices to ensure a similar guarantee. In particular, all the remaining layers are allowed to have constant widths, and form a pyramidal topology. 
We show an application of our result to the widely used LeCun’s initialization and obtain an over-parameterization requirement for the single wide layer of order N^2.","lang":"eng"}],"publisher":"Curran Associates","article_processing_charge":"No","language":[{"iso":"eng"}],"date_created":"2021-03-03T12:06:02Z","page":"11961–11972","date_updated":"2022-01-04T09:24:41Z","month":"07","volume":33,"_id":"9221","title":"Global convergence of deep networks with one wide layer followed by pyramidal topology","department":[{"_id":"MaMo"}],"author":[{"first_name":"Quynh","full_name":"Nguyen, Quynh","last_name":"Nguyen"},{"orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"}],"type":"conference","year":"2020","oa_version":"Preprint","oa":1,"publication_status":"published","acknowledgement":"The authors would like to thank Jan Maas, Mahdi Soltanolkotabi, and Daniel Soudry for the helpful discussions, Marius Kloft, Matthias Hein and Quoc Dinh Tran for proofreading portions of a prior version of this paper, and James Martens for a clarification concerning LeCun’s initialization. M. Mondelli was partially supported by the 2019 Lopez-Loreta Prize. Q. Nguyen was partially supported by the German Research Foundation (DFG) award KL 2698/2-1.","intvolume":" 33"}
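The abstract describes a specific architecture: one hidden layer of width N (the number of training samples) directly after the input, followed by layers of constant, non-increasing widths forming a pyramidal topology. The following is a minimal sketch of such a network, assuming PyTorch; the function name, the example widths, and the input/output dimensions are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not the authors' code) of the architecture from the abstract:
# a first hidden layer of width N (number of training samples) right after the
# input, followed by a pyramid of narrower, constant-width hidden layers.
import torch.nn as nn


def pyramidal_net(d_in: int, n_samples: int, pyramid_widths=(64, 32, 16), d_out: int = 1):
    """Build an MLP whose first hidden layer has width n_samples and whose
    remaining hidden layers have non-increasing (pyramidal) widths."""
    widths = [d_in, n_samples, *pyramid_widths, d_out]
    layers = []
    for w_in, w_out in zip(widths[:-1], widths[1:]):
        layers.append(nn.Linear(w_in, w_out))
        layers.append(nn.ReLU())
    layers.pop()  # drop the activation after the output layer
    return nn.Sequential(*layers)


# Example: N = 100 training samples in input dimension d = 20.
model = pyramidal_net(d_in=20, n_samples=100)
print(model)
```

Under LeCun's initialization, the paper's guarantee instead requires the single wide layer to have width of order N^2; the sketch above only illustrates the layer-width pattern, not the initialization or the training procedure analyzed in the paper.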