--- _id: '15172' abstract: - lang: eng text: 'We propose a novel approach to concentration for non-independent random variables. The main idea is to “pretend” that the random variables are independent and pay a multiplicative price measuring how far they are from actually being independent. This price is encapsulated in the Hellinger integral between the joint and the product of the marginals, which is then upper bounded leveraging tensorisation properties. Our bounds represent a natural generalisation of concentration inequalities in the presence of dependence: we recover exactly the classical bounds (McDiarmid’s inequality) when the random variables are independent. Furthermore, in a “large deviations” regime, we obtain the same decay in the probability as for the independent case, even when the random variables display non-trivial dependencies. To show this, we consider a number of applications of interest. First, we provide a bound for Markov chains with finite state space. Then, we consider the Simple Symmetric Random Walk, which is a non-contracting Markov chain, and a non-Markovian setting in which the stochastic process depends on its entire past. To conclude, we propose an application to Markov Chain Monte Carlo methods, where our approach leads to an improved lower bound on the minimum burn-in period required to reach a certain accuracy. In all of these settings, we provide a regime of parameters in which our bound fares better than what the state of the art can provide.' article_processing_charge: No article_type: original author: - first_name: Amedeo Roberto full_name: Esposito, Amedeo Roberto id: 9583e921-e1ad-11ec-9862-cef099626dc9 last_name: Esposito - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: Esposito AR, Mondelli M. Concentration without independence via information measures. IEEE Transactions on Information Theory. doi:10.1109/TIT.2024.3367767 apa: Esposito, A. R., & Mondelli, M. (n.d.). Concentration without independence via information measures. IEEE Transactions on Information Theory. IEEE. https://doi.org/10.1109/TIT.2024.3367767 chicago: Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence via Information Measures.” IEEE Transactions on Information Theory. IEEE, n.d. https://doi.org/10.1109/TIT.2024.3367767. ieee: A. R. Esposito and M. Mondelli, “Concentration without independence via information measures,” IEEE Transactions on Information Theory. IEEE. ista: Esposito AR, Mondelli M. Concentration without independence via information measures. IEEE Transactions on Information Theory. mla: Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence via Information Measures.” IEEE Transactions on Information Theory, IEEE, doi:10.1109/TIT.2024.3367767. short: A.R. Esposito, M. Mondelli, IEEE Transactions on Information Theory (n.d.). 
date_created: 2024-03-24T23:01:00Z date_published: 2024-02-20T00:00:00Z date_updated: 2024-03-25T07:15:51Z day: '20' department: - _id: MaMo doi: 10.1109/TIT.2024.3367767 external_id: arxiv: - '2303.07245' language: - iso: eng month: '02' oa_version: None project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: IEEE Transactions on Information Theory publication_identifier: eissn: - 1557-9654 issn: - 0018-9448 publication_status: inpress publisher: IEEE quality_controlled: '1' related_material: record: - id: '14922' relation: earlier_version status: public scopus_import: '1' status: public title: Concentration without independence via information measures type: journal_article user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2024' ... --- _id: '13315' abstract: - lang: eng text: How do statistical dependencies in measurement noise influence high-dimensional inference? To answer this, we study the paradigmatic spiked matrix model of principal components analysis (PCA), where a rank-one matrix is corrupted by additive noise. We go beyond the usual independence assumption on the noise entries by drawing the noise from a low-order polynomial orthogonal matrix ensemble. The resulting noise correlations make the setting relevant for applications but analytically challenging. We provide a characterization of the Bayes optimal limits of inference in this model. If the spike is rotation invariant, we show that standard spectral PCA is optimal. However, for more general priors, both PCA and the existing approximate message-passing algorithm (AMP) fall short of achieving the information-theoretic limits, which we compute using the replica method from statistical physics. We thus propose an AMP, inspired by the theory of adaptive Thouless–Anderson–Palmer equations, which is empirically observed to saturate the conjectured theoretical limit. This AMP comes with a rigorous state evolution analysis tracking its performance. Although we focus on specific noise distributions, our methodology can be generalized to a wide class of trace matrix ensembles at the cost of more involved expressions. Finally, despite the seemingly strong assumption of rotation-invariant noise, our theory empirically predicts algorithmic performance on real data, pointing at strong universality properties. acknowledgement: J.B. was funded by the European Union (ERC, CHORAL, project number 101039794). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. M.M. was supported by the 2019 Lopez-Loreta Prize. We would like to thank the reviewers for the insightful comments and, in particular, for suggesting the BAMP-inspired denoisers leading to AMP-AP. article_number: e2302028120 article_processing_charge: Yes (in subscription journal) article_type: original author: - first_name: Jean full_name: Barbier, Jean last_name: Barbier - first_name: Francesco full_name: Camilli, Francesco last_name: Camilli - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: Manuel full_name: Sáenz, Manuel last_name: Sáenz citation: ama: Barbier J, Camilli F, Mondelli M, Sáenz M. Fundamental limits in structured principal component analysis and how to reach them.
Proceedings of the National Academy of Sciences of the United States of America. 2023;120(30). doi:10.1073/pnas.2302028120 apa: Barbier, J., Camilli, F., Mondelli, M., & Sáenz, M. (2023). Fundamental limits in structured principal component analysis and how to reach them. Proceedings of the National Academy of Sciences of the United States of America. National Academy of Sciences. https://doi.org/10.1073/pnas.2302028120 chicago: Barbier, Jean, Francesco Camilli, Marco Mondelli, and Manuel Sáenz. “Fundamental Limits in Structured Principal Component Analysis and How to Reach Them.” Proceedings of the National Academy of Sciences of the United States of America. National Academy of Sciences, 2023. https://doi.org/10.1073/pnas.2302028120. ieee: J. Barbier, F. Camilli, M. Mondelli, and M. Sáenz, “Fundamental limits in structured principal component analysis and how to reach them,” Proceedings of the National Academy of Sciences of the United States of America, vol. 120, no. 30. National Academy of Sciences, 2023. ista: Barbier J, Camilli F, Mondelli M, Sáenz M. 2023. Fundamental limits in structured principal component analysis and how to reach them. Proceedings of the National Academy of Sciences of the United States of America. 120(30), e2302028120. mla: Barbier, Jean, et al. “Fundamental Limits in Structured Principal Component Analysis and How to Reach Them.” Proceedings of the National Academy of Sciences of the United States of America, vol. 120, no. 30, e2302028120, National Academy of Sciences, 2023, doi:10.1073/pnas.2302028120. short: J. Barbier, F. Camilli, M. Mondelli, M. Sáenz, Proceedings of the National Academy of Sciences of the United States of America 120 (2023). date_created: 2023-07-30T22:01:02Z date_published: 2023-07-25T00:00:00Z date_updated: 2023-10-17T11:44:55Z day: '25' ddc: - '000' department: - _id: MaMo doi: 10.1073/pnas.2302028120 external_id: pmid: - '37463204' file: - access_level: open_access checksum: 1fc06228afdb3aa80cf8e7766bcf9dc5 content_type: application/pdf creator: dernst date_created: 2023-07-31T07:30:48Z date_updated: 2023-07-31T07:30:48Z file_id: '13323' file_name: 2023_PNAS_Barbier.pdf file_size: 995933 relation: main_file success: 1 file_date_updated: 2023-07-31T07:30:48Z has_accepted_license: '1' intvolume: ' 120' issue: '30' language: - iso: eng month: '07' oa: 1 oa_version: Published Version pmid: 1 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: Proceedings of the National Academy of Sciences of the United States of America publication_identifier: eissn: - 1091-6490 publication_status: published publisher: National Academy of Sciences quality_controlled: '1' related_material: link: - relation: software url: https://github.com/fcamilli95/Structured-PCA- scopus_import: '1' status: public title: Fundamental limits in structured principal component analysis and how to reach them tmp: image: /images/cc_by.png legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0) short: CC BY (4.0) type: journal_article user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 120 year: '2023' ... --- _id: '14459' abstract: - lang: eng text: Autoencoders are a popular model in many branches of machine learning and lossy data compression. However, their fundamental limits, the performance of gradient methods and the features learnt during optimization remain poorly understood, even in the two-layer setting. 
In fact, earlier work has considered either linear autoencoders or specific training regimes (leading to vanishing or diverging compression rates). Our paper addresses this gap by focusing on non-linear two-layer autoencoders trained in the challenging proportional regime in which the input dimension scales linearly with the size of the representation. Our results characterize the minimizers of the population risk, and show that such minimizers are achieved by gradient methods; their structure is also unveiled, thus leading to a concise description of the features obtained via training. For the special case of a sign activation function, our analysis establishes the fundamental limits for the lossy compression of Gaussian sources via (shallow) autoencoders. Finally, while the results are proved for Gaussian data, numerical simulations on standard datasets display the universality of the theoretical predictions. acknowledgement: Aleksandr Shevchenko, Kevin Kögler and Marco Mondelli are supported by the 2019 Lopez-Loreta Prize. Hamed Hassani acknowledges the support by the NSF CIF award (1910056) and the NSF Institute for CORE Emerging Methods in Data Science (EnCORE). alternative_title: - PMLR article_processing_charge: No author: - first_name: Aleksandr full_name: Shevchenko, Aleksandr id: F2B06EC2-C99E-11E9-89F0-752EE6697425 last_name: Shevchenko - first_name: Kevin full_name: Kögler, Kevin id: 94ec913c-dc85-11ea-9058-e5051ab2428b last_name: Kögler - first_name: Hamed full_name: Hassani, Hamed last_name: Hassani - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: 'Shevchenko A, Kögler K, Hassani H, Mondelli M. Fundamental limits of two-layer autoencoders, and achieving them with gradient methods. In: Proceedings of the 40th International Conference on Machine Learning. Vol 202. ML Research Press; 2023:31151-31209.' apa: 'Shevchenko, A., Kögler, K., Hassani, H., & Mondelli, M. (2023). Fundamental limits of two-layer autoencoders, and achieving them with gradient methods. In Proceedings of the 40th International Conference on Machine Learning (Vol. 202, pp. 31151–31209). Honolulu, HI, United States: ML Research Press.' chicago: Shevchenko, Aleksandr, Kevin Kögler, Hamed Hassani, and Marco Mondelli. “Fundamental Limits of Two-Layer Autoencoders, and Achieving Them with Gradient Methods.” In Proceedings of the 40th International Conference on Machine Learning, 202:31151–209. ML Research Press, 2023. ieee: A. Shevchenko, K. Kögler, H. Hassani, and M. Mondelli, “Fundamental limits of two-layer autoencoders, and achieving them with gradient methods,” in Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, United States, 2023, vol. 202, pp. 31151–31209. ista: 'Shevchenko A, Kögler K, Hassani H, Mondelli M. 2023. Fundamental limits of two-layer autoencoders, and achieving them with gradient methods. Proceedings of the 40th International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 202, 31151–31209.' mla: Shevchenko, Aleksandr, et al. “Fundamental Limits of Two-Layer Autoencoders, and Achieving Them with Gradient Methods.” Proceedings of the 40th International Conference on Machine Learning, vol. 202, ML Research Press, 2023, pp. 31151–209. short: A. Shevchenko, K. Kögler, H. Hassani, M. Mondelli, in:, Proceedings of the 40th International Conference on Machine Learning, ML Research Press, 2023, pp.
31151–31209. conference: end_date: 2023-07-29 location: Honolulu, HI, United States name: 'ICML: International Conference on Machine Learning' start_date: 2023-07-23 date_created: 2023-10-29T23:01:17Z date_published: 2023-07-30T00:00:00Z date_updated: 2023-10-31T08:52:28Z day: '30' department: - _id: MaMo - _id: DaAl external_id: arxiv: - '2212.13468' intvolume: ' 202' language: - iso: eng main_file_link: - open_access: '1' url: https://doi.org/10.48550/arXiv.2212.13468 month: '07' oa: 1 oa_version: Preprint page: 31151-31209 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: Proceedings of the 40th International Conference on Machine Learning publication_identifier: eissn: - 2640-3498 publication_status: published publisher: ML Research Press quality_controlled: '1' scopus_import: '1' status: public title: Fundamental limits of two-layer autoencoders, and achieving them with gradient methods type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 202 year: '2023' ... --- _id: '13321' abstract: - lang: eng text: We consider the problem of reconstructing the signal and the hidden variables from observations coming from a multi-layer network with rotationally invariant weight matrices. The multi-layer structure models inference from deep generative priors, and the rotational invariance imposed on the weights generalizes the i.i.d. Gaussian assumption by allowing for a complex correlation structure, which is typical in applications. In this work, we present a new class of approximate message passing (AMP) algorithms and give a state evolution recursion which precisely characterizes their performance in the large system limit. In contrast with the existing multi-layer VAMP (ML-VAMP) approach, our proposed AMP – dubbed multilayer rotationally invariant generalized AMP (ML-RI-GAMP) – provides a natural generalization beyond Gaussian designs, in the sense that it recovers the existing Gaussian AMP as a special case. Furthermore, ML-RI-GAMP exhibits a significantly lower complexity than ML-VAMP, as the computationally intensive singular value decomposition is replaced by an estimation of the moments of the design matrices. Finally, our numerical results show that this complexity gain comes at little to no cost in the performance of the algorithm. acknowledgement: Marco Mondelli was partially supported by the 2019 Lopez-Loreta prize. article_processing_charge: No author: - first_name: Yizhou full_name: Xu, Yizhou last_name: Xu - first_name: Tian Qi full_name: Hou, Tian Qi last_name: Hou - first_name: Shan Suo full_name: Liang, Shan Suo last_name: Liang - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: 'Xu Y, Hou TQ, Liang SS, Mondelli M. Approximate message passing for multi-layer estimation in rotationally invariant models. In: 2023 IEEE Information Theory Workshop. Institute of Electrical and Electronics Engineers; 2023:294-298. doi:10.1109/ITW55543.2023.10160238' apa: 'Xu, Y., Hou, T. Q., Liang, S. S., & Mondelli, M. (2023). Approximate message passing for multi-layer estimation in rotationally invariant models. In 2023 IEEE Information Theory Workshop (pp. 294–298). Saint-Malo, France: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ITW55543.2023.10160238' chicago: Xu, Yizhou, Tian Qi Hou, Shan Suo Liang, and Marco Mondelli.
“Approximate Message Passing for Multi-Layer Estimation in Rotationally Invariant Models.” In 2023 IEEE Information Theory Workshop, 294–98. Institute of Electrical and Electronics Engineers, 2023. https://doi.org/10.1109/ITW55543.2023.10160238. ieee: Y. Xu, T. Q. Hou, S. S. Liang, and M. Mondelli, “Approximate message passing for multi-layer estimation in rotationally invariant models,” in 2023 IEEE Information Theory Workshop, Saint-Malo, France, 2023, pp. 294–298. ista: 'Xu Y, Hou TQ, Liang SS, Mondelli M. 2023. Approximate message passing for multi-layer estimation in rotationally invariant models. 2023 IEEE Information Theory Workshop. ITW: Information Theory Workshop, 294–298.' mla: Xu, Yizhou, et al. “Approximate Message Passing for Multi-Layer Estimation in Rotationally Invariant Models.” 2023 IEEE Information Theory Workshop, Institute of Electrical and Electronics Engineers, 2023, pp. 294–98, doi:10.1109/ITW55543.2023.10160238. short: Y. Xu, T.Q. Hou, S.S. Liang, M. Mondelli, in:, 2023 IEEE Information Theory Workshop, Institute of Electrical and Electronics Engineers, 2023, pp. 294–298. conference: end_date: 2023-04-28 location: Saint-Malo, France name: 'ITW: Information Theory Workshop' start_date: 2023-04-23 date_created: 2023-07-30T22:01:04Z date_published: 2023-05-01T00:00:00Z date_updated: 2023-12-13T11:35:46Z day: '01' department: - _id: MaMo doi: 10.1109/ITW55543.2023.10160238 external_id: arxiv: - '2212.01572' isi: - '001031733100053' isi: 1 language: - iso: eng main_file_link: - open_access: '1' url: https://doi.org/10.48550/arXiv.2212.01572 month: '05' oa: 1 oa_version: Preprint page: 294-298 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: 2023 IEEE Information Theory Workshop publication_identifier: eissn: - 2475-4218 isbn: - '9798350301496' publication_status: published publisher: Institute of Electrical and Electronics Engineers quality_controlled: '1' scopus_import: '1' status: public title: Approximate message passing for multi-layer estimation in rotationally invariant models type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2023' ... --- _id: '12859' abstract: - lang: eng text: 'Machine learning models are vulnerable to adversarial perturbations, and a thought-provoking paper by Bubeck and Sellke has analyzed this phenomenon through the lens of over-parameterization: smoothly interpolating the data requires significantly more parameters than simply memorizing it. However, this "universal" law provides only a necessary condition for robustness, and it is unable to discriminate between models. In this paper, we address these gaps by focusing on empirical risk minimization in two prototypical settings, namely, random features and the neural tangent kernel (NTK). We prove that, for random features, the model is not robust for any degree of over-parameterization, even when the necessary condition coming from the universal law of robustness is satisfied. In contrast, for even activations, the NTK model meets the universal lower bound, and it is robust as soon as the necessary condition on over-parameterization is fulfilled. This also addresses a conjecture in prior work by Bubeck, Li and Nagaraj. Our analysis decouples the effect of the kernel of the model from an "interaction matrix", which describes the interaction with the test data and captures the effect of the activation.
Our theoretical results are corroborated by numerical evidence on both synthetic and standard datasets (MNIST, CIFAR-10).' acknowledgement: "Simone Bombari and Marco Mondelli were partially supported by the 2019 Lopez-Loreta prize, and\r\nthe authors would like to thank Hamed Hassani for helpful discussions.\r\n" alternative_title: - PMLR article_processing_charge: No author: - first_name: Simone full_name: Bombari, Simone id: ca726dda-de17-11ea-bc14-f9da834f63aa last_name: Bombari - first_name: Shayan full_name: Kiyani, Shayan id: f5a2b424-e339-11ed-8435-ff3b4fe70cf8 last_name: Kiyani - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: 'Bombari S, Kiyani S, Mondelli M. Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels. In: Proceedings of the 40th International Conference on Machine Learning. Vol 202. ML Research Press; 2023:2738-2776.' apa: 'Bombari, S., Kiyani, S., & Mondelli, M. (2023). Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels. In Proceedings of the 40th International Conference on Machine Learning (Vol. 202, pp. 2738–2776). Honolulu, HI, United States: ML Research Press.' chicago: 'Bombari, Simone, Shayan Kiyani, and Marco Mondelli. “Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels.” In Proceedings of the 40th International Conference on Machine Learning, 202:2738–76. ML Research Press, 2023.' ieee: 'S. Bombari, S. Kiyani, and M. Mondelli, “Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels,” in Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, United States, 2023, vol. 202, pp. 2738–2776.' ista: 'Bombari S, Kiyani S, Mondelli M. 2023. Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels. Proceedings of the 40th International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 202, 2738–2776.' mla: 'Bombari, Simone, et al. “Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels.” Proceedings of the 40th International Conference on Machine Learning, vol. 202, ML Research Press, 2023, pp. 2738–76.' short: S. Bombari, S. Kiyani, M. Mondelli, in:, Proceedings of the 40th International Conference on Machine Learning, ML Research Press, 2023, pp. 2738–2776. 
conference: end_date: 2023-07-29 location: Honolulu, HI, United States name: 'ICML: International Conference on Machine Learning' start_date: 2023-07-23 date_created: 2023-04-23T16:11:03Z date_published: 2023-10-27T00:00:00Z date_updated: 2024-02-06T07:42:00Z day: '27' department: - _id: GradSch - _id: MaMo external_id: arxiv: - '2302.01629' intvolume: ' 202' language: - iso: eng main_file_link: - open_access: '1' url: https://arxiv.org/abs/2302.01629 month: '10' oa: 1 oa_version: Preprint page: 2738-2776 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: Proceedings of the 40th International Conference on Machine Learning publication_status: published publisher: ML Research Press quality_controlled: '1' related_material: link: - relation: software url: https://github.com/simone-bombari/beyond-universal-robustness status: public title: 'Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels' type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 202 year: '2023' ... --- _id: '14921' abstract: - lang: eng text: Neural collapse (NC) refers to the surprising structure of the last layer of deep neural networks in the terminal phase of gradient descent training. Recently, an increasing amount of experimental evidence has pointed to the propagation of NC to earlier layers of neural networks. However, while the NC in the last layer is well studied theoretically, much less is known about its multi-layered counterpart - deep neural collapse (DNC). In particular, existing work focuses either on linear layers or only on the last two layers at the price of an extra assumption. Our paper fills this gap by generalizing the established analytical framework for NC - the unconstrained features model - to multiple non-linear layers. Our key technical contribution is to show that, in a deep unconstrained features model, the unique global optimum for binary classification exhibits all the properties typical of DNC. This explains the existing experimental evidence of DNC. We also empirically show that (i) by optimizing deep unconstrained features models via gradient descent, the resulting solution agrees well with our theory, and (ii) trained networks recover the unconstrained features suitable for the occurrence of DNC, thus supporting the validity of this modeling principle. acknowledgement: M. M. is partially supported by the 2019 Lopez-Loreta Prize. The authors would like to thank Eugenia Iofinova, Bernd Prach and Simone Bombari for valuable feedback on the manuscript. alternative_title: - NeurIPS article_processing_charge: No author: - first_name: Peter full_name: Súkeník, Peter id: d64d6a8d-eb8e-11eb-b029-96fd216dec3c last_name: Súkeník - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: Christoph full_name: Lampert, Christoph id: 40C20FD2-F248-11E8-B48F-1D18A9856A87 last_name: Lampert orcid: 0000-0001-8622-7887 citation: ama: 'Súkeník P, Mondelli M, Lampert C. Deep neural collapse is provably optimal for the deep unconstrained features model. In: 37th Annual Conference on Neural Information Processing Systems.' apa: Súkeník, P., Mondelli, M., & Lampert, C. (n.d.). Deep neural collapse is provably optimal for the deep unconstrained features model. In 37th Annual Conference on Neural Information Processing Systems. New Orleans, LA, United States. 
chicago: Súkeník, Peter, Marco Mondelli, and Christoph Lampert. “Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model.” In 37th Annual Conference on Neural Information Processing Systems, n.d. ieee: P. Súkeník, M. Mondelli, and C. Lampert, “Deep neural collapse is provably optimal for the deep unconstrained features model,” in 37th Annual Conference on Neural Information Processing Systems, New Orleans, LA, United States. ista: 'Súkeník P, Mondelli M, Lampert C. Deep neural collapse is provably optimal for the deep unconstrained features model. 37th Annual Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems, NeurIPS, .' mla: Súkeník, Peter, et al. “Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model.” 37th Annual Conference on Neural Information Processing Systems. short: P. Súkeník, M. Mondelli, C. Lampert, in:, 37th Annual Conference on Neural Information Processing Systems, n.d. conference: end_date: 2023-12-16 location: New Orleans, LA, United States name: 'NeurIPS: Neural Information Processing Systems' start_date: 2023-12-10 date_created: 2024-02-02T11:17:41Z date_published: 2023-12-15T00:00:00Z date_updated: 2024-02-06T07:53:26Z day: '15' department: - _id: MaMo - _id: ChLa external_id: arxiv: - '2305.13165' language: - iso: eng main_file_link: - open_access: '1' url: ' https://doi.org/10.48550/arXiv.2305.13165' month: '12' oa: 1 oa_version: Preprint project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: 37th Annual Conference on Neural Information Processing Systems publication_status: inpress quality_controlled: '1' status: public title: Deep neural collapse is provably optimal for the deep unconstrained features model type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2023' ... --- _id: '14924' abstract: - lang: eng text: "The stochastic heavy ball method (SHB), also known as stochastic gradient descent (SGD) with Polyak's momentum, is widely used in training neural networks. However, despite the remarkable success of such an algorithm in practice, its theoretical characterization remains limited. In this paper, we focus on neural networks with two and three layers and provide a rigorous understanding of the properties of the solutions found by SHB: (i) stability after dropping out part of the neurons, (ii) connectivity along a low-loss path, and (iii) convergence to the global optimum.\r\nTo achieve this goal, we take a mean-field view and relate the SHB dynamics to a certain partial differential equation in the limit of large network widths. This mean-field perspective has inspired a recent line of work focusing on SGD while, in contrast, our paper considers an algorithm with momentum. More specifically, after proving existence and uniqueness of the limit differential equations, we show convergence to the global optimum and give a quantitative bound between the mean-field limit and the SHB dynamics of a finite-width network. Armed with this last bound, we are able to establish the dropout-stability and connectivity of SHB solutions." acknowledgement: D. Wu and M. Mondelli are partially supported by the 2019 Lopez-Loreta Prize. V. Kungurtsev was supported by the OP VVV project CZ.02.1.01/0.0/0.0/16_019/0000765 "Research Center for Informatics".
alternative_title: - TMLR article_processing_charge: No author: - first_name: Diyuan full_name: Wu, Diyuan id: 1a5914c2-896a-11ed-bdf8-fb80621a0635 last_name: Wu - first_name: Vyacheslav full_name: Kungurtsev, Vyacheslav last_name: Kungurtsev - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: 'Wu D, Kungurtsev V, Mondelli M. Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. In: Transactions on Machine Learning Research. ML Research Press; 2023.' apa: 'Wu, D., Kungurtsev, V., & Mondelli, M. (2023). Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. In Transactions on Machine Learning Research. ML Research Press.' chicago: 'Wu, Diyuan, Vyacheslav Kungurtsev, and Marco Mondelli. “Mean-Field Analysis for Heavy Ball Methods: Dropout-Stability, Connectivity, and Global Convergence.” In Transactions on Machine Learning Research. ML Research Press, 2023.' ieee: 'D. Wu, V. Kungurtsev, and M. Mondelli, “Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence,” in Transactions on Machine Learning Research, 2023.' ista: 'Wu D, Kungurtsev V, Mondelli M. 2023. Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. Transactions on Machine Learning Research. , TMLR, .' mla: 'Wu, Diyuan, et al. “Mean-Field Analysis for Heavy Ball Methods: Dropout-Stability, Connectivity, and Global Convergence.” Transactions on Machine Learning Research, ML Research Press, 2023.' short: D. Wu, V. Kungurtsev, M. Mondelli, in:, Transactions on Machine Learning Research, ML Research Press, 2023. date_created: 2024-02-02T11:21:56Z date_published: 2023-02-28T00:00:00Z date_updated: 2024-02-06T08:53:15Z day: '28' department: - _id: MaMo external_id: arxiv: - '2210.06819' has_accepted_license: '1' language: - iso: eng main_file_link: - open_access: '1' url: https://doi.org/10.48550/arXiv.2210.06819 month: '02' oa: 1 oa_version: Published Version project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: Transactions on Machine Learning Research publication_status: published publisher: ML Research Press quality_controlled: '1' status: public title: 'Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence' tmp: image: /images/cc_by.png legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0) short: CC BY (4.0) type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2023' ... --- _id: '14923' abstract: - lang: eng text: We study the performance of a Bayesian statistician who estimates a rank-one signal corrupted by non-symmetric rotationally invariant noise with a generic distribution of singular values. As the signal-to-noise ratio and the noise structure are unknown, a Gaussian setup is incorrectly assumed. We derive the exact analytic expression for the error of the mismatched Bayes estimator and also provide the analysis of an approximate message passing (AMP) algorithm. The first result exploits the asymptotic behavior of spherical integrals for rectangular matrices and of low-rank matrix perturbations; the second one relies on the design and analysis of an auxiliary AMP. 
The numerical experiments show that there is a performance gap between the AMP and Bayes estimators, which is due to the incorrect estimation of the signal norm. article_processing_charge: No author: - first_name: Teng full_name: Fu, Teng last_name: Fu - first_name: YuHao full_name: Liu, YuHao last_name: Liu - first_name: Jean full_name: Barbier, Jean last_name: Barbier - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: ShanSuo full_name: Liang, ShanSuo last_name: Liang - first_name: TianQi full_name: Hou, TianQi last_name: Hou citation: ama: 'Fu T, Liu Y, Barbier J, Mondelli M, Liang S, Hou T. Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise. In: Proceedings of 2023 IEEE International Symposium on Information Theory. IEEE. doi:10.1109/isit54713.2023.10206671' apa: 'Fu, T., Liu, Y., Barbier, J., Mondelli, M., Liang, S., & Hou, T. (n.d.). Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise. In Proceedings of 2023 IEEE International Symposium on Information Theory. Taipei, Taiwan: IEEE. https://doi.org/10.1109/isit54713.2023.10206671' chicago: Fu, Teng, YuHao Liu, Jean Barbier, Marco Mondelli, ShanSuo Liang, and TianQi Hou. “Mismatched Estimation of Non-Symmetric Rank-One Matrices Corrupted by Structured Noise.” In Proceedings of 2023 IEEE International Symposium on Information Theory. IEEE, n.d. https://doi.org/10.1109/isit54713.2023.10206671. ieee: T. Fu, Y. Liu, J. Barbier, M. Mondelli, S. Liang, and T. Hou, “Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise,” in Proceedings of 2023 IEEE International Symposium on Information Theory, Taipei, Taiwan. ista: 'Fu T, Liu Y, Barbier J, Mondelli M, Liang S, Hou T. Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise. Proceedings of 2023 IEEE International Symposium on Information Theory. ISIT: IEEE International Symposium on Information Theory.' mla: Fu, Teng, et al. “Mismatched Estimation of Non-Symmetric Rank-One Matrices Corrupted by Structured Noise.” Proceedings of 2023 IEEE International Symposium on Information Theory, IEEE, doi:10.1109/isit54713.2023.10206671. short: T. Fu, Y. Liu, J. Barbier, M. Mondelli, S. Liang, T. Hou, in:, Proceedings of 2023 IEEE International Symposium on Information Theory, IEEE, n.d. conference: end_date: 2023-06-30 location: Taipei, Taiwan name: 'ISIT: IEEE International Symposium on Information Theory' start_date: 2023-06-25 date_created: 2024-02-02T11:20:39Z date_published: 2023-06-30T00:00:00Z date_updated: 2024-02-14T14:34:03Z day: '30' department: - _id: MaMo doi: 10.1109/isit54713.2023.10206671 external_id: arxiv: - '2302.03306' language: - iso: eng main_file_link: - open_access: '1' url: https://doi.org/10.48550/arXiv.2302.03306 month: '06' oa: 1 oa_version: Preprint publication: Proceedings of 2023 IEEE International Symposium on Information Theory publication_status: inpress publisher: IEEE quality_controlled: '1' status: public title: Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2023' ... --- _id: '14922' abstract: - lang: eng text: 'We propose a novel approach to concentration for non-independent random variables. 
The main idea is to “pretend” that the random variables are independent and pay a multiplicative price measuring how far they are from actually being independent. This price is encapsulated in the Hellinger integral between the joint and the product of the marginals, which is then upper bounded leveraging tensorisation properties. Our bounds represent a natural generalisation of concentration inequalities in the presence of dependence: we recover exactly the classical bounds (McDiarmid’s inequality) when the random variables are independent. Furthermore, in a “large deviations” regime, we obtain the same decay in the probability as for the independent case, even when the random variables display non-trivial dependencies. To show this, we consider a number of applications of interest. First, we provide a bound for Markov chains with finite state space. Then, we consider the Simple Symmetric Random Walk, which is a non-contracting Markov chain, and a non-Markovian setting in which the stochastic process depends on its entire past. To conclude, we propose an application to Markov Chain Monte Carlo methods, where our approach leads to an improved lower bound on the minimum burn-in period required to reach a certain accuracy. In all of these settings, we provide a regime of parameters in which our bound fares better than what the state of the art can provide.' acknowledgement: The authors are partially supported by the 2019 Lopez-Loreta Prize. They would also like to thank Professor Jan Maas for providing valuable suggestions and comments on an early version of the work. article_processing_charge: No author: - first_name: Amedeo Roberto full_name: Esposito, Amedeo Roberto id: 9583e921-e1ad-11ec-9862-cef099626dc9 last_name: Esposito - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: 'Esposito AR, Mondelli M. Concentration without independence via information measures. In: Proceedings of 2023 IEEE International Symposium on Information Theory. IEEE; 2023:400-405. doi:10.1109/isit54713.2023.10206899' apa: 'Esposito, A. R., & Mondelli, M. (2023). Concentration without independence via information measures. In Proceedings of 2023 IEEE International Symposium on Information Theory (pp. 400–405). Taipei, Taiwan: IEEE. https://doi.org/10.1109/isit54713.2023.10206899' chicago: Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence via Information Measures.” In Proceedings of 2023 IEEE International Symposium on Information Theory, 400–405. IEEE, 2023. https://doi.org/10.1109/isit54713.2023.10206899. ieee: A. R. Esposito and M. Mondelli, “Concentration without independence via information measures,” in Proceedings of 2023 IEEE International Symposium on Information Theory, Taipei, Taiwan, 2023, pp. 400–405. ista: 'Esposito AR, Mondelli M. 2023. Concentration without independence via information measures. Proceedings of 2023 IEEE International Symposium on Information Theory. ISIT: IEEE International Symposium on Information Theory, 400–405.' mla: Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence via Information Measures.” Proceedings of 2023 IEEE International Symposium on Information Theory, IEEE, 2023, pp. 400–05, doi:10.1109/isit54713.2023.10206899. short: A.R. Esposito, M. Mondelli, in:, Proceedings of 2023 IEEE International Symposium on Information Theory, IEEE, 2023, pp. 400–405.
conference: end_date: 2023-06-30 location: Taipei, Taiwan name: 'ISIT: IEEE International Symposium on Information Theory' start_date: 2023-06-25 date_created: 2024-02-02T11:18:40Z date_published: 2023-06-30T00:00:00Z date_updated: 2024-03-25T07:15:51Z day: '30' department: - _id: MaMo doi: 10.1109/isit54713.2023.10206899 external_id: arxiv: - '2303.07245' language: - iso: eng main_file_link: - open_access: '1' url: https://doi.org/10.48550/arXiv.2303.07245 month: '06' oa: 1 oa_version: Preprint page: 400-405 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: Proceedings of 2023 IEEE International Symposium on Information Theory publication_identifier: eisbn: - '9781665475549' eissn: - 2157-8117 publication_status: published publisher: IEEE quality_controlled: '1' related_material: record: - id: '15172' relation: later_version status: public status: public title: Concentration without independence via information measures type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2023' ... --- _id: '11420' abstract: - lang: eng text: 'Understanding the properties of neural networks trained via stochastic gradient descent (SGD) is at the heart of the theory of deep learning. In this work, we take a mean-field view, and consider a two-layer ReLU network trained via noisy-SGD for a univariate regularized regression problem. Our main result is that SGD with vanishingly small noise injected in the gradients is biased towards a simple solution: at convergence, the ReLU network implements a piecewise linear map of the inputs, and the number of “knot” points -- i.e., points where the tangent of the ReLU network estimator changes -- between two consecutive training inputs is at most three. In particular, as the number of neurons of the network grows, the SGD dynamics is captured by the solution of a gradient flow and, at convergence, the distribution of the weights approaches the unique minimizer of a related free energy, which has a Gibbs form. Our key technical contribution consists in the analysis of the estimator resulting from this minimizer: we show that its second derivative vanishes everywhere, except at some specific locations which represent the “knot” points. We also provide empirical evidence that knots at locations distinct from the data points might occur, as predicted by our theory.' acknowledgement: "We would like to thank Mert Pilanci for several exploratory discussions in the early stage\r\nof the project, Jan Maas for clarifications about Jordan et al. (1998), and Max Zimmer for\r\nsuggestive numerical experiments. A. Shevchenko and M. Mondelli are partially supported\r\nby the 2019 Lopez-Loreta Prize. V. Kungurtsev acknowledges support to the OP VVV\r\nproject CZ.02.1.01/0.0/0.0/16 019/0000765 Research Center for Informatics.\r\n" article_processing_charge: No article_type: original author: - first_name: Aleksandr full_name: Shevchenko, Aleksandr id: F2B06EC2-C99E-11E9-89F0-752EE6697425 last_name: Shevchenko - first_name: Vyacheslav full_name: Kungurtsev, Vyacheslav last_name: Kungurtsev - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: Shevchenko A, Kungurtsev V, Mondelli M. Mean-field analysis of piecewise linear solutions for wide ReLU networks. Journal of Machine Learning Research. 2022;23(130):1-55. apa: Shevchenko, A., Kungurtsev, V., & Mondelli, M. (2022). 
Mean-field analysis of piecewise linear solutions for wide ReLU networks. Journal of Machine Learning Research. Journal of Machine Learning Research. chicago: Shevchenko, Aleksandr, Vyacheslav Kungurtsev, and Marco Mondelli. “Mean-Field Analysis of Piecewise Linear Solutions for Wide ReLU Networks.” Journal of Machine Learning Research. Journal of Machine Learning Research, 2022. ieee: A. Shevchenko, V. Kungurtsev, and M. Mondelli, “Mean-field analysis of piecewise linear solutions for wide ReLU networks,” Journal of Machine Learning Research, vol. 23, no. 130. Journal of Machine Learning Research, pp. 1–55, 2022. ista: Shevchenko A, Kungurtsev V, Mondelli M. 2022. Mean-field analysis of piecewise linear solutions for wide ReLU networks. Journal of Machine Learning Research. 23(130), 1–55. mla: Shevchenko, Aleksandr, et al. “Mean-Field Analysis of Piecewise Linear Solutions for Wide ReLU Networks.” Journal of Machine Learning Research, vol. 23, no. 130, Journal of Machine Learning Research, 2022, pp. 1–55. short: A. Shevchenko, V. Kungurtsev, M. Mondelli, Journal of Machine Learning Research 23 (2022) 1–55. date_created: 2022-05-29T22:01:54Z date_published: 2022-04-01T00:00:00Z date_updated: 2022-05-30T08:34:14Z day: '01' ddc: - '000' department: - _id: MaMo - _id: DaAl external_id: arxiv: - '2111.02278' file: - access_level: open_access checksum: d4ff5d1affb34848b5c5e4002483fc62 content_type: application/pdf creator: cchlebak date_created: 2022-05-30T08:22:55Z date_updated: 2022-05-30T08:22:55Z file_id: '11422' file_name: 21-1365.pdf file_size: 1521701 relation: main_file success: 1 file_date_updated: 2022-05-30T08:22:55Z has_accepted_license: '1' intvolume: ' 23' issue: '130' language: - iso: eng month: '04' oa: 1 oa_version: Published Version page: 1-55 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: Journal of Machine Learning Research publication_identifier: eissn: - 1533-7928 issn: - 1532-4435 publication_status: published publisher: Journal of Machine Learning Research quality_controlled: '1' related_material: link: - relation: other url: https://www.jmlr.org/papers/v23/21-1365.html scopus_import: '1' status: public title: Mean-field analysis of piecewise linear solutions for wide ReLU networks tmp: image: /images/cc_by.png legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0) short: CC BY (4.0) type: journal_article user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9 volume: 23 year: '2022' ... --- _id: '12016' abstract: - lang: eng text: We consider the problem of coded distributed computing using polar codes. The average execution time of a coded computing system is related to the error probability for transmission over the binary erasure channel in recent work by Soleymani, Jamali and Mahdavifar, where the performance of binary linear codes is investigated. In this paper, we focus on polar codes and unveil a connection between the average execution time and the scaling exponent μ of the family of codes. In the finite-length characterization of polar codes, the scaling exponent is a key object capturing the speed of convergence to capacity. In particular, we show that (i) the gap between the normalized average execution time of polar codes and that of optimal MDS codes is O(n^{-1/μ}), and (ii) this upper bound can be improved to roughly O(n^{-1/2}) by considering polar codes with large kernels.
We conjecture that these bounds could be improved to O(n^{-2/μ}) and O(n^{-1}), respectively, and provide a heuristic argument as well as numerical evidence supporting this view. acknowledgement: D. Fathollahi and M. Mondelli were partially supported by the 2019 Lopez-Loreta Prize. The authors thank Hamed Hassani and Hessam Mahdavifar for helpful discussions. article_processing_charge: No author: - first_name: Dorsa full_name: Fathollahi, Dorsa last_name: Fathollahi - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: 'Fathollahi D, Mondelli M. Polar coded computing: The role of the scaling exponent. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. IEEE; 2022:2154-2159. doi:10.1109/ISIT50566.2022.9834712' apa: 'Fathollahi, D., & Mondelli, M. (2022). Polar coded computing: The role of the scaling exponent. In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 2154–2159). Espoo, Finland: IEEE. https://doi.org/10.1109/ISIT50566.2022.9834712' chicago: 'Fathollahi, Dorsa, and Marco Mondelli. “Polar Coded Computing: The Role of the Scaling Exponent.” In 2022 IEEE International Symposium on Information Theory, 2022:2154–59. IEEE, 2022. https://doi.org/10.1109/ISIT50566.2022.9834712.' ieee: 'D. Fathollahi and M. Mondelli, “Polar coded computing: The role of the scaling exponent,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 2154–2159.' ista: 'Fathollahi D, Mondelli M. 2022. Polar coded computing: The role of the scaling exponent. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 2154–2159.' mla: 'Fathollahi, Dorsa, and Marco Mondelli. “Polar Coded Computing: The Role of the Scaling Exponent.” 2022 IEEE International Symposium on Information Theory, vol. 2022, IEEE, 2022, pp. 2154–59, doi:10.1109/ISIT50566.2022.9834712.' short: D. Fathollahi, M. Mondelli, in:, 2022 IEEE International Symposium on Information Theory, IEEE, 2022, pp. 2154–2159. conference: end_date: 2022-07-01 location: Espoo, Finland name: 'ISIT: International Symposium on Information Theory' start_date: 2022-06-26 date_created: 2022-09-04T22:02:05Z date_published: 2022-08-03T00:00:00Z date_updated: 2022-09-05T10:35:40Z day: '03' department: - _id: MaMo doi: 10.1109/ISIT50566.2022.9834712 external_id: arxiv: - '2201.10082' intvolume: ' 2022' language: - iso: eng main_file_link: - open_access: '1' url: https://doi.org/10.48550/arXiv.2201.10082 month: '08' oa: 1 oa_version: Preprint page: 2154-2159 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: 2022 IEEE International Symposium on Information Theory publication_identifier: isbn: - '9781665421591' issn: - 2157-8095 publication_status: published publisher: IEEE quality_controlled: '1' scopus_import: '1' status: public title: 'Polar coded computing: The role of the scaling exponent' type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 2022 year: '2022' ... --- _id: '12540' abstract: - lang: eng text: We consider the problem of signal estimation in generalized linear models defined via rotationally invariant design matrices. Since these matrices can have an arbitrary spectral distribution, this model is well suited for capturing complex correlation structures which often arise in applications.
We propose a novel family of approximate message passing (AMP) algorithms for signal estimation, and rigorously characterize their performance in the high-dimensional limit via a state evolution recursion. Our rotationally invariant AMP has complexity of the same order as the existing AMP derived under the restrictive assumption of a Gaussian design; our algorithm also recovers this existing AMP as a special case. Numerical results showcase a performance close to Vector AMP (which is conjectured to be Bayes-optimal in some settings), but obtained with a much lower complexity, as the proposed algorithm does not require a computationally expensive singular value decomposition. acknowledgement: The authors would like to thank the anonymous reviewers for their helpful comments. KK and MM were partially supported by the 2019 Lopez-Loreta Prize. article_number: '22' article_processing_charge: No author: - first_name: Ramji full_name: Venkataramanan, Ramji last_name: Venkataramanan - first_name: Kevin full_name: Kögler, Kevin id: 94ec913c-dc85-11ea-9058-e5051ab2428b last_name: Kögler - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: 'Venkataramanan R, Kögler K, Mondelli M. Estimation in rotationally invariant generalized linear models via approximate message passing. In: Proceedings of the 39th International Conference on Machine Learning. Vol 162. ML Research Press; 2022.' apa: 'Venkataramanan, R., Kögler, K., & Mondelli, M. (2022). Estimation in rotationally invariant generalized linear models via approximate message passing. In Proceedings of the 39th International Conference on Machine Learning (Vol. 162). Baltimore, MD, United States: ML Research Press.' chicago: Venkataramanan, Ramji, Kevin Kögler, and Marco Mondelli. “Estimation in Rotationally Invariant Generalized Linear Models via Approximate Message Passing.” In Proceedings of the 39th International Conference on Machine Learning, Vol. 162. ML Research Press, 2022. ieee: R. Venkataramanan, K. Kögler, and M. Mondelli, “Estimation in rotationally invariant generalized linear models via approximate message passing,” in Proceedings of the 39th International Conference on Machine Learning, Baltimore, MD, United States, 2022, vol. 162. ista: 'Venkataramanan R, Kögler K, Mondelli M. 2022. Estimation in rotationally invariant generalized linear models via approximate message passing. Proceedings of the 39th International Conference on Machine Learning. ICML: International Conference on Machine Learning vol. 162, 22.' mla: Venkataramanan, Ramji, et al. “Estimation in Rotationally Invariant Generalized Linear Models via Approximate Message Passing.” Proceedings of the 39th International Conference on Machine Learning, vol. 162, 22, ML Research Press, 2022. short: R. Venkataramanan, K. Kögler, M. Mondelli, in:, Proceedings of the 39th International Conference on Machine Learning, ML Research Press, 2022. 
conference: end_date: 2022-07-23 location: Baltimore, MD, United States name: 'ICML: International Conference on Machine Learning' start_date: 2022-07-17 date_created: 2023-02-10T13:49:04Z date_published: 2022-01-01T00:00:00Z date_updated: 2023-02-13T10:54:58Z ddc: - '000' department: - _id: MaMo file: - access_level: open_access checksum: 67436eb0a660789514cdf9db79e84683 content_type: application/pdf creator: dernst date_created: 2023-02-13T10:53:11Z date_updated: 2023-02-13T10:53:11Z file_id: '12547' file_name: 2022_PMLR_Venkataramanan.pdf file_size: 2341343 relation: main_file success: 1 file_date_updated: 2023-02-13T10:53:11Z has_accepted_license: '1' intvolume: ' 162' language: - iso: eng oa: 1 oa_version: Published Version project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: Proceedings of the 39th International Conference on Machine Learning publication_status: published publisher: ML Research Press quality_controlled: '1' status: public title: Estimation in rotationally invariant generalized linear models via approximate message passing type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 162 year: '2022' ... --- _id: '12536' abstract: - lang: eng text: 'We consider the problem of estimating a rank-1 signal corrupted by structured rotationally invariant noise, and address the following question: how well do inference algorithms perform when the noise statistics is unknown and hence Gaussian noise is assumed? While the matched Bayes-optimal setting with unstructured noise is well understood, the analysis of this mismatched problem is still in its infancy. In this paper, we take a step towards understanding the effect of the strong source of mismatch that is the noise statistics. Our main technical contribution is the rigorous analysis of a Bayes estimator and of an approximate message passing (AMP) algorithm, both of which incorrectly assume a Gaussian setup. The first result exploits the theory of spherical integrals and of low-rank matrix perturbations; the idea behind the second one is to design and analyze an artificial AMP which, by taking advantage of the flexibility in the denoisers, is able to "correct" the mismatch. Armed with these sharp asymptotic characterizations, we unveil a rich and often unexpected phenomenology. For example, even though AMP is in principle designed to efficiently compute the Bayes estimator, the former is outperformed by the latter in terms of mean-square error. We show that this performance gap is due to an incorrect estimation of the signal norm. In fact, when the SNR is large enough, the overlaps of the AMP and the Bayes estimator coincide, and they even match those of optimal estimators taking into account the structure of the noise.' article_number: '2205.10009' article_processing_charge: No author: - first_name: Jean full_name: Barbier, Jean last_name: Barbier - first_name: TianQi full_name: Hou, TianQi last_name: Hou - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: Manuel full_name: Saenz, Manuel last_name: Saenz citation: ama: 'Barbier J, Hou T, Mondelli M, Saenz M. The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation? arXiv. doi:10.48550/arXiv.2205.10009' apa: 'Barbier, J., Hou, T., Mondelli, M., & Saenz, M. (n.d.). The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation?
arXiv. https://doi.org/10.48550/arXiv.2205.10009' chicago: 'Barbier, Jean, TianQi Hou, Marco Mondelli, and Manuel Saenz. “The Price of Ignorance: How Much Does It Cost to Forget Noise Structure in Low-Rank Matrix Estimation?” ArXiv, n.d. https://doi.org/10.48550/arXiv.2205.10009.' ieee: 'J. Barbier, T. Hou, M. Mondelli, and M. Saenz, “The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation?,” arXiv.' ista: 'Barbier J, Hou T, Mondelli M, Saenz M. The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation? arXiv, 2205.10009.' mla: 'Barbier, Jean, et al. “The Price of Ignorance: How Much Does It Cost to Forget Noise Structure in Low-Rank Matrix Estimation?” ArXiv, 2205.10009, doi:10.48550/arXiv.2205.10009.' short: J. Barbier, T. Hou, M. Mondelli, M. Saenz, ArXiv (n.d.). date_created: 2023-02-10T13:45:41Z date_published: 2022-05-20T00:00:00Z date_updated: 2023-02-16T09:41:25Z day: '20' department: - _id: MaMo doi: 10.48550/arXiv.2205.10009 external_id: arxiv: - '2205.10009' language: - iso: eng main_file_link: - open_access: '1' url: https://doi.org/10.48550/arXiv.2205.10009 month: '05' oa: 1 oa_version: Preprint publication: arXiv publication_status: accepted status: public title: 'The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation?' type: preprint user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2022' ... --- _id: '12233' abstract: - lang: eng text: A novel recursive list decoding (RLD) algorithm for Reed-Muller (RM) codes based on successive permutations (SP) of the codeword is presented. A low-complexity SP scheme applied to a subset of the symmetry group of RM codes is first proposed to carefully select a good codeword permutation on the fly. Then, the proposed SP technique is integrated into an improved RLD algorithm that initializes different decoding paths with random codeword permutations, which are sampled from the full symmetry group of RM codes. Finally, efficient latency and complexity reduction schemes are introduced that virtually preserve the error-correction performance of the proposed decoder. Simulation results demonstrate that at the target frame error rate of 10^−3 for the RM code of length 256 with 163 information bits, the proposed decoder reduces the computational complexity by 6% and the decoding latency by 22% relative to the state-of-the-art semi-parallel simplified successive-cancellation decoder with fast Hadamard transform (SSC-FHT) that uses 96 permutations from the full symmetry group of RM codes, while essentially maintaining the error-correction performance and memory consumption of the semi-parallel permuted SSC-FHT decoder. article_processing_charge: No article_type: original author: - first_name: Nghia full_name: Doan, Nghia last_name: Doan - first_name: Seyyed Ali full_name: Hashemi, Seyyed Ali last_name: Hashemi - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: Warren J. full_name: Gross, Warren J. last_name: Gross citation: ama: Doan N, Hashemi SA, Mondelli M, Gross WJ. Decoding Reed-Muller codes with successive codeword permutations. IEEE Transactions on Communications. 2022;70(11):7134-7145. doi:10.1109/tcomm.2022.3211101 apa: Doan, N., Hashemi, S. A., Mondelli, M., & Gross, W. J. (2022). Decoding Reed-Muller codes with successive codeword permutations. IEEE Transactions on Communications.
Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/tcomm.2022.3211101 chicago: Doan, Nghia, Seyyed Ali Hashemi, Marco Mondelli, and Warren J. Gross. “Decoding Reed-Muller Codes with Successive Codeword Permutations.” IEEE Transactions on Communications. Institute of Electrical and Electronics Engineers, 2022. https://doi.org/10.1109/tcomm.2022.3211101. ieee: N. Doan, S. A. Hashemi, M. Mondelli, and W. J. Gross, “Decoding Reed-Muller codes with successive codeword permutations,” IEEE Transactions on Communications, vol. 70, no. 11. Institute of Electrical and Electronics Engineers, pp. 7134–7145, 2022. ista: Doan N, Hashemi SA, Mondelli M, Gross WJ. 2022. Decoding Reed-Muller codes with successive codeword permutations. IEEE Transactions on Communications. 70(11), 7134–7145. mla: Doan, Nghia, et al. “Decoding Reed-Muller Codes with Successive Codeword Permutations.” IEEE Transactions on Communications, vol. 70, no. 11, Institute of Electrical and Electronics Engineers, 2022, pp. 7134–45, doi:10.1109/tcomm.2022.3211101. short: N. Doan, S.A. Hashemi, M. Mondelli, W.J. Gross, IEEE Transactions on Communications 70 (2022) 7134–7145. date_created: 2023-01-16T09:50:38Z date_published: 2022-11-01T00:00:00Z date_updated: 2023-08-04T09:34:43Z day: '01' department: - _id: MaMo doi: 10.1109/tcomm.2022.3211101 external_id: arxiv: - '2109.02122' isi: - '000937284600006' intvolume: ' 70' isi: 1 issue: '11' language: - iso: eng main_file_link: - open_access: '1' url: ' https://doi.org/10.48550/arXiv.2109.02122' month: '11' oa: 1 oa_version: Preprint page: 7134-7145 publication: IEEE Transactions on Communications publication_identifier: eissn: - 1558-0857 issn: - 0090-6778 publication_status: published publisher: Institute of Electrical and Electronics Engineers quality_controlled: '1' scopus_import: '1' status: public title: Decoding Reed-Muller codes with successive codeword permutations type: journal_article user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8 volume: 70 year: '2022' ... --- _id: '10364' abstract: - lang: eng text: 'This paper characterizes the latency of the simplified successive-cancellation (SSC) decoding scheme for polar codes under hardware resource constraints. In particular, when the number of processing elements P that can perform SSC decoding operations in parallel is limited, as is the case in practice, the latency of SSC decoding is O(N^(1-1/μ) + (N/P) log2 log2 (N/P)), where N is the block length of the code and μ is the scaling exponent of the channel. Three direct consequences of this bound are presented. First, in a fully-parallel implementation where P = N/2, the latency of SSC decoding is O(N^(1-1/μ)), which is sublinear in the block length. This recovers a result from our earlier work. Second, in a fully-serial implementation where P = 1, the latency of SSC decoding scales as O(N log2 log2 N). The multiplicative constant is also calculated: we show that the latency of SSC decoding when P = 1 is given by (2 + o(1)) N log2 log2 N. Third, in a semi-parallel implementation, the smallest P that gives the same latency as that of the fully-parallel implementation is P = N^(1/μ). The tightness of our bound on SSC decoding latency and the applicability of the foregoing results are validated through extensive simulations.' acknowledgement: "S. A. Hashemi is supported by a Postdoctoral Fellowship from the Natural Sciences and\r\nEngineering Research Council of Canada (NSERC) and by Huawei. M. Mondelli is partially\r\nsupported by the 2019 Lopez-Loreta Prize. A. Fazeli and A.
Vardy were supported in part by\r\nthe National Science Foundation under Grant CCF-1764104." article_processing_charge: No article_type: original author: - first_name: Seyyed Ali full_name: Hashemi, Seyyed Ali last_name: Hashemi - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: Arman full_name: Fazeli, Arman last_name: Fazeli - first_name: Alexander full_name: Vardy, Alexander last_name: Vardy - first_name: John full_name: Cioffi, John last_name: Cioffi - first_name: Andrea full_name: Goldsmith, Andrea last_name: Goldsmith citation: ama: Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. Parallelism versus latency in simplified successive-cancellation decoding of polar codes. IEEE Transactions on Wireless Communications. 2022;21(6):3909-3920. doi:10.1109/TWC.2021.3125626 apa: Hashemi, S. A., Mondelli, M., Fazeli, A., Vardy, A., Cioffi, J., & Goldsmith, A. (2022). Parallelism versus latency in simplified successive-cancellation decoding of polar codes. IEEE Transactions on Wireless Communications. Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/TWC.2021.3125626 chicago: Hashemi, Seyyed Ali, Marco Mondelli, Arman Fazeli, Alexander Vardy, John Cioffi, and Andrea Goldsmith. “Parallelism versus Latency in Simplified Successive-Cancellation Decoding of Polar Codes.” IEEE Transactions on Wireless Communications. Institute of Electrical and Electronics Engineers, 2022. https://doi.org/10.1109/TWC.2021.3125626. ieee: S. A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, and A. Goldsmith, “Parallelism versus latency in simplified successive-cancellation decoding of polar codes,” IEEE Transactions on Wireless Communications, vol. 21, no. 6. Institute of Electrical and Electronics Engineers, pp. 3909–3920, 2022. ista: Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. 2022. Parallelism versus latency in simplified successive-cancellation decoding of polar codes. IEEE Transactions on Wireless Communications. 21(6), 3909–3920. mla: Hashemi, Seyyed Ali, et al. “Parallelism versus Latency in Simplified Successive-Cancellation Decoding of Polar Codes.” IEEE Transactions on Wireless Communications, vol. 21, no. 6, Institute of Electrical and Electronics Engineers, 2022, pp. 3909–20, doi:10.1109/TWC.2021.3125626. short: S.A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, A. Goldsmith, IEEE Transactions on Wireless Communications 21 (2022) 3909–3920. 
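The latency bound in the abstract above can be read off numerically. The sketch below evaluates the scaling N^(1-1/μ) + (N/P) log2 log2 (N/P) in the three regimes discussed in the record; the value μ = 3.63 (a scaling exponent often quoted for the binary erasure channel) and the unit constant factors are assumptions for illustration, not the paper's exact constants.

import math

def latency_scale(N, P, mu):
    # N^(1 - 1/mu) + (N/P) * log2(log2(N/P)), constant factors ignored
    r = N / P
    serial = r * math.log2(math.log2(r)) if r > 2 else 0.0
    return N ** (1.0 - 1.0 / mu) + serial

N, mu = 2 ** 20, 3.63
for label, P in [("fully serial", 1),
                 ("semi-parallel", round(N ** (1 / mu))),
                 ("fully parallel", N // 2)]:
    print(f"{label:14s} P = {P:7d}  scale = {latency_scale(N, P, mu):12.1f}")

Running this shows the qualitative picture from the abstract: the serial term dominates for small P and shrinks as P grows toward N^(1/μ), beyond which extra processing elements buy little.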
date_created: 2021-11-28T23:01:29Z date_published: 2022-06-01T00:00:00Z date_updated: 2023-08-14T06:55:57Z day: '01' department: - _id: MaMo doi: 10.1109/TWC.2021.3125626 external_id: arxiv: - '2012.13378' isi: - '000809406400028' intvolume: ' 21' isi: 1 issue: '6' language: - iso: eng main_file_link: - open_access: '1' url: https://arxiv.org/abs/2012.13378 month: '06' oa: 1 oa_version: Preprint page: 3909-3920 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: IEEE Transactions on Wireless Communications publication_identifier: eissn: - 1558-2248 issn: - 1536-1276 publication_status: published publisher: Institute of Electrical and Electronics Engineers quality_controlled: '1' related_material: record: - id: '10053' relation: earlier_version status: public scopus_import: '1' status: public title: Parallelism versus latency in simplified successive-cancellation decoding of polar codes type: journal_article user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8 volume: 21 year: '2022' ... --- _id: '12538' abstract: - lang: eng text: In this paper, we study the compression of a target two-layer neural network with N nodes into a compressed network with M<N nodes. author: - first_name: Mohammad Hossein full_name: Amani, Mohammad Hossein last_name: Amani - first_name: Simone full_name: Bombari, Simone id: ca726dda-de17-11ea-bc14-f9da834f63aa last_name: Bombari - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: Rattana full_name: Pukdee, Rattana last_name: Pukdee - first_name: Stefano full_name: Rini, Stefano last_name: Rini citation: ama: Amani MH, Bombari S, Mondelli M, Pukdee R, Rini S. Sharp asymptotics on the compression of two-layer neural networks. IEEE Information Theory Workshop. 2022:588-593. doi:10.1109/ITW54588.2022.9965870 apa: 'Amani, M. H., Bombari, S., Mondelli, M., Pukdee, R., & Rini, S. (2022). Sharp asymptotics on the compression of two-layer neural networks. IEEE Information Theory Workshop. Mumbai, India: IEEE. https://doi.org/10.1109/ITW54588.2022.9965870' chicago: Amani, Mohammad Hossein, Simone Bombari, Marco Mondelli, Rattana Pukdee, and Stefano Rini. “Sharp Asymptotics on the Compression of Two-Layer Neural Networks.” IEEE Information Theory Workshop. IEEE, 2022. https://doi.org/10.1109/ITW54588.2022.9965870. ieee: M. H. Amani, S. Bombari, M. Mondelli, R. Pukdee, and S. Rini, “Sharp asymptotics on the compression of two-layer neural networks,” IEEE Information Theory Workshop. IEEE, pp. 588–593, 2022. ista: Amani MH, Bombari S, Mondelli M, Pukdee R, Rini S. 2022. Sharp asymptotics on the compression of two-layer neural networks. IEEE Information Theory Workshop., 588–593. mla: Amani, Mohammad Hossein, et al. “Sharp Asymptotics on the Compression of Two-Layer Neural Networks.” IEEE Information Theory Workshop, IEEE, 2022, pp. 588–93, doi:10.1109/ITW54588.2022.9965870. short: M.H. Amani, S. Bombari, M. Mondelli, R. Pukdee, S. Rini, IEEE Information Theory Workshop (2022) 588–593. conference: end_date: 2022-11-09 location: Mumbai, India name: 'ITW: Information Theory Workshop' start_date: 2022-11-01 date_created: 2023-02-10T13:47:56Z date_published: 2022-11-16T00:00:00Z date_updated: 2023-12-18T11:31:47Z day: '16' department: - _id: MaMo doi: 10.1109/ITW54588.2022.9965870 external_id: arxiv: - '2205.08199' language: - iso: eng main_file_link: - open_access: '1' url: ' https://doi.org/10.48550/arXiv.2205.08199' month: '11' oa: 1 oa_version: Preprint page: 588-593 publication: IEEE Information Theory Workshop publication_identifier: isbn: - '9781665483414' publication_status: published publisher: IEEE quality_controlled: '1' scopus_import: '1' status: public title: Sharp asymptotics on the compression of two-layer neural networks type: journal_article user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2022' ... --- _id: '12537' abstract: - lang: eng text: 'The Neural Tangent Kernel (NTK) has emerged as a powerful tool to provide memorization, optimization and generalization guarantees in deep neural networks.
A line of work has studied the NTK spectrum for two-layer and deep networks with at least a layer with Ω(N) neurons, N being the number of training samples. Furthermore, there is increasing evidence suggesting that deep networks with sub-linear layer widths are powerful memorizers and optimizers, as long as the number of parameters exceeds the number of samples. Thus, a natural open question is whether the NTK is well conditioned in such a challenging sub-linear setup. In this paper, we answer this question in the affirmative. Our key technical contribution is a lower bound on the smallest NTK eigenvalue for deep networks with the minimum possible over-parameterization: the number of parameters is roughly Ω(N) and, hence, the number of neurons is as little as Ω(√N). To showcase the applicability of our NTK bounds, we provide two results concerning memorization capacity and optimization guarantees for gradient descent training.' acknowledgement: "The authors were partially supported by the 2019 Lopez-Loreta prize, and they would like to thank\r\nQuynh Nguyen, Mahdi Soltanolkotabi and Adel Javanmard for helpful discussions.\r\n" article_processing_charge: No author: - first_name: Simone full_name: Bombari, Simone id: ca726dda-de17-11ea-bc14-f9da834f63aa last_name: Bombari - first_name: Mohammad Hossein full_name: Amani, Mohammad Hossein last_name: Amani - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 citation: ama: 'Bombari S, Amani MH, Mondelli M. Memorization and optimization in deep neural networks with minimum over-parameterization. In: 36th Conference on Neural Information Processing Systems. Vol 35. Curran Associates; 2022:7628-7640.' apa: Bombari, S., Amani, M. H., & Mondelli, M. (2022). Memorization and optimization in deep neural networks with minimum over-parameterization. In 36th Conference on Neural Information Processing Systems (Vol. 35, pp. 7628–7640). Curran Associates. chicago: Bombari, Simone, Mohammad Hossein Amani, and Marco Mondelli. “Memorization and Optimization in Deep Neural Networks with Minimum Over-Parameterization.” In 36th Conference on Neural Information Processing Systems, 35:7628–40. Curran Associates, 2022. ieee: S. Bombari, M. H. Amani, and M. Mondelli, “Memorization and optimization in deep neural networks with minimum over-parameterization,” in 36th Conference on Neural Information Processing Systems, 2022, vol. 35, pp. 7628–7640. ista: Bombari S, Amani MH, Mondelli M. 2022. Memorization and optimization in deep neural networks with minimum over-parameterization. 36th Conference on Neural Information Processing Systems. vol. 35, 7628–7640. mla: Bombari, Simone, et al. “Memorization and Optimization in Deep Neural Networks with Minimum Over-Parameterization.” 36th Conference on Neural Information Processing Systems, vol. 35, Curran Associates, 2022, pp. 7628–40. short: S. Bombari, M.H. Amani, M. Mondelli, in:, 36th Conference on Neural Information Processing Systems, Curran Associates, 2022, pp. 7628–7640.
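A small experiment makes the central quantity of the record above concrete: the smallest eigenvalue of the empirical NTK Gram matrix. The sketch below computes it exactly for a two-layer ReLU network, where the Jacobian inner products have a closed form; the sizes n, d, m are arbitrary toy choices, and this shallow setting is far from the deep, minimally over-parameterized regime the paper analyzes.

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def ntk_smallest_eig(n=50, d=20, m=2000, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d)) / np.sqrt(d)   # n inputs of dimension d
    W = rng.standard_normal((m, d)) / np.sqrt(d)   # first-layer weights
    a = rng.choice([-1.0, 1.0], size=m)            # second-layer weights
    Z = X @ W.T                                    # (n, m) pre-activations
    S = relu(Z) / np.sqrt(m)                       # gradients w.r.t. a
    D = (Z > 0).astype(float) * a / np.sqrt(m)     # a_i * relu'(z_i), per input
    # K(x, x') = <grad_a f(x), grad_a f(x')> + <grad_W f(x), grad_W f(x')>
    K = S @ S.T + (D @ D.T) * (X @ X.T)
    return np.linalg.eigvalsh(K)[0]

print("lambda_min(empirical NTK) ~", ntk_smallest_eig())

A strictly positive output certifies that, for these samples, the NTK Gram matrix is well conditioned, which is the finite-sample analogue of the lower bounds discussed in the abstract.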
date_created: 2023-02-10T13:46:37Z date_published: 2022-07-24T00:00:00Z date_updated: 2023-12-18T11:39:09Z day: '24' department: - _id: MaMo external_id: arxiv: - '2205.10217' intvolume: ' 35' language: - iso: eng main_file_link: - open_access: '1' url: ' https://doi.org/10.48550/arXiv.2205.10217' month: '07' oa: 1 oa_version: Preprint page: 7628-7640 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: 36th Conference on Neural Information Processing Systems publication_identifier: isbn: - '9781713871088' publication_status: published publisher: Curran Associates quality_controlled: '1' status: public title: Memorization and optimization in deep neural networks with minimum over-parameterization type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 35 year: '2022' ... --- _id: '12480' abstract: - lang: eng text: 'We consider the problem of estimating a signal from measurements obtained via a generalized linear model. We focus on estimators based on approximate message passing (AMP), a family of iterative algorithms with many appealing features: the performance of AMP in the high-dimensional limit can be succinctly characterized under suitable model assumptions; AMP can also be tailored to the empirical distribution of the signal entries, and for a wide class of estimation problems, AMP is conjectured to be optimal among all polynomial-time algorithms. However, a major issue of AMP is that in many models (such as phase retrieval), it requires an initialization correlated with the ground-truth signal and independent from the measurement matrix. Assuming that such an initialization is available is typically not realistic. In this paper, we solve this problem by proposing an AMP algorithm initialized with a spectral estimator. With such an initialization, the standard AMP analysis fails since the spectral estimator depends in a complicated way on the design matrix. Our main contribution is a rigorous characterization of the performance of AMP with spectral initialization in the high-dimensional limit. The key technical idea is to define and analyze a two-phase artificial AMP algorithm that first produces the spectral estimator, and then closely approximates the iterates of the true AMP. We also provide numerical results that demonstrate the validity of the proposed approach.' acknowledgement: "The authors would like to thank Andrea Montanari for helpful discussions.\r\nM Mondelli was partially supported by the 2019 Lopez-Loreta Prize. R Venkataramanan was partially supported by the Alan Turing Institute under the EPSRC Grant\r\nEP/N510129/1." article_number: '114003' article_processing_charge: Yes (via OA deal) article_type: original author: - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: Ramji full_name: Venkataramanan, Ramji last_name: Venkataramanan citation: ama: 'Mondelli M, Venkataramanan R. Approximate message passing with spectral initialization for generalized linear models. Journal of Statistical Mechanics: Theory and Experiment. 2022;2022(11). doi:10.1088/1742-5468/ac9828' apa: 'Mondelli, M., & Venkataramanan, R. (2022). Approximate message passing with spectral initialization for generalized linear models. Journal of Statistical Mechanics: Theory and Experiment. IOP Publishing. https://doi.org/10.1088/1742-5468/ac9828' chicago: 'Mondelli, Marco, and Ramji Venkataramanan. 
“Approximate Message Passing with Spectral Initialization for Generalized Linear Models.” Journal of Statistical Mechanics: Theory and Experiment. IOP Publishing, 2022. https://doi.org/10.1088/1742-5468/ac9828.' ieee: 'M. Mondelli and R. Venkataramanan, “Approximate message passing with spectral initialization for generalized linear models,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2022, no. 11. IOP Publishing, 2022.' ista: 'Mondelli M, Venkataramanan R. 2022. Approximate message passing with spectral initialization for generalized linear models. Journal of Statistical Mechanics: Theory and Experiment. 2022(11), 114003.' mla: 'Mondelli, Marco, and Ramji Venkataramanan. “Approximate Message Passing with Spectral Initialization for Generalized Linear Models.” Journal of Statistical Mechanics: Theory and Experiment, vol. 2022, no. 11, 114003, IOP Publishing, 2022, doi:10.1088/1742-5468/ac9828.' short: 'M. Mondelli, R. Venkataramanan, Journal of Statistical Mechanics: Theory and Experiment 2022 (2022).' date_created: 2023-02-02T08:31:57Z date_published: 2022-11-24T00:00:00Z date_updated: 2024-03-07T10:36:52Z day: '24' ddc: - '510' - '530' department: - _id: MaMo doi: 10.1088/1742-5468/ac9828 external_id: isi: - '000889589900001' file: - access_level: open_access checksum: 01411ffa76d3e380a0446baeb89b1ef7 content_type: application/pdf creator: dernst date_created: 2023-02-02T08:35:52Z date_updated: 2023-02-02T08:35:52Z file_id: '12481' file_name: 2022_JourStatisticalMechanics_Mondelli.pdf file_size: 1729997 relation: main_file success: 1 file_date_updated: 2023-02-02T08:35:52Z has_accepted_license: '1' intvolume: ' 2022' isi: 1 issue: '11' keyword: - Statistics - Probability and Uncertainty - Statistics and Probability - Statistical and Nonlinear Physics language: - iso: eng month: '11' oa: 1 oa_version: Published Version project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: 'Journal of Statistical Mechanics: Theory and Experiment' publication_identifier: issn: - 1742-5468 publication_status: published publisher: IOP Publishing quality_controlled: '1' related_material: record: - id: '10598' relation: earlier_version status: public scopus_import: '1' status: public title: Approximate message passing with spectral initialization for generalized linear models tmp: image: /images/cc_by.png legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0) short: CC BY (4.0) type: journal_article user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8 volume: 2022 year: '2022' ... --- _id: '10595' abstract: - lang: eng text: 'A recent line of work has analyzed the theoretical properties of deep neural networks via the Neural Tangent Kernel (NTK). In particular, the smallest eigenvalue of the NTK has been related to the memorization capacity, the global convergence of gradient descent algorithms and the generalization of deep nets. However, existing results either provide bounds in the two-layer setting or assume that the spectrum of the NTK matrices is bounded away from 0 for multi-layer networks. In this paper, we provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU nets, both in the limiting case of infinite widths and for finite widths. 
In the finite-width setting, the network architectures we consider are fairly general: we require the existence of a wide layer with roughly order of $N$ neurons, $N$ being the number of data samples; and the scaling of the remaining layer widths is arbitrary (up to logarithmic factors). To obtain our results, we analyze various quantities of independent interest: we give lower bounds on the smallest singular value of hidden feature matrices, and upper bounds on the Lipschitz constant of input-output feature maps.' acknowledgement: "The authors would like to thank the anonymous reviewers for their helpful comments. MM was partially supported\r\nby the 2019 Lopez-Loreta Prize. QN and GM acknowledge support from the European Research Council (ERC) under\r\nthe European Union’s Horizon 2020 research and innovation programme (grant agreement no 757983)." alternative_title: - Proceedings of Machine Learning Research article_processing_charge: No author: - first_name: Quynh full_name: Nguyen, Quynh last_name: Nguyen - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: Guido F full_name: Montufar, Guido F last_name: Montufar citation: ama: 'Nguyen Q, Mondelli M, Montufar GF. Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks. In: Meila M, Zhang T, eds. Proceedings of the 38th International Conference on Machine Learning. Vol 139. ML Research Press; 2021:8119-8129.' apa: 'Nguyen, Q., Mondelli, M., & Montufar, G. F. (2021). Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks. In M. Meila & T. Zhang (Eds.), Proceedings of the 38th International Conference on Machine Learning (Vol. 139, pp. 8119–8129). Virtual: ML Research Press.' chicago: Nguyen, Quynh, Marco Mondelli, and Guido F Montufar. “Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks.” In Proceedings of the 38th International Conference on Machine Learning, edited by Marina Meila and Tong Zhang, 139:8119–29. ML Research Press, 2021. ieee: Q. Nguyen, M. Mondelli, and G. F. Montufar, “Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks,” in Proceedings of the 38th International Conference on Machine Learning, Virtual, 2021, vol. 139, pp. 8119–8129. ista: 'Nguyen Q, Mondelli M, Montufar GF. 2021. Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks. Proceedings of the 38th International Conference on Machine Learning. ICML: International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 139, 8119–8129.' mla: Nguyen, Quynh, et al. “Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks.” Proceedings of the 38th International Conference on Machine Learning, edited by Marina Meila and Tong Zhang, vol. 139, ML Research Press, 2021, pp. 8119–29. short: Q. Nguyen, M. Mondelli, G.F. Montufar, in:, M. Meila, T. Zhang (Eds.), Proceedings of the 38th International Conference on Machine Learning, ML Research Press, 2021, pp. 8119–8129. 
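For the infinite-width limit discussed in the record above, the NTK of a ReLU network has a well-known closed-form layer recursion via the degree-one arc-cosine kernel. The sketch below implements that standard recursion from the NTK literature and reports the smallest eigenvalue of the limiting kernel on random inputs; it reproduces textbook formulas, not the paper's finite-width bounds, and the input sizes are illustrative.

import numpy as np

def limiting_ntk(X, depth):
    # Standard infinite-width NTK recursion for ReLU layers with He scaling:
    #   Sigma^(h)     = 2 * E[relu(u) relu(v)]
    #   Sigma_dot^(h) = 2 * E[relu'(u) relu'(v)]
    #   Theta^(h)     = Sigma^(h) + Theta^(h-1) * Sigma_dot^(h)
    S = X @ X.T                  # Sigma^(1)
    Theta = S.copy()             # Theta^(1)
    for _ in range(depth - 1):
        d = np.sqrt(np.outer(np.diag(S), np.diag(S)))
        c = np.clip(S / d, -1.0, 1.0)
        th = np.arccos(c)
        S_dot = (np.pi - th) / np.pi
        S = d * (np.sin(th) + (np.pi - th) * c) / np.pi
        Theta = S + Theta * S_dot
    return Theta

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 10)) / np.sqrt(10)  # rows: roughly unit-scale inputs
K = limiting_ntk(X, depth=4)
print("lambda_min(limiting NTK) ~", np.linalg.eigvalsh(K)[0])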
conference: end_date: 2021-07-24 location: Virtual name: 'ICML: International Conference on Machine Learning' start_date: 2021-07-18 date_created: 2022-01-03T10:57:49Z date_published: 2021-01-01T00:00:00Z date_updated: 2022-01-04T09:59:21Z department: - _id: MaMo editor: - first_name: Marina full_name: Meila, Marina last_name: Meila - first_name: Tong full_name: Zhang, Tong last_name: Zhang external_id: arxiv: - '2012.11654' intvolume: ' 139' language: - iso: eng main_file_link: - open_access: '1' url: http://proceedings.mlr.press/v139/nguyen21g.html oa: 1 oa_version: Published Version page: 8119-8129 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: Proceedings of the 38th International Conference on Machine Learning publication_status: published publisher: ML Research Press quality_controlled: '1' status: public title: Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks type: conference user_id: 8b945eb4-e2f2-11eb-945a-df72226e66a9 volume: 139 year: '2021' ... --- _id: '10599' abstract: - lang: eng text: A two-part successive syndrome-check decoding of polar codes is proposed with the first part successively refining the received codeword and the second part checking its syndrome. A new formulation of the successive-cancellation (SC) decoding algorithm is presented that allows for successively refining the received codeword by comparing the log-likelihood ratio value of a frozen bit with its predefined value. The syndrome of the refined received codeword is then checked for possible errors. In case there are no errors, the decoding process is terminated. Otherwise, the decoder continues to refine the received codeword. The proposed method is extended to the case of SC list (SCL) decoding by terminating the decoding process when the syndrome of the best candidate in the list indicates no errors. Simulation results show that the proposed method reduces the time-complexity of SC and SCL decoders and their fast variants, especially at high signal-to-noise ratios. acknowledgement: This work is supported in part by ONR grant N00014-18-1-2191. S. A. Hashemi was supported by a Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council of Canada (NSERC) and by Huawei. M. Mondelli was partially supported by the 2019 Lopez-Loreta Prize. article_processing_charge: No author: - first_name: Seyyed Ali full_name: Hashemi, Seyyed Ali last_name: Hashemi - first_name: Marco full_name: Mondelli, Marco id: 27EB676C-8706-11E9-9510-7717E6697425 last_name: Mondelli orcid: 0000-0002-3242-7020 - first_name: John full_name: Cioffi, John last_name: Cioffi - first_name: Andrea full_name: Goldsmith, Andrea last_name: Goldsmith citation: ama: 'Hashemi SA, Mondelli M, Cioffi J, Goldsmith A. Successive syndrome-check decoding of polar codes. In: Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers. Vol 2021-October. Institute of Electrical and Electronics Engineers; 2021:943-947. doi:10.1109/IEEECONF53345.2021.9723394' apa: 'Hashemi, S. A., Mondelli, M., Cioffi, J., & Goldsmith, A. (2021). Successive syndrome-check decoding of polar codes. In Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers (Vol. 2021–October, pp. 943–947). Virtual, Pacific Grove, CA, United States: Institute of Electrical and Electronics Engineers. 
https://doi.org/10.1109/IEEECONF53345.2021.9723394' chicago: Hashemi, Seyyed Ali, Marco Mondelli, John Cioffi, and Andrea Goldsmith. “Successive Syndrome-Check Decoding of Polar Codes.” In Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers, 2021–October:943–47. Institute of Electrical and Electronics Engineers, 2021. https://doi.org/10.1109/IEEECONF53345.2021.9723394. ieee: S. A. Hashemi, M. Mondelli, J. Cioffi, and A. Goldsmith, “Successive syndrome-check decoding of polar codes,” in Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers, Virtual, Pacific Grove, CA, United States, 2021, vol. 2021–October, pp. 943–947. ista: 'Hashemi SA, Mondelli M, Cioffi J, Goldsmith A. 2021. Successive syndrome-check decoding of polar codes. Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers. ACSSC: Asilomar Conference on Signals, Systems, and Computers vol. 2021–October, 943–947.' mla: Hashemi, Seyyed Ali, et al. “Successive Syndrome-Check Decoding of Polar Codes.” Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers, vol. 2021–October, Institute of Electrical and Electronics Engineers, 2021, pp. 943–47, doi:10.1109/IEEECONF53345.2021.9723394. short: S.A. Hashemi, M. Mondelli, J. Cioffi, A. Goldsmith, in:, Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers, Institute of Electrical and Electronics Engineers, 2021, pp. 943–947. conference: end_date: 2021-11-03 location: Virtual, Pacific Grove, CA, United States name: 'ACSSC: Asilomar Conference on Signals, Systems, and Computers' start_date: 2021-10-31 date_created: 2022-01-03T11:39:51Z date_published: 2021-11-01T00:00:00Z date_updated: 2023-02-13T10:44:16Z day: '01' department: - _id: MaMo doi: 10.1109/IEEECONF53345.2021.9723394 external_id: arxiv: - '2112.00057' language: - iso: eng main_file_link: - open_access: '1' url: ' https://doi.org/10.48550/arXiv.2112.00057' month: '11' oa: 1 oa_version: Preprint page: 943-947 project: - _id: 059876FA-7A3F-11EA-A408-12923DDC885E name: Prix Lopez-Loretta 2019 - Marco Mondelli publication: Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers publication_identifier: isbn: - '9781665458283' issn: - 1058-6393 publication_status: published publisher: Institute of Electrical and Electronics Engineers quality_controlled: '1' scopus_import: '1' status: public title: Successive syndrome-check decoding of polar codes type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 2021-October year: '2021' ...
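The termination rule described in the last record above, namely stopping once the refined codeword has an all-zero syndrome, is easy to illustrate in isolation. The sketch below checks syndromes for a [7,4] Hamming code as a stand-in; the parity-check matrix and codeword are toy choices, and the actual decoder of the paper operates on polar codes using log-likelihood ratios of the frozen bits.

import numpy as np

def syndrome_is_zero(H, c_hat):
    # terminate decoding when H @ c_hat = 0 (mod 2)
    return not np.any((H @ c_hat) % 2)

# parity-check matrix of the [7,4] Hamming code (illustrative stand-in)
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
c = np.array([1, 0, 1, 1, 0, 1, 0])   # a valid codeword
e = c.copy(); e[2] ^= 1               # same word with one bit flipped

print(syndrome_is_zero(H, c))  # True  -> stop decoding
print(syndrome_is_zero(H, e))  # False -> keep refining

The same test applied to the best candidate in a decoding list gives the early-termination criterion for SC list decoding that the abstract describes.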