TY - CONF
AB - Despite their recent success, deep neural networks continue to perform poorly when they encounter distribution shifts at test time. Many recently proposed approaches try to counter this by aligning the model to the new distribution prior to inference. With no labels available, this requires unsupervised objectives to adapt the model on the observed test data. In this paper, we propose Test-Time Self-Training (TeST): a technique that takes as input a model trained on some source data and a novel data distribution at test time, and learns invariant and robust representations using a student-teacher framework. We find that models adapted using TeST significantly improve over baseline test-time adaptation algorithms. TeST achieves competitive performance to modern domain adaptation algorithms [4, 43], while having access to 5-10x less data at the time of adaptation. We thoroughly evaluate a variety of baselines on two tasks, object detection and image segmentation, and find that models adapted with TeST set the new state-of-the-art for test-time domain adaptation algorithms.
AU - Sinha, Samarth
AU - Gehler, Peter
AU - Locatello, Francesco
AU - Schiele, Bernt
ID - 14105
SN - 9781665493475
T2 - 2023 IEEE/CVF Winter Conference on Applications of Computer Vision
TI - TeST: Test-time Self-Training under distribution shift
ER -
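The student-teacher adaptation loop described in the TeST abstract above can be illustrated with a short sketch. This is a minimal illustration, not the authors' released code: the classification-style pseudo-labeling loss, the EMA decay, and all names (adapt_test_time, steps, lr, ema) are assumptions made for brevity, while the paper itself evaluates object detection and image segmentation.

# Minimal sketch of a student-teacher test-time adaptation loop in the
# spirit of TeST; hypothetical helper names, not the authors' released code.
import copy
import torch
import torch.nn.functional as F

def adapt_test_time(source_model, test_loader, steps=100, lr=1e-4, ema=0.999):
    student = copy.deepcopy(source_model)
    teacher = copy.deepcopy(source_model)
    for p in teacher.parameters():
        p.requires_grad_(False)
    opt = torch.optim.SGD(student.parameters(), lr=lr)
    batches = iter(test_loader)
    for _ in range(steps):
        try:
            x = next(batches)
        except StopIteration:          # cycle over the unlabeled test data
            batches = iter(test_loader)
            x = next(batches)
        with torch.no_grad():
            pseudo = F.softmax(teacher(x), dim=-1)   # teacher pseudo-labels
        loss = F.kl_div(F.log_softmax(student(x), dim=-1), pseudo,
                        reduction="batchmean")       # student matches teacher
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():                        # slow EMA teacher update
            for pt, ps in zip(teacher.parameters(), student.parameters()):
                pt.mul_(ema).add_(ps, alpha=1.0 - ema)
    return teacher

The slowly updated teacher is what stabilizes self-training here: pseudo-labels drift only gradually even as the student adapts to the shifted distribution.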
TY - JOUR
AB - Context. Space asteroseismology is revolutionizing our knowledge of the internal structure and dynamics of stars. A breakthrough is ongoing with the recent discoveries of signatures of strong magnetic fields in the core of red giant stars. The key signature for such a detection is the asymmetry these fields induce in the frequency splittings of observed dipolar mixed gravito-acoustic modes. Aims. We investigate the ability of the observed asymmetries of the frequency splittings of dipolar mixed modes to constrain the geometrical properties of deep magnetic fields. Methods. We used the powerful analytical Racah-Wigner algebra from quantum mechanics to characterize the geometrical couplings of dipolar mixed oscillation modes with various realistically plausible topologies of fossil magnetic fields. We also computed the induced perturbation of their frequencies. Results. First, in the case of an oblique magnetic dipole, we provide the exact analytical expression of the asymmetry as a function of the angle between the rotation and magnetic axes. Its value provides a direct measure of this angle. Second, considering a combination of axisymmetric dipolar and quadrupolar fields, we show that the asymmetry cannot disentangle the relative strength and sign of each component. Finally, in the case of a given multipole, we show that a negative asymmetry is a signature of non-axisymmetric topologies. Conclusions. Asymmetries of dipolar mixed modes provide key information on the geometrical topology of deep fossil magnetic fields, but this is insufficient on its own. Asteroseismic constraints should therefore be combined with spectropolarimetric observations and numerical simulations, which aim to predict the most probable stable large-scale geometries.
AU - Mathis, S.
AU - Bugnet, Lisa Annabelle
ID - 14256
JF - Astronomy and Astrophysics
SN - 0004-6361
TI - Asymmetries of frequency splittings of dipolar mixed modes: A window on the topology of deep magnetic fields
VL - 676
ER -
TY - JOUR
AB - In this work, a generalized, adapted Numerov implementation capable of determining band structures of periodic quantum systems is outlined. Based on the input potential, the presented approach numerically solves the Schrödinger equation in position space at each momentum-space point. Thus, in addition to the band structure, the method inherently provides information about the state functions and probability densities in position space at each momentum-space point considered. The generalized, adapted Numerov framework provided reliable estimates for a variety of increasingly complex test suites in one, two, and three dimensions. The accuracy of the proposed methodology was benchmarked against results obtained for the analytically solvable Kronig-Penney model. Furthermore, the presented numerical solver was applied to a model potential representing a 2D optical lattice, a challenging application relevant, for example, to the field of quantum computing.
AU - Gamper, Jakob
AU - Kluibenschedl, Florian
AU - Weiss, Alexander K.H.
AU - Hofer, Thomas S.
ID - 14261
IS - 33
JF - Journal of Physical Chemistry Letters
TI - Accessing position space wave functions in band structure calculations of periodic systems - a generalized, adapted Numerov implementation for one-, two-, and three-dimensional quantum problems
VL - 14
ER -
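The matrix-Numerov band-structure scheme outlined in the Gamper et al. entry directly above can be sketched in one dimension as a generalized eigenproblem with Bloch phases on the wrap-around matrix entries. This is a hedged illustration of the general technique, not the authors' implementation; the units (hbar = m = 1), grid size, cosine test potential, and all names are assumptions.

# Hedged 1D sketch of a matrix-Numerov band-structure solver: solve the
# generalized eigenproblem (-1/2) A psi + B V psi = E B psi at each k,
# where A is the second-difference matrix and B the Numerov (1,10,1)/12
# weighting, both with Bloch-phase wrap-around. Units hbar = m = 1 assumed.
import numpy as np
from scipy.linalg import eig

def bands_1d(V, a, k_values, n=200, n_bands=4):
    """V: callable potential with period a; returns shape (len(k_values), n_bands)."""
    d = a / n                                  # grid spacing over one period
    x = np.arange(n) * d
    Vdiag = np.diag(V(x))
    off = np.eye(n, k=1) + np.eye(n, k=-1)     # nearest-neighbor couplings
    bands = []
    for k in k_values:
        bloch = np.exp(1j * k * a)             # Bloch phase across one period
        A = (-2.0 * np.eye(n) + off).astype(complex)
        B = (10.0 * np.eye(n) + off).astype(complex)
        A[0, -1] = 1.0 / bloch                 # psi_{-1} = e^{-ika} psi_{n-1}
        A[-1, 0] = bloch                       # psi_{n}  = e^{+ika} psi_{0}
        B[0, -1] = 1.0 / bloch
        B[-1, 0] = bloch
        A /= d ** 2
        B /= 12.0
        H = -0.5 * A + B @ Vdiag               # hbar = m = 1
        E = np.sort(eig(H, B)[0].real)         # eigenvectors (not kept here)
        bands.append(E[:n_bands])              # are the position-space states
    return np.array(bands)

# Example: lowest bands of a cosine lattice over the first Brillouin zone.
ks = np.linspace(-np.pi, np.pi, 41)
E = bands_1d(lambda x: 5.0 * np.cos(2.0 * np.pi * x), a=1.0, k_values=ks)

Because the eigenvectors of the same pencil are the Bloch states on the position grid, the method directly exposes the wave functions and probability densities that the abstract emphasizes.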
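For orientation, the splitting asymmetry analyzed in the earlier Mathis & Bugnet entry is built from the three components of a dipolar (l = 1) mixed-mode multiplet. The convention below is one common choice in the red-giant magnetism literature and is an assumption here, not a quotation from the paper:

\[ \delta_{\mathrm{asym}} = \nu_{m=-1} + \nu_{m=+1} - 2\,\nu_{m=0} \]

Pure rotational splitting yields a symmetric multiplet with \( \delta_{\mathrm{asym}} = 0 \), so a nonzero value traces the magnetic perturbation; per the abstract, a negative asymmetry for a given multipole signals a non-axisymmetric field topology.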
TY - CONF
AB - This paper focuses on over-parameterized deep neural networks (DNNs) with ReLU activation functions and proves that, when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification while obtaining (nearly) zero training error under the lazy training regime. For this purpose, we unify three interrelated concepts: over-parameterization, benign overfitting, and the Lipschitz constant of DNNs. Our results indicate that interpolating with smoother functions leads to better generalization. Furthermore, we investigate the special case where interpolating smooth ground-truth functions is performed by DNNs under the Neural Tangent Kernel (NTK) regime for generalization. Our result demonstrates that the generalization error converges to a constant order that only depends on label noise and initialization noise, which theoretically verifies benign overfitting. Our analysis provides a tight lower bound on the normalized margin under non-smooth activation functions, as well as on the minimum eigenvalue of the NTK under high-dimensional settings, which are of independent interest in learning theory.
AU - Zhu, Zhenyu
AU - Liu, Fanghui
AU - Chrysos, Grigorios G
AU - Locatello, Francesco
AU - Cevher, Volkan
ID - 14208
T2 - Proceedings of the 40th International Conference on Machine Learning
TI - Benign overfitting in deep neural networks under lazy training
VL - 202
ER -
TY - GEN
AB - Diffusion models excel at generating photorealistic images from text queries. Naturally, many approaches have been proposed to use these generative abilities to augment training datasets for downstream tasks, such as classification. However, diffusion models are themselves trained on large, noisily supervised, but nonetheless annotated, datasets. It is an open question whether the generalization capabilities of diffusion models, beyond reusing the additional pre-training data for augmentation, lead to improved downstream performance. We perform a systematic evaluation of existing methods to generate images from diffusion models and study new extensions to assess their benefit for data augmentation. While we find that personalizing diffusion models towards the target data outperforms simpler prompting strategies, we also show that using the training data of the diffusion model alone, via a simple nearest-neighbor retrieval procedure, leads to even stronger downstream performance. Overall, our study probes the limitations of diffusion models for data augmentation but also highlights their potential in generating new training data to improve performance on simple downstream vision tasks.
AU - Burg, Max F.
AU - Wenzel, Florian
AU - Zietlow, Dominik
AU - Horn, Max
AU - Makansi, Osama
AU - Locatello, Francesco
AU - Russell, Chris
ID - 14209
T2 - arXiv
TI - A data augmentation perspective on diffusion models and retrieval
ER -
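The nearest-neighbor retrieval baseline that the Burg et al. abstract above finds so effective can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the use of L2-normalized, CLIP-style image embeddings and the names (retrieve_augmentation, per_example) are hypothetical.

# Hedged sketch of retrieval-based augmentation: select each target
# example's nearest neighbors from the diffusion model's (or any large)
# training pool, then add them to the downstream training set.
import numpy as np

def retrieve_augmentation(target_feats, pool_feats, per_example=10):
    """target_feats: (n, d), pool_feats: (N, d); rows assumed L2-normalized."""
    sims = target_feats @ pool_feats.T                # cosine similarities
    nearest = np.argsort(-sims, axis=1)[:, :per_example]
    return np.unique(nearest.ravel())                 # de-duplicated pool indices

How to label the retrieved images (inherit each query's label or re-classify them) is a design choice beyond the abstract; either way, no generation step is needed, which is exactly the comparison the paper draws against diffusion-based augmentation.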