[{"page":"147","citation":{"apa":"Peste, E.-A. (2023). Efficiency and generalization of sparse neural networks. Institute of Science and Technology Austria. https://doi.org/10.15479/at:ista:13074","ieee":"E.-A. Peste, “Efficiency and generalization of sparse neural networks,” Institute of Science and Technology Austria, 2023.","ista":"Peste E-A. 2023. Efficiency and generalization of sparse neural networks. Institute of Science and Technology Austria.","ama":"Peste E-A. Efficiency and generalization of sparse neural networks. 2023. doi:10.15479/at:ista:13074","chicago":"Peste, Elena-Alexandra. “Efficiency and Generalization of Sparse Neural Networks.” Institute of Science and Technology Austria, 2023. https://doi.org/10.15479/at:ista:13074.","short":"E.-A. Peste, Efficiency and Generalization of Sparse Neural Networks, Institute of Science and Technology Austria, 2023.","mla":"Peste, Elena-Alexandra. Efficiency and Generalization of Sparse Neural Networks. Institute of Science and Technology Austria, 2023, doi:10.15479/at:ista:13074."},"date_published":"2023-05-23T00:00:00Z","day":"23","has_accepted_license":"1","article_processing_charge":"No","status":"public","title":"Efficiency and generalization of sparse neural networks","ddc":["000"],"_id":"13074","user_id":"8b945eb4-e2f2-11eb-945a-df72226e66a9","file":[{"date_created":"2023-05-24T16:11:16Z","date_updated":"2023-05-24T16:11:16Z","checksum":"6b3354968403cb9d48cc5a83611fb571","success":1,"relation":"main_file","file_id":"13087","content_type":"application/pdf","file_size":2152072,"creator":"epeste","file_name":"PhD_Thesis_Alexandra_Peste_final.pdf","access_level":"open_access"},{"relation":"source_file","file_id":"13088","date_created":"2023-05-24T16:12:59Z","date_updated":"2023-05-24T16:12:59Z","checksum":"8d0df94bbcf4db72c991f22503b3fd60","file_name":"PhD_Thesis_APeste.zip","access_level":"closed","content_type":"application/zip","file_size":1658293,"creator":"epeste"}],"oa_version":"Published 
Version","alternative_title":["ISTA Thesis"],"type":"dissertation","abstract":[{"lang":"eng","text":"Deep learning has become an integral part of a large number of important applications, and many of the recent breakthroughs have been enabled by the ability to train very large models, capable of capturing complex patterns and relationships from the data. At the same time, the massive sizes of modern deep learning models have made their deployment to smaller devices more challenging; this is particularly important, as in many applications the users rely on accurate deep learning predictions, but they only have access to devices with limited memory and compute power. One solution to this problem is to prune neural networks, by setting as many of their parameters as possible to zero, to obtain accurate sparse models with lower memory footprint. Despite the great research progress in obtaining sparse models that preserve accuracy, while satisfying memory and computational constraints, there are still many challenges associated with efficiently training sparse models, as well as understanding their generalization properties.\r\n\r\nThe focus of this thesis is to investigate how the training process of sparse models can be made more efficient, and to understand the differences between sparse and dense models in terms of how well they can generalize to changes in the data distribution. We first study a method for co-training sparse and dense models, at a lower cost compared to regular training. With our method we can obtain very accurate sparse networks, and dense models that can recover the baseline accuracy. Furthermore, we are able to more easily analyze the differences, at prediction level, between the sparse-dense model pairs. Next, we investigate the generalization properties of sparse neural networks in more detail, by studying how well different sparse models trained on a larger task can adapt to smaller, more specialized tasks, in a transfer learning scenario. 
Our analysis across multiple pruning methods and sparsity levels reveals that sparse models provide features that can transfer as well as, or better than, the dense baseline. However, the choice of the pruning method plays an important role, and can influence the results when the features are fixed (linear finetuning), or when they are allowed to adapt to the new task (full finetuning). Using sparse models with fixed masks for finetuning on new tasks has an important practical advantage, as it enables training neural networks on smaller devices. However, one drawback of current pruning methods is that the entire training cycle has to be repeated to obtain the initial sparse model for every sparsity target; as a consequence, the entire training process is costly, and multiple models need to be stored. In the last part of the thesis we propose a method that can train accurate dense models that are compressible, in a single step, to multiple sparsity levels, without additional finetuning. 
Our method results in sparse models that can be competitive with existing pruning methods, and which can also successfully generalize to new tasks."}],"project":[{"call_identifier":"H2020","name":"International IST Doctoral Program","_id":"2564DBCA-B435-11E9-9278-68D0E5697425","grant_number":"665385"},{"name":"Elastic Coordination for Scalable Machine Learning","call_identifier":"H2020","_id":"268A44D6-B435-11E9-9278-68D0E5697425","grant_number":"805223"}],"oa":1,"degree_awarded":"PhD","supervisor":[{"id":"40C20FD2-F248-11E8-B48F-1D18A9856A87","orcid":"0000-0001-8622-7887","first_name":"Christoph","last_name":"Lampert","full_name":"Lampert, Christoph"},{"last_name":"Alistarh","first_name":"Dan-Adrian","orcid":"0000-0003-3650-940X","id":"4A899BFC-F248-11E8-B48F-1D18A9856A87","full_name":"Alistarh, Dan-Adrian"}],"acknowledged_ssus":[{"_id":"ScienComp"}],"language":[{"iso":"eng"}],"doi":"10.15479/at:ista:13074","month":"05","publication_identifier":{"issn":["2663-337X"]},"publication_status":"published","publisher":"Institute of Science and Technology Austria","department":[{"_id":"GradSch"},{"_id":"DaAl"},{"_id":"ChLa"}],"year":"2023","date_updated":"2023-08-04T10:33:27Z","date_created":"2023-05-23T17:07:53Z","author":[{"id":"32D78294-F248-11E8-B48F-1D18A9856A87","first_name":"Elena-Alexandra","last_name":"Peste","full_name":"Peste, Elena-Alexandra"}],"related_material":{"record":[{"id":"11458","status":"public","relation":"part_of_dissertation"},{"id":"13053","status":"public","relation":"part_of_dissertation"},{"relation":"part_of_dissertation","status":"public","id":"12299"}]},"file_date_updated":"2023-05-24T16:12:59Z","ec_funded":1},{"oa":1,"project":[{"_id":"2564DBCA-B435-11E9-9278-68D0E5697425","grant_number":"665385","name":"International IST Doctoral 
Program","call_identifier":"H2020"}],"doi":"10.15479/at:ista:10799","language":[{"iso":"eng"}],"supervisor":[{"last_name":"Lampert","first_name":"Christoph","orcid":"0000-0001-8622-7887","id":"40C20FD2-F248-11E8-B48F-1D18A9856A87","full_name":"Lampert, Christoph"}],"degree_awarded":"PhD","publication_identifier":{"issn":["2663-337X"],"isbn":["978-3-99078-015-2"]},"month":"03","year":"2022","publisher":"Institute of Science and Technology Austria","department":[{"_id":"GradSch"},{"_id":"ChLa"}],"publication_status":"published","related_material":{"record":[{"relation":"part_of_dissertation","status":"public","id":"8724"},{"status":"public","relation":"part_of_dissertation","id":"10803"},{"relation":"part_of_dissertation","status":"public","id":"10802"},{"relation":"part_of_dissertation","status":"public","id":"6590"}]},"author":[{"full_name":"Konstantinov, Nikola H","id":"4B9D76E4-F248-11E8-B48F-1D18A9856A87","last_name":"Konstantinov","first_name":"Nikola H"}],"date_updated":"2023-10-17T12:31:54Z","date_created":"2022-02-28T13:03:49Z","ec_funded":1,"file_date_updated":"2022-03-10T12:11:48Z","citation":{"chicago":"Konstantinov, Nikola H. “Robustness and Fairness in Machine Learning.” Institute of Science and Technology Austria, 2022. https://doi.org/10.15479/at:ista:10799.","mla":"Konstantinov, Nikola H. Robustness and Fairness in Machine Learning. Institute of Science and Technology Austria, 2022, doi:10.15479/at:ista:10799.","short":"N.H. Konstantinov, Robustness and Fairness in Machine Learning, Institute of Science and Technology Austria, 2022.","ista":"Konstantinov NH. 2022. Robustness and fairness in machine learning. Institute of Science and Technology Austria.","ieee":"N. H. Konstantinov, “Robustness and fairness in machine learning,” Institute of Science and Technology Austria, 2022.","apa":"Konstantinov, N. H. (2022). Robustness and fairness in machine learning. Institute of Science and Technology Austria. 
https://doi.org/10.15479/at:ista:10799","ama":"Konstantinov NH. Robustness and fairness in machine learning. 2022. doi:10.15479/at:ista:10799"},"page":"176","date_published":"2022-03-08T00:00:00Z","keyword":["robustness","fairness","machine learning","PAC learning","adversarial learning"],"has_accepted_license":"1","article_processing_charge":"No","day":"08","user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","_id":"10799","ddc":["000"],"status":"public","title":"Robustness and fairness in machine learning","oa_version":"Published Version","file":[{"creator":"nkonstan","content_type":"application/pdf","file_size":4204905,"access_level":"open_access","file_name":"thesis.pdf","success":1,"checksum":"626bc523ae8822d20e635d0e2d95182e","date_created":"2022-03-06T11:42:54Z","date_updated":"2022-03-06T11:42:54Z","file_id":"10823","relation":"main_file"},{"creator":"nkonstan","file_size":22841103,"content_type":"application/x-zip-compressed","file_name":"thesis.zip","access_level":"closed","date_updated":"2022-03-10T12:11:48Z","date_created":"2022-03-06T11:42:57Z","checksum":"e2ca2b88350ac8ea1515b948885cbcb1","file_id":"10824","relation":"source_file"}],"type":"dissertation","alternative_title":["ISTA Thesis"],"abstract":[{"lang":"eng","text":"Because of the increasing popularity of machine learning methods, it is becoming important to understand the impact of learned components on automated decision-making systems and to guarantee that their consequences are beneficial to society. In other words, it is necessary to ensure that machine learning is sufficiently trustworthy to be used in real-world applications. This thesis studies two properties of machine learning models that are highly desirable for the\r\nsake of reliability: robustness and fairness. In the first part of the thesis we study the robustness of learning algorithms to training data corruption. 
Previous work has shown that machine learning models are vulnerable to a range\r\nof training set issues, varying from label noise through systematic biases to worst-case data manipulations. This is an especially relevant problem from a present perspective, since modern machine learning methods are particularly data hungry and therefore practitioners often have to rely on data collected from various external sources, e.g. from the Internet, from app users or via crowdsourcing. Naturally, such sources vary greatly in the quality and reliability of the\r\ndata they provide. With these considerations in mind, we study the problem of designing machine learning algorithms that are robust to corruptions in data coming from multiple sources. We show that, in contrast to the case of a single dataset with outliers, successful learning within this model is possible both theoretically and practically, even under worst-case data corruptions. The second part of this thesis deals with fairness-aware machine learning. There are multiple areas where machine learning models have shown promising results, but where careful considerations are required, in order to avoid discriminative decisions taken by such learned components. Ensuring fairness can be particularly challenging, because real-world training datasets are expected to contain various forms of historical bias that may affect the learning process. In this thesis we show that data corruption can indeed render the problem of achieving fairness impossible, by tightly characterizing the theoretical limits of fair learning under worst-case data manipulations. However, assuming access to clean data, we also show how fairness-aware learning can be made practical in contexts beyond binary classification, in particular in the challenging learning to rank setting."}]},{"citation":{"ama":"Phuong M. Underspecification in deep learning. 2021. doi:10.15479/AT:ISTA:9418","ieee":"M. 
Phuong, “Underspecification in deep learning,” Institute of Science and Technology Austria, 2021.","apa":"Phuong, M. (2021). Underspecification in deep learning. Institute of Science and Technology Austria. https://doi.org/10.15479/AT:ISTA:9418","ista":"Phuong M. 2021. Underspecification in deep learning. Institute of Science and Technology Austria.","short":"M. Phuong, Underspecification in Deep Learning, Institute of Science and Technology Austria, 2021.","mla":"Phuong, Mary. Underspecification in Deep Learning. Institute of Science and Technology Austria, 2021, doi:10.15479/AT:ISTA:9418.","chicago":"Phuong, Mary. “Underspecification in Deep Learning.” Institute of Science and Technology Austria, 2021. https://doi.org/10.15479/AT:ISTA:9418."},"page":"125","date_published":"2021-05-30T00:00:00Z","day":"30","has_accepted_license":"1","article_processing_charge":"No","user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","_id":"9418","title":"Underspecification in deep learning","status":"public","ddc":["000"],"oa_version":"Published Version","file":[{"file_id":"9419","relation":"main_file","date_updated":"2021-05-24T11:22:29Z","date_created":"2021-05-24T11:22:29Z","success":1,"checksum":"4f0abe64114cfed264f9d36e8d1197e3","file_name":"mph-thesis-v519-pdfimages.pdf","access_level":"open_access","creator":"bphuong","content_type":"application/pdf","file_size":2673905},{"checksum":"f5699e876bc770a9b0df8345a77720a2","date_updated":"2021-05-24T11:56:02Z","date_created":"2021-05-24T11:56:02Z","file_id":"9420","relation":"source_file","creator":"bphuong","file_size":92995100,"content_type":"application/zip","access_level":"closed","file_name":"thesis.zip"}],"type":"dissertation","alternative_title":["ISTA Thesis"],"abstract":[{"lang":"eng","text":"Deep learning is best known for its empirical success across a wide range of applications\r\nspanning computer vision, natural language processing and speech. 
Of equal significance,\r\nthough perhaps less known, are its ramifications for learning theory: deep networks have\r\nbeen observed to perform surprisingly well in the high-capacity regime, aka the overfitting\r\nor underspecified regime. Classically, this regime on the far right of the bias-variance curve\r\nis associated with poor generalisation; however, recent experiments with deep networks\r\nchallenge this view.\r\n\r\nThis thesis is devoted to investigating various aspects of underspecification in deep learning.\r\nFirst, we argue that deep learning models are underspecified on two levels: a) any given\r\ntraining dataset can be fit by many different functions, and b) any given function can be\r\nexpressed by many different parameter configurations. We refer to the second kind of\r\nunderspecification as parameterisation redundancy and we precisely characterise its extent.\r\nSecond, we characterise the implicit criteria (the inductive bias) that guide learning in the\r\nunderspecified regime. Specifically, we consider a nonlinear but tractable classification\r\nsetting, and show that given the choice, neural networks learn classifiers with a large margin.\r\nThird, we consider learning scenarios where the inductive bias is not by itself sufficient to\r\ndeal with underspecification. We then study different ways of ‘tightening the specification’: i)\r\nIn the setting of representation learning with variational autoencoders, we propose a hand-crafted regulariser based on mutual information. ii) In the setting of binary classification, we\r\nconsider soft-label (real-valued) supervision. We derive a generalisation bound for linear\r\nnetworks supervised in this way and verify that soft labels facilitate fast learning. 
Finally, we\r\nexplore an application of soft-label supervision to the training of multi-exit models."}],"oa":1,"doi":"10.15479/AT:ISTA:9418","acknowledged_ssus":[{"_id":"ScienComp"},{"_id":"CampIT"},{"_id":"E-Lib"}],"supervisor":[{"full_name":"Lampert, Christoph","last_name":"Lampert","first_name":"Christoph","orcid":"0000-0001-8622-7887","id":"40C20FD2-F248-11E8-B48F-1D18A9856A87"}],"degree_awarded":"PhD","language":[{"iso":"eng"}],"month":"05","publication_identifier":{"issn":["2663-337X"]},"year":"2021","publication_status":"published","department":[{"_id":"GradSch"},{"_id":"ChLa"}],"publisher":"Institute of Science and Technology Austria","author":[{"full_name":"Bui Thi Mai, Phuong","id":"3EC6EE64-F248-11E8-B48F-1D18A9856A87","first_name":"Phuong","last_name":"Bui Thi Mai"}],"related_material":{"record":[{"relation":"part_of_dissertation","status":"deleted","id":"7435"},{"status":"public","relation":"part_of_dissertation","id":"7481"},{"id":"9416","status":"public","relation":"part_of_dissertation"},{"relation":"part_of_dissertation","status":"public","id":"7479"}]},"date_updated":"2023-09-08T11:11:12Z","date_created":"2021-05-24T13:06:23Z","file_date_updated":"2021-05-24T11:56:02Z"},{"acknowledgement":"Last but not least, I would like to acknowledge the support of the IST IT and scientific computing team for helping provide a great work environment.","year":"2020","publisher":"Institute of Science and Technology Austria","department":[{"_id":"ChLa"}],"publication_status":"published","related_material":{"record":[{"relation":"part_of_dissertation","status":"public","id":"7936"},{"status":"public","relation":"part_of_dissertation","id":"7937"},{"id":"8193","status":"public","relation":"part_of_dissertation"},{"status":"public","relation":"part_of_dissertation","id":"8092"},{"relation":"part_of_dissertation","status":"public","id":"911"}]},"author":[{"full_name":"Royer, 
Amélie","last_name":"Royer","first_name":"Amélie","orcid":"0000-0002-8407-0705","id":"3811D890-F248-11E8-B48F-1D18A9856A87"}],"date_created":"2020-09-14T13:42:09Z","date_updated":"2023-10-16T10:04:02Z","file_date_updated":"2020-09-14T13:39:17Z","license":"https://creativecommons.org/licenses/by-nc-sa/4.0/","tmp":{"name":"Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)","legal_code_url":"https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode","image":"/images/cc_by_nc_sa.png","short":"CC BY-NC-SA (4.0)"},"oa":1,"doi":"10.15479/AT:ISTA:8390","language":[{"iso":"eng"}],"acknowledged_ssus":[{"_id":"CampIT"},{"_id":"ScienComp"}],"supervisor":[{"id":"40C20FD2-F248-11E8-B48F-1D18A9856A87","orcid":"0000-0001-8622-7887","first_name":"Christoph","last_name":"Lampert","full_name":"Lampert, Christoph"}],"degree_awarded":"PhD","publication_identifier":{"isbn":["978-3-99078-007-7"],"issn":["2663-337X"]},"month":"09","_id":"8390","user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","status":"public","title":"Leveraging structure in Computer Vision tasks for flexible Deep Learning models","ddc":["000"],"file":[{"file_name":"2020_Thesis_Royer.pdf","access_level":"open_access","creator":"dernst","file_size":30224591,"content_type":"application/pdf","file_id":"8391","relation":"main_file","date_updated":"2020-09-14T13:39:14Z","date_created":"2020-09-14T13:39:14Z","success":1,"checksum":"c914d2f88846032f3d8507734861b6ee"},{"creator":"dernst","file_size":74227627,"content_type":"application/x-zip-compressed","file_name":"thesis_sources.zip","access_level":"closed","date_updated":"2020-09-14T13:39:17Z","date_created":"2020-09-14T13:39:17Z","checksum":"ae98fb35d912cff84a89035ae5794d3c","file_id":"8392","relation":"main_file"}],"oa_version":"Published Version","type":"dissertation","alternative_title":["ISTA Thesis"],"abstract":[{"lang":"eng","text":"Deep neural networks have established a new standard for data-dependent feature extraction 
pipelines in the Computer Vision literature. Despite their remarkable performance in the standard supervised learning scenario, i.e. when models are trained with labeled data and tested on samples that follow a similar distribution, neural networks have been shown to struggle with more advanced generalization abilities, such as transferring knowledge across visually different domains, or generalizing to new unseen combinations of known concepts. In this thesis we argue that, in contrast to the usual black-box behavior of neural networks, leveraging more structured internal representations is a promising direction\r\nfor tackling such problems. In particular, we focus on two forms of structure. First, we tackle modularity: We show that (i) compositional architectures are a natural tool for modeling reasoning tasks, in that they efficiently capture their combinatorial nature, which is key for generalizing beyond the compositions seen during training. We investigate how to learn such models, both formally and experimentally, for the task of abstract visual reasoning. Then, we show that (ii) in some settings, modularity allows us to efficiently break down complex tasks into smaller, easier modules, thereby improving computational efficiency. We study this behavior in the context of generative models for colorization, as well as for small object detection. Second, we investigate the inherently layered structure of representations learned by neural networks, and analyze its role in the context of transfer learning and domain adaptation across visually\r\ndissimilar domains. "}],"citation":{"ieee":"A. Royer, “Leveraging structure in Computer Vision tasks for flexible Deep Learning models,” Institute of Science and Technology Austria, 2020.","apa":"Royer, A. (2020). Leveraging structure in Computer Vision tasks for flexible Deep Learning models. Institute of Science and Technology Austria. https://doi.org/10.15479/AT:ISTA:8390","ista":"Royer A. 2020. 
Leveraging structure in Computer Vision tasks for flexible Deep Learning models. Institute of Science and Technology Austria.","ama":"Royer A. Leveraging structure in Computer Vision tasks for flexible Deep Learning models. 2020. doi:10.15479/AT:ISTA:8390","chicago":"Royer, Amélie. “Leveraging Structure in Computer Vision Tasks for Flexible Deep Learning Models.” Institute of Science and Technology Austria, 2020. https://doi.org/10.15479/AT:ISTA:8390.","short":"A. Royer, Leveraging Structure in Computer Vision Tasks for Flexible Deep Learning Models, Institute of Science and Technology Austria, 2020.","mla":"Royer, Amélie. Leveraging Structure in Computer Vision Tasks for Flexible Deep Learning Models. Institute of Science and Technology Austria, 2020, doi:10.15479/AT:ISTA:8390."},"page":"197","date_published":"2020-09-14T00:00:00Z","article_processing_charge":"No","has_accepted_license":"1","day":"14"},{"publication_identifier":{"issn":["2663-337X"]},"month":"09","doi":"10.15479/AT:ISTA:TH1048","language":[{"iso":"eng"}],"degree_awarded":"PhD","supervisor":[{"full_name":"Lampert, Christoph","last_name":"Lampert","first_name":"Christoph","orcid":"0000-0001-8622-7887","id":"40C20FD2-F248-11E8-B48F-1D18A9856A87"}],"oa":1,"project":[{"grant_number":"308036","_id":"2532554C-B435-11E9-9278-68D0E5697425","call_identifier":"FP7","name":"Lifelong Learning of Visual Scene Understanding"}],"ec_funded":1,"publist_id":"7986","file_date_updated":"2020-07-14T12:47:40Z","author":[{"full_name":"Zimin, Alexander","last_name":"Zimin","first_name":"Alexander","id":"37099E9C-F248-11E8-B48F-1D18A9856A87"}],"date_created":"2018-12-11T11:44:27Z","date_updated":"2023-09-07T12:29:07Z","year":"2018","department":[{"_id":"ChLa"}],"publisher":"Institute of Science and Technology Austria","publication_status":"published","article_processing_charge":"No","has_accepted_license":"1","day":"01","date_published":"2018-09-01T00:00:00Z","citation":{"short":"A. 
Zimin, Learning from Dependent Data, Institute of Science and Technology Austria, 2018.","mla":"Zimin, Alexander. Learning from Dependent Data. Institute of Science and Technology Austria, 2018, doi:10.15479/AT:ISTA:TH1048.","chicago":"Zimin, Alexander. “Learning from Dependent Data.” Institute of Science and Technology Austria, 2018. https://doi.org/10.15479/AT:ISTA:TH1048.","ama":"Zimin A. Learning from dependent data. 2018. doi:10.15479/AT:ISTA:TH1048","apa":"Zimin, A. (2018). Learning from dependent data. Institute of Science and Technology Austria. https://doi.org/10.15479/AT:ISTA:TH1048","ieee":"A. Zimin, “Learning from dependent data,” Institute of Science and Technology Austria, 2018.","ista":"Zimin A. 2018. Learning from dependent data. Institute of Science and Technology Austria."},"page":"92","abstract":[{"text":"The most common assumption made in statistical learning theory is the assumption of the independent and identically distributed (i.i.d.) data. While being very convenient mathematically, it is often very clearly violated in practice. This disparity between the machine learning theory and applications underlies a growing demand in the development of algorithms that learn from dependent data and theory that can provide generalization guarantees similar to the independent situations. This thesis is dedicated to two variants of dependencies that can arise in practice. One is a dependence on the level of samples in a single learning task. Another dependency type arises in the multi-task setting when the tasks are dependent on each other even though the data for them can be i.i.d. In both cases we model the data (samples or tasks) as stochastic processes and introduce new algorithms for both settings that take into account and exploit the resulting dependencies. 
We prove theoretical guarantees on the performance of the introduced algorithms under different evaluation criteria and, in addition, we complement the theoretical study with an empirical one, where we evaluate some of the algorithms on two real-world datasets to highlight their practical applicability.","lang":"eng"}],"type":"dissertation","alternative_title":["ISTA Thesis"],"pubrep_id":"1048","file":[{"date_updated":"2020-07-14T12:47:40Z","date_created":"2019-04-09T07:32:47Z","checksum":"e849dd40a915e4d6c5572b51b517f098","relation":"main_file","file_id":"6253","content_type":"application/pdf","file_size":1036137,"creator":"dernst","file_name":"2018_Thesis_Zimin.pdf","access_level":"open_access"},{"date_created":"2019-04-09T07:32:47Z","date_updated":"2020-07-14T12:47:40Z","checksum":"da092153cec55c97461bd53c45c5d139","file_id":"6254","relation":"source_file","creator":"dernst","content_type":"application/zip","file_size":637490,"file_name":"2018_Thesis_Zimin_Source.zip","access_level":"closed"}],"oa_version":"Published Version","_id":"68","user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","status":"public","ddc":["004","519"],"title":"Learning from dependent data"},{"date_published":"2018-05-25T00:00:00Z","page":"113","citation":{"chicago":"Kolesnikov, Alexander. “Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images.” Institute of Science and Technology Austria, 2018. https://doi.org/10.15479/AT:ISTA:th_1021.","short":"A. Kolesnikov, Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images, Institute of Science and Technology Austria, 2018.","mla":"Kolesnikov, Alexander. Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images. Institute of Science and Technology Austria, 2018, doi:10.15479/AT:ISTA:th_1021.","apa":"Kolesnikov, A. (2018). Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images. Institute of Science and Technology Austria. 
https://doi.org/10.15479/AT:ISTA:th_1021","ieee":"A. Kolesnikov, “Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images,” Institute of Science and Technology Austria, 2018.","ista":"Kolesnikov A. 2018. Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images. Institute of Science and Technology Austria.","ama":"Kolesnikov A. Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images. 2018. doi:10.15479/AT:ISTA:th_1021"},"article_processing_charge":"No","has_accepted_license":"1","day":"25","oa_version":"Published Version","file":[{"creator":"system","file_size":12918758,"content_type":"application/pdf","access_level":"open_access","file_name":"IST-2018-1021-v1+1_thesis-unsigned-pdfa.pdf","checksum":"bc678e02468d8ebc39dc7267dfb0a1c4","date_created":"2018-12-12T10:14:57Z","date_updated":"2020-07-14T12:45:22Z","file_id":"5113","relation":"main_file"},{"file_id":"6225","relation":"source_file","checksum":"bc66973b086da5a043f1162dcfb1fde4","date_created":"2019-04-05T09:34:49Z","date_updated":"2020-07-14T12:45:22Z","access_level":"closed","file_name":"2018_Thesis_Kolesnikov_source.zip","creator":"dernst","file_size":55973760,"content_type":"application/zip"}],"pubrep_id":"1021","ddc":["004"],"status":"public","title":"Weakly-Supervised Segmentation and Unsupervised Modeling of Natural Images","user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","_id":"197","abstract":[{"lang":"eng","text":"Modern computer vision systems heavily rely on statistical machine learning models, which typically require large amounts of labeled data to be learned reliably. Moreover, very recently computer vision research has widely adopted techniques for representation learning, which further increase the demand for labeled data. However, for many important practical problems there is a relatively small amount of labeled data available, so it is problematic to leverage the full potential of the representation learning methods. 
One way to overcome this obstacle is to invest substantial resources into producing large labelled datasets. Unfortunately, this can be prohibitively expensive in practice. In this thesis we focus on the alternative way of tackling the aforementioned issue. We concentrate on methods that make use of weakly-labeled or even unlabeled data. Specifically, the first half of the thesis is dedicated to the semantic image segmentation task. We develop a technique that achieves competitive segmentation performance and only requires annotations in the form of global image-level labels instead of dense segmentation masks. Subsequently, we present a new methodology that further improves segmentation performance by leveraging tiny additional feedback from a human annotator. By using our methods, practitioners can greatly reduce the amount of data annotation effort required to learn modern image segmentation models. In the second half of the thesis we focus on methods for learning from unlabeled visual data. We study a family of autoregressive models for modeling the structure of natural images and discuss potential applications of these models. 
Moreover, we conduct an in-depth study of one of these applications, where we develop a state-of-the-art model for the probabilistic image colorization task."}],"alternative_title":["ISTA Thesis"],"type":"dissertation","language":[{"iso":"eng"}],"supervisor":[{"id":"40C20FD2-F248-11E8-B48F-1D18A9856A87","orcid":"0000-0001-8622-7887","first_name":"Christoph","last_name":"Lampert","full_name":"Lampert, Christoph"}],"degree_awarded":"PhD","doi":"10.15479/AT:ISTA:th_1021","project":[{"_id":"2532554C-B435-11E9-9278-68D0E5697425","grant_number":"308036","name":"Lifelong Learning of Visual Scene Understanding","call_identifier":"FP7"}],"oa":1,"publication_identifier":{"issn":["2663-337X"]},"month":"05","date_created":"2018-12-11T11:45:09Z","date_updated":"2023-09-07T12:51:46Z","author":[{"full_name":"Kolesnikov, Alexander","last_name":"Kolesnikov","first_name":"Alexander","id":"2D157DB6-F248-11E8-B48F-1D18A9856A87"}],"publisher":"Institute of Science and Technology Austria","department":[{"_id":"ChLa"}],"publication_status":"published","year":"2018","acknowledgement":"I also gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPUs used for this research.","ec_funded":1,"publist_id":"7718","file_date_updated":"2020-07-14T12:45:22Z"},{"file":[{"content_type":"application/pdf","file_size":2140062,"creator":"system","file_name":"IST-2017-776-v1+1_Pentina_Thesis_2016.pdf","access_level":"open_access","date_updated":"2018-12-12T10:14:07Z","date_created":"2018-12-12T10:14:07Z","relation":"main_file","file_id":"5056"}],"oa_version":"Published Version","pubrep_id":"776","ddc":["006"],"title":"Theoretical foundations of multi-task lifelong learning","status":"public","user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","_id":"1126","abstract":[{"text":"Traditionally, machine learning has focused on the problem of solving a single\r\ntask in isolation. 
While being quite well understood, this approach disregards an\r\nimportant aspect of human learning: when facing a new problem, humans are able to\r\nexploit knowledge acquired from previously learned tasks. Intuitively, access to several\r\nproblems simultaneously or sequentially could also be advantageous for a machine\r\nlearning system, especially if these tasks are closely related. Indeed, results of many\r\nempirical studies have provided justification for this intuition. However, theoretical\r\njustifications of this idea are rather limited.\r\nThe focus of this thesis is to expand the understanding of potential benefits of information\r\ntransfer between several related learning problems. We provide theoretical\r\nanalysis for three scenarios of multi-task learning - multiple kernel learning, sequential\r\nlearning and active task selection. We also provide a PAC-Bayesian perspective on\r\nlifelong learning and investigate how the task generation process influences the generalization\r\nguarantees in this scenario. In addition, we show how some of the obtained\r\ntheoretical results can be used to derive principled multi-task and lifelong learning\r\nalgorithms and illustrate their performance on various synthetic and real-world datasets.","lang":"eng"}],"alternative_title":["ISTA Thesis"],"type":"dissertation","date_published":"2016-11-01T00:00:00Z","page":"127","citation":{"mla":"Pentina, Anastasia. Theoretical Foundations of Multi-Task Lifelong Learning. Institute of Science and Technology Austria, 2016, doi:10.15479/AT:ISTA:TH_776.","short":"A. Pentina, Theoretical Foundations of Multi-Task Lifelong Learning, Institute of Science and Technology Austria, 2016.","chicago":"Pentina, Anastasia. “Theoretical Foundations of Multi-Task Lifelong Learning.” Institute of Science and Technology Austria, 2016. https://doi.org/10.15479/AT:ISTA:TH_776.","ama":"Pentina A. Theoretical foundations of multi-task lifelong learning. 2016. 
doi:10.15479/AT:ISTA:TH_776","ista":"Pentina A. 2016. Theoretical foundations of multi-task lifelong learning. Institute of Science and Technology Austria.","ieee":"A. Pentina, “Theoretical foundations of multi-task lifelong learning,” Institute of Science and Technology Austria, 2016.","apa":"Pentina, A. (2016). Theoretical foundations of multi-task lifelong learning. Institute of Science and Technology Austria. https://doi.org/10.15479/AT:ISTA:TH_776"},"has_accepted_license":"1","article_processing_charge":"No","day":"01","date_created":"2018-12-11T11:50:17Z","date_updated":"2023-09-07T11:52:03Z","author":[{"first_name":"Anastasia","last_name":"Pentina","id":"42E87FC6-F248-11E8-B48F-1D18A9856A87","full_name":"Pentina, Anastasia"}],"department":[{"_id":"ChLa"}],"publisher":"Institute of Science and Technology Austria","publication_status":"published","acknowledgement":"First and foremost I would like to express my gratitude to my supervisor, Christoph\r\nLampert. Thank you for your patience in teaching me all aspects of doing research\r\n(including English grammar), for your trust in my capabilities and endless support. Thank\r\nyou for granting me freedom in my research and, at the same time, having time and\r\nhelping me cope with the consequences whenever I needed it. Thank you for creating\r\nan excellent atmosphere in the group, it was a great pleasure and honor to be a part of\r\nit. There could not have been a better and more inspiring adviser and mentor.\r\nI thank Shai Ben-David for welcoming me into his group at the University of Waterloo,\r\nfor inspiring discussions and support. It was a great pleasure to work together. 
I am\r\nalso thankful to Ruth Urner for hosting me at the Max-Planck Institute Tübingen, for the\r\nfruitful collaboration and for taking care of me during that not-so-sunny month of May.\r\nI thank Jan Maas for kindly joining my thesis committee despite the short notice and\r\nproviding me with insightful comments.\r\nI would like to thank my colleagues for their support, entertaining conversations and\r\nendless table soccer games we shared together: Georg, Jan, Amelie and Emilie, Michal\r\nand Alex, Alex K. and Alex Z., Thomas, Sameh, Vlad, Mayu, Nathaniel, Silvester, Neel,\r\nCsaba, Vladimir, Morten. Thank you, Mabel and Ram, for the wonderful time we spent\r\ntogether. I am thankful to Shrinu and Samira for taking care of me during my stay at the\r\nUniversity of Waterloo. Special thanks to Viktoriia for her never-ending optimism and for\r\nbeing so inspiring and supportive, especially at the beginning of my PhD journey.\r\nThanks to IST administration, in particular, Vlad and Elisabeth for shielding me from\r\nmost of the bureaucratic paperwork.\r\n\r\nThis dissertation would not have been possible without funding from the European\r\nResearch Council under the European Union's Seventh Framework Programme\r\n(FP7/2007-2013)/ERC grant agreement no 308036.","year":"2016","ec_funded":1,"publist_id":"6234","file_date_updated":"2018-12-12T10:14:07Z","language":[{"iso":"eng"}],"supervisor":[{"id":"40C20FD2-F248-11E8-B48F-1D18A9856A87","orcid":"0000-0001-8622-7887","first_name":"Christoph","last_name":"Lampert","full_name":"Lampert, Christoph"}],"degree_awarded":"PhD","doi":"10.15479/AT:ISTA:TH_776","project":[{"grant_number":"308036","_id":"2532554C-B435-11E9-9278-68D0E5697425","name":"Lifelong Learning of Visual Scene Understanding","call_identifier":"FP7"}],"oa":1,"publication_identifier":{"issn":["2663-337X"]},"month":"11"},{"oa":1,"main_file_link":[{"url":"http://users.sussex.ac.uk/~nq28/viktoriia/Thesis_Sharmanska.pdf"}],"supervisor":[{"full_name":"Lampert, 
Christoph","orcid":"0000-0001-8622-7887","id":"40C20FD2-F248-11E8-B48F-1D18A9856A87","last_name":"Lampert","first_name":"Christoph"}],"degree_awarded":"PhD","language":[{"iso":"eng"}],"doi":"10.15479/at:ista:1401","month":"04","publication_identifier":{"issn":["2663-337X"]},"publication_status":"published","department":[{"_id":"ChLa"},{"_id":"GradSch"}],"publisher":"Institute of Science and Technology Austria","acknowledgement":"I would like to thank my supervisor, Christoph Lampert, for guidance throughout my studies and for patience in transforming me into a scientist, and my thesis committee, Chris Wojtan and Horst Bischof, for their help and advice. \r\n\r\nI would like to thank Elisabeth Hacker who perfectly assisted all my administrative needs and was always nice and friendly to me, and the campus team for making the IST Austria campus my second home. \r\nI was honored to collaborate with brilliant researchers and to learn from their experience. Undoubtedly, I learned most of all from Novi Quadrianto: brainstorming our projects and getting exciting results was the most enjoyable part of my work – thank you! I am also grateful to David Knowles, Zoubin Ghahramani, Daniel Hernández-Lobato, Kristian Kersting and Anastasia Pentina for the fantastic projects we worked on together, and to Kristen Grauman and Adriana Kovashka for the exceptional experience working with user studies. I would like to thank my colleagues at IST Austria and my office mates who shared their happy moods, scientific breakthroughs and thought-provoking conversations with me: Chao, Filip, Rustem, Asya, Sameh, Alex, Vlad, Mayu, Neel, Csaba, Thomas, Vladimir, Cristina, Alex Z., Avro, Amelie and Emilie, Andreas H. and Andreas E., Chris, Lena, Michael, Ali and Ipek, Vera, Igor, Katia. 
Special thanks to Morten for the countless games of table soccer we played together and the tournaments we teamed up for: we will definitely win next time:) A very warm hug to Asya for always being so inspiring and supportive to me, and for helping me to increase the proportion of female computer scientists in our group. ","year":"2015","date_created":"2018-12-11T11:51:48Z","date_updated":"2023-09-07T11:40:11Z","author":[{"last_name":"Sharmanska","first_name":"Viktoriia","orcid":"0000-0003-0192-9308","id":"2EA6D09E-F248-11E8-B48F-1D18A9856A87","full_name":"Sharmanska, Viktoriia"}],"file_date_updated":"2021-11-17T13:47:24Z","publist_id":"5806","page":"144","citation":{"ieee":"V. Sharmanska, “Learning with attributes for object recognition: Parametric and non-parametrics views,” Institute of Science and Technology Austria, 2015.","apa":"Sharmanska, V. (2015). Learning with attributes for object recognition: Parametric and non-parametrics views. Institute of Science and Technology Austria. https://doi.org/10.15479/at:ista:1401","ista":"Sharmanska V. 2015. Learning with attributes for object recognition: Parametric and non-parametrics views. Institute of Science and Technology Austria.","ama":"Sharmanska V. Learning with attributes for object recognition: Parametric and non-parametrics views. 2015. doi:10.15479/at:ista:1401","chicago":"Sharmanska, Viktoriia. “Learning with Attributes for Object Recognition: Parametric and Non-Parametrics Views.” Institute of Science and Technology Austria, 2015. https://doi.org/10.15479/at:ista:1401.","short":"V. Sharmanska, Learning with Attributes for Object Recognition: Parametric and Non-Parametrics Views, Institute of Science and Technology Austria, 2015.","mla":"Sharmanska, Viktoriia. Learning with Attributes for Object Recognition: Parametric and Non-Parametrics Views. 
Institute of Science and Technology Austria, 2015, doi:10.15479/at:ista:1401."},"date_published":"2015-04-01T00:00:00Z","day":"01","has_accepted_license":"1","article_processing_charge":"No","title":"Learning with attributes for object recognition: Parametric and non-parametrics views","status":"public","ddc":["000"],"user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","_id":"1401","file":[{"creator":"dernst","content_type":"application/pdf","file_size":7964342,"file_name":"2015_Thesis_Sharmanska.pdf","access_level":"open_access","date_updated":"2021-02-22T11:33:17Z","date_created":"2021-02-22T11:33:17Z","success":1,"checksum":"3605b402bb6934e09ae4cf672c84baf7","file_id":"9177","relation":"main_file"},{"file_name":"2015_Thesis_Sharmanska_pdfa.pdf","access_level":"closed","file_size":7372241,"content_type":"application/pdf","creator":"cchlebak","relation":"main_file","file_id":"10297","date_created":"2021-11-16T14:40:45Z","date_updated":"2021-11-17T13:47:24Z","checksum":"e37593b3ee75bf3180629df2d6ca8f4e"}],"oa_version":"Published Version","alternative_title":["ISTA Thesis"],"type":"dissertation","abstract":[{"text":"The human ability to recognize objects in complex scenes has driven research in the computer vision field over a couple of decades. This thesis focuses on the object recognition task in images. That is, given an image, we want the computer system to be able to predict the class of the object that appears in it. A recent successful attempt to bridge semantic understanding of the image perceived by humans and by computers uses attribute-based models. Attributes are semantic properties of the objects shared across different categories, which humans and computers can decide on. 
To explore attribute-based models, we take a statistical machine learning approach and address two key learning challenges in view of the object recognition task: learning augmented attributes as a mid-level discriminative feature representation, and learning with attributes as privileged information. Our main contributions are parametric and non-parametric models and algorithms to solve these frameworks. In the parametric approach, we explore an autoencoder model combined with the large margin nearest neighbor principle for mid-level feature learning, and linear support vector machines for learning with privileged information. In the non-parametric approach, we propose a supervised Indian Buffet Process for automatic augmentation of semantic attributes, and explore the Gaussian Processes classification framework for learning with privileged information. A thorough experimental analysis shows the effectiveness of the proposed models in both the parametric and non-parametric views.","lang":"eng"}]}]