--- _id: '14446' abstract: - lang: eng text: Recent work has paid close attention to the first principle of Granger causality, according to which cause precedes effect. In this context, the question may arise whether the detected direction of causality also reverses after the time reversal of unidirectionally coupled data. Recently, it has been shown that for unidirectionally causally connected autoregressive (AR) processes X → Y, after time reversal of data, the opposite causal direction Y → X is indeed detected, although typically as part of the bidirectional X ↔ Y link. As we argue here, the answer is different when the measured data are not from AR processes but from linked deterministic systems. When the goal is the usual forward data analysis, cross-mapping-like approaches correctly detect X → Y, while Granger causality-like approaches, which should not be used for deterministic time series, detect causal independence of X and Y. The results of backward causal analysis depend on the predictability of the reversed data. Unlike AR processes, observables from deterministic dynamical systems, even complex nonlinear ones, can be predicted well forward, while backward predictions can be difficult (notably when the time reversal of a function leads to one-to-many relations). To address this problem, we propose an approach based on models that provide multiple candidate predictions for the target, combined with a loss function that considers only the best candidate. The resulting good forward and backward predictability supports the view that for unidirectionally causally linked deterministic dynamical systems X → Y, the same link can be expected to be detected both before and after time reversal. acknowledgement: The work was supported by the Scientific Grant Agency of the Ministry of Education of the Slovak Republic and the Slovak Academy of Sciences, projects APVV-21-0216, VEGA 2-0096-21 and VEGA 2-0023-22. article_processing_charge: Yes article_type: original author: - first_name: Jozef full_name: Jakubík, Jozef last_name: Jakubík - first_name: Phuong full_name: Bui Thi Mai, Phuong id: 3EC6EE64-F248-11E8-B48F-1D18A9856A87 last_name: Bui Thi Mai - first_name: Martina full_name: Chvosteková, Martina last_name: Chvosteková - first_name: Anna full_name: Krakovská, Anna last_name: Krakovská citation: ama: Jakubík J, Phuong M, Chvosteková M, Krakovská A. Against the flow of time with multi-output models. Measurement Science Review. 2023;23(4):175-183. doi:10.2478/msr-2023-0023 apa: Jakubík, J., Phuong, M., Chvosteková, M., & Krakovská, A. (2023). Against the flow of time with multi-output models. Measurement Science Review. Sciendo. https://doi.org/10.2478/msr-2023-0023 chicago: Jakubík, Jozef, Mary Phuong, Martina Chvosteková, and Anna Krakovská. “Against the Flow of Time with Multi-Output Models.” Measurement Science Review. Sciendo, 2023. https://doi.org/10.2478/msr-2023-0023. ieee: J. Jakubík, M. Phuong, M. Chvosteková, and A. Krakovská, “Against the flow of time with multi-output models,” Measurement Science Review, vol. 23, no. 4. Sciendo, pp. 175–183, 2023. ista: Jakubík J, Phuong M, Chvosteková M, Krakovská A. 2023. Against the flow of time with multi-output models. Measurement Science Review. 23(4), 175–183. mla: Jakubík, Jozef, et al. “Against the Flow of Time with Multi-Output Models.” Measurement Science Review, vol. 23, no. 4, Sciendo, 2023, pp. 175–83, doi:10.2478/msr-2023-0023. short: J. Jakubík, M. Phuong, M. Chvosteková, A. Krakovská, Measurement Science Review 23 (2023) 175–183.
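As a reading aid for the abstract above: a minimal sketch, assuming PyTorch, of a multi-output model whose loss scores only the best of k candidate predictions. The class name MultiOutputNet, the candidate count k, and the tiny architecture are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a "best-candidate" multi-output loss (illustrative only).
import torch
import torch.nn as nn

class MultiOutputNet(nn.Module):
    """Emits k candidate predictions per input (all names hypothetical)."""
    def __init__(self, dim_in: int, k: int = 4, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim_in, hidden), nn.Tanh())
        self.heads = nn.Linear(hidden, k)   # k scalar candidates

    def forward(self, x):                   # x: (batch, dim_in)
        return self.heads(self.body(x))     # (batch, k)

def best_candidate_loss(candidates: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Squared error of every candidate, but only the closest one is penalised,
    # so one-to-many targets (as in time-reversed deterministic data) stay learnable.
    errors = (candidates - target.unsqueeze(1)) ** 2   # (batch, k)
    return errors.min(dim=1).values.mean()
```

Minimising over candidates lets each head specialise on one branch of a one-to-many inverse map, which is the mechanism the abstract proposes for restoring backward predictability.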
date_created: 2023-10-22T22:01:15Z date_published: 2023-08-01T00:00:00Z date_updated: 2023-10-31T12:12:47Z day: '01' ddc: - '510' department: - _id: ChLa doi: 10.2478/msr-2023-0023 file: - access_level: open_access checksum: b069cc10fa6a7c96b2bc9f728165f9e6 content_type: application/pdf creator: dernst date_created: 2023-10-31T12:07:23Z date_updated: 2023-10-31T12:07:23Z file_id: '14476' file_name: 2023_MeasurementScienceRev_Jakubik.pdf file_size: 2639783 relation: main_file success: 1 file_date_updated: 2023-10-31T12:07:23Z has_accepted_license: '1' intvolume: ' 23' issue: '4' language: - iso: eng license: https://creativecommons.org/licenses/by-nc-nd/4.0/ month: '08' oa: 1 oa_version: Published Version page: 175-183 publication: Measurement Science Review publication_identifier: eissn: - 1335-8871 publication_status: published publisher: Sciendo quality_controlled: '1' scopus_import: '1' status: public title: Against the flow of time with multi-output models tmp: image: /images/cc_by_nc_nd.png legal_code_url: https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode name: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) short: CC BY-NC-ND (4.0) type: journal_article user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 23 year: '2023' ... --- _id: '9416' abstract: - lang: eng text: 'We study the inductive bias of two-layer ReLU networks trained by gradient flow. We identify a class of easy-to-learn (''orthogonally separable'') datasets, and characterise the solution that ReLU networks trained on such datasets converge to. Irrespective of network width, the solution turns out to be a combination of two max-margin classifiers: one corresponding to the positive data subset and one corresponding to the negative data subset. The proof is based on the recently introduced concept of extremal sectors, for which we prove a number of properties in the context of orthogonal separability. In particular, we prove stationarity of activation patterns from some time onwards, which enables a reduction of the ReLU network to an ensemble of linear subnetworks.' article_processing_charge: No author: - first_name: Phuong full_name: Bui Thi Mai, Phuong id: 3EC6EE64-F248-11E8-B48F-1D18A9856A87 last_name: Bui Thi Mai - first_name: Christoph full_name: Lampert, Christoph id: 40C20FD2-F248-11E8-B48F-1D18A9856A87 last_name: Lampert orcid: 0000-0001-8622-7887 citation: ama: 'Phuong M, Lampert C. The inductive bias of ReLU networks on orthogonally separable data. In: 9th International Conference on Learning Representations. ; 2021.' apa: Phuong, M., & Lampert, C. (2021). The inductive bias of ReLU networks on orthogonally separable data. In 9th International Conference on Learning Representations. Virtual. chicago: Phuong, Mary, and Christoph Lampert. “The Inductive Bias of ReLU Networks on Orthogonally Separable Data.” In 9th International Conference on Learning Representations, 2021. ieee: M. Phuong and C. Lampert, “The inductive bias of ReLU networks on orthogonally separable data,” in 9th International Conference on Learning Representations, Virtual, 2021. ista: 'Phuong M, Lampert C. 2021. The inductive bias of ReLU networks on orthogonally separable data. 9th International Conference on Learning Representations. ICLR: International Conference on Learning Representations.' mla: Phuong, Mary, and Christoph Lampert. “The Inductive Bias of ReLU Networks on Orthogonally Separable Data.” 9th International Conference on Learning Representations, 2021. short: M. Phuong, C.
Lampert, in:, 9th International Conference on Learning Representations, 2021. conference: end_date: 2021-05-07 location: Virtual name: ' ICLR: International Conference on Learning Representations' start_date: 2021-05-03 date_created: 2021-05-24T11:16:46Z date_published: 2021-05-01T00:00:00Z date_updated: 2023-09-07T13:29:50Z day: '01' ddc: - '000' department: - _id: GradSch - _id: ChLa file: - access_level: open_access checksum: f34ff17017527db5ba6927f817bdd125 content_type: application/pdf creator: bphuong date_created: 2021-05-24T11:15:57Z date_updated: 2021-05-24T11:15:57Z file_id: '9417' file_name: iclr2021_conference.pdf file_size: 502356 relation: main_file file_date_updated: 2021-05-24T11:15:57Z has_accepted_license: '1' language: - iso: eng main_file_link: - open_access: '1' url: https://openreview.net/pdf?id=krz7T0xU9Z_ month: '05' oa: 1 oa_version: Published Version publication: 9th International Conference on Learning Representations publication_status: published quality_controlled: '1' related_material: record: - id: '9418' relation: dissertation_contains status: public scopus_import: '1' status: public title: The inductive bias of ReLU networks on orthogonally separable data type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2021' ... --- _id: '9418' abstract: - lang: eng text: "Deep learning is best known for its empirical success across a wide range of applications\r\nspanning computer vision, natural language processing and speech. Of equal significance,\r\nthough perhaps less known, are its ramifications for learning theory: deep networks have\r\nbeen observed to perform surprisingly well in the high-capacity regime, aka the overfitting\r\nor underspecified regime. Classically, this regime on the far right of the bias-variance curve\r\nis associated with poor generalisation; however, recent experiments with deep networks\r\nchallenge this view.\r\n\r\nThis thesis is devoted to investigating various aspects of underspecification in deep learning.\r\nFirst, we argue that deep learning models are underspecified on two levels: a) any given\r\ntraining dataset can be fit by many different functions, and b) any given function can be\r\nexpressed by many different parameter configurations. We refer to the second kind of\r\nunderspecification as parameterisation redundancy and we precisely characterise its extent.\r\nSecond, we characterise the implicit criteria (the inductive bias) that guide learning in the\r\nunderspecified regime. Specifically, we consider a nonlinear but tractable classification\r\nsetting, and show that given the choice, neural networks learn classifiers with a large margin.\r\nThird, we consider learning scenarios where the inductive bias is not by itself sufficient to\r\ndeal with underspecification. We then study different ways of ‘tightening the specification’: i)\r\nIn the setting of representation learning with variational autoencoders, we propose a hand-\r\ncrafted regulariser based on mutual information. ii) In the setting of binary classification, we\r\nconsider soft-label (real-valued) supervision. We derive a generalisation bound for linear\r\nnetworks supervised in this way and verify that soft labels facilitate fast learning. Finally, we\r\nexplore an application of soft-label supervision to the training of multi-exit models." 
acknowledged_ssus: - _id: ScienComp - _id: CampIT - _id: E-Lib alternative_title: - ISTA Thesis article_processing_charge: No author: - first_name: Phuong full_name: Bui Thi Mai, Phuong id: 3EC6EE64-F248-11E8-B48F-1D18A9856A87 last_name: Bui Thi Mai citation: ama: Phuong M. Underspecification in deep learning. 2021. doi:10.15479/AT:ISTA:9418 apa: Phuong, M. (2021). Underspecification in deep learning. Institute of Science and Technology Austria. https://doi.org/10.15479/AT:ISTA:9418 chicago: Phuong, Mary. “Underspecification in Deep Learning.” Institute of Science and Technology Austria, 2021. https://doi.org/10.15479/AT:ISTA:9418. ieee: M. Phuong, “Underspecification in deep learning,” Institute of Science and Technology Austria, 2021. ista: Phuong M. 2021. Underspecification in deep learning. Institute of Science and Technology Austria. mla: Phuong, Mary. Underspecification in Deep Learning. Institute of Science and Technology Austria, 2021, doi:10.15479/AT:ISTA:9418. short: M. Phuong, Underspecification in Deep Learning, Institute of Science and Technology Austria, 2021. date_created: 2021-05-24T13:06:23Z date_published: 2021-05-30T00:00:00Z date_updated: 2023-09-08T11:11:12Z day: '30' ddc: - '000' degree_awarded: PhD department: - _id: GradSch - _id: ChLa doi: 10.15479/AT:ISTA:9418 file: - access_level: open_access checksum: 4f0abe64114cfed264f9d36e8d1197e3 content_type: application/pdf creator: bphuong date_created: 2021-05-24T11:22:29Z date_updated: 2021-05-24T11:22:29Z file_id: '9419' file_name: mph-thesis-v519-pdfimages.pdf file_size: 2673905 relation: main_file success: 1 - access_level: closed checksum: f5699e876bc770a9b0df8345a77720a2 content_type: application/zip creator: bphuong date_created: 2021-05-24T11:56:02Z date_updated: 2021-05-24T11:56:02Z file_id: '9420' file_name: thesis.zip file_size: 92995100 relation: source_file file_date_updated: 2021-05-24T11:56:02Z has_accepted_license: '1' language: - iso: eng month: '05' oa: 1 oa_version: Published Version page: '125' publication_identifier: issn: - 2663-337X publication_status: published publisher: Institute of Science and Technology Austria related_material: record: - id: '7435' relation: part_of_dissertation status: deleted - id: '7481' relation: part_of_dissertation status: public - id: '9416' relation: part_of_dissertation status: public - id: '7479' relation: part_of_dissertation status: public status: public supervisor: - first_name: Christoph full_name: Lampert, Christoph id: 40C20FD2-F248-11E8-B48F-1D18A9856A87 last_name: Lampert orcid: 0000-0001-8622-7887 title: Underspecification in deep learning type: dissertation user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1 year: '2021' ... --- _id: '7481' abstract: - lang: eng text: 'We address the following question: How redundant is the parameterisation of ReLU networks? Specifically, we consider transformations of the weight space which leave the function implemented by the network intact. Two such transformations are known for feed-forward architectures: permutation of neurons within a layer, and positive scaling of all incoming weights of a neuron coupled with inverse scaling of its outgoing weights. In this work, we show for architectures with non-increasing widths that permutation and scaling are in fact the only function-preserving weight transformations. 
For any eligible architecture we give an explicit construction of a neural network such that any other network that implements the same function can be obtained from the original one by the application of permutations and rescaling. The proof relies on a geometric understanding of boundaries between linear regions of ReLU networks, and we hope the developed mathematical tools are of independent interest.' article_processing_charge: No author: - first_name: Phuong full_name: Bui Thi Mai, Phuong id: 3EC6EE64-F248-11E8-B48F-1D18A9856A87 last_name: Bui Thi Mai - first_name: Christoph full_name: Lampert, Christoph id: 40C20FD2-F248-11E8-B48F-1D18A9856A87 last_name: Lampert orcid: 0000-0001-8622-7887 citation: ama: 'Phuong M, Lampert C. Functional vs. parametric equivalence of ReLU networks. In: 8th International Conference on Learning Representations. ; 2020.' apa: Phuong, M., & Lampert, C. (2020). Functional vs. parametric equivalence of ReLU networks. In 8th International Conference on Learning Representations. Online. chicago: Phuong, Mary, and Christoph Lampert. “Functional vs. Parametric Equivalence of ReLU Networks.” In 8th International Conference on Learning Representations, 2020. ieee: M. Phuong and C. Lampert, “Functional vs. parametric equivalence of ReLU networks,” in 8th International Conference on Learning Representations, Online, 2020. ista: 'Phuong M, Lampert C. 2020. Functional vs. parametric equivalence of ReLU networks. 8th International Conference on Learning Representations. ICLR: International Conference on Learning Representations.' mla: Phuong, Mary, and Christoph Lampert. “Functional vs. Parametric Equivalence of ReLU Networks.” 8th International Conference on Learning Representations, 2020. short: M. Phuong, C. Lampert, in:, 8th International Conference on Learning Representations, 2020. conference: end_date: 2020-04-30 location: Online name: 'ICLR: International Conference on Learning Representations' start_date: 2020-04-27 date_created: 2020-02-11T09:07:37Z date_published: 2020-04-26T00:00:00Z date_updated: 2023-09-07T13:29:50Z day: '26' ddc: - '000' department: - _id: ChLa file: - access_level: open_access checksum: 8d372ea5defd8cb8fdc430111ed754a9 content_type: application/pdf creator: bphuong date_created: 2020-02-11T09:07:27Z date_updated: 2020-07-14T12:47:59Z file_id: '7482' file_name: main.pdf file_size: 405469 relation: main_file file_date_updated: 2020-07-14T12:47:59Z has_accepted_license: '1' language: - iso: eng month: '04' oa: 1 oa_version: Published Version publication: 8th International Conference on Learning Representations publication_status: published quality_controlled: '1' related_material: link: - relation: supplementary_material url: https://iclr.cc/virtual_2020/poster_Bylx-TNKvH.html record: - id: '9418' relation: dissertation_contains status: public status: public title: Functional vs. parametric equivalence of ReLU networks type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 year: '2020' ... --- _id: '7479' abstract: - lang: eng text: "Multi-exit architectures, in which a stack of processing layers is interleaved with early output layers, allow the processing of a test example to stop early and thus save computation time and/or energy. In this work, we propose a new training procedure for multi-exit architectures based on the principle of knowledge distillation. 
The method encourages early exits to mimic later, more accurate exits, by matching their output probabilities.\r\nExperiments on CIFAR100 and ImageNet show that distillation-based training significantly improves the accuracy of early exits while maintaining state-of-the-art accuracy for late ones. The method is particularly beneficial when training data is limited and it allows a straightforward extension to semi-supervised learning, i.e., making use of unlabeled data at training time. Moreover, it takes only a few lines to implement and incurs almost no computational overhead at training time, and none at all at test time." article_processing_charge: No author: - first_name: Phuong full_name: Bui Thi Mai, Phuong id: 3EC6EE64-F248-11E8-B48F-1D18A9856A87 last_name: Bui Thi Mai - first_name: Christoph full_name: Lampert, Christoph id: 40C20FD2-F248-11E8-B48F-1D18A9856A87 last_name: Lampert orcid: 0000-0001-8622-7887 citation: ama: 'Phuong M, Lampert C. Distillation-based training for multi-exit architectures. In: IEEE International Conference on Computer Vision. Vol 2019-October. IEEE; 2019:1355-1364. doi:10.1109/ICCV.2019.00144' apa: 'Phuong, M., & Lampert, C. (2019). Distillation-based training for multi-exit architectures. In IEEE International Conference on Computer Vision (Vol. 2019–October, pp. 1355–1364). Seoul, Korea: IEEE. https://doi.org/10.1109/ICCV.2019.00144' chicago: Phuong, Mary, and Christoph Lampert. “Distillation-Based Training for Multi-Exit Architectures.” In IEEE International Conference on Computer Vision, 2019–October:1355–64. IEEE, 2019. https://doi.org/10.1109/ICCV.2019.00144. ieee: M. Phuong and C. Lampert, “Distillation-based training for multi-exit architectures,” in IEEE International Conference on Computer Vision, Seoul, Korea, 2019, vol. 2019–October, pp. 1355–1364. ista: 'Phuong M, Lampert C. 2019. Distillation-based training for multi-exit architectures. IEEE International Conference on Computer Vision. ICCV: International Conference on Computer Vision vol. 2019–October, 1355–1364.' mla: Phuong, Mary, and Christoph Lampert. “Distillation-Based Training for Multi-Exit Architectures.” IEEE International Conference on Computer Vision, vol. 2019–October, IEEE, 2019, pp. 1355–64, doi:10.1109/ICCV.2019.00144. short: M. Phuong, C. Lampert, in:, IEEE International Conference on Computer Vision, IEEE, 2019, pp. 1355–1364.
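The training procedure summarised in this record, early exits matching the output probabilities of later exits, can be illustrated with a standard distillation objective. A minimal sketch, assuming PyTorch, using the last exit as the teacher; the temperature T and mixing weight alpha are assumed hyperparameters, not the paper's settings:

```python
# Hedged sketch of distillation-based training for a multi-exit model.
import torch.nn.functional as F

def multi_exit_distillation_loss(exit_logits, labels, T=2.0, alpha=0.5):
    """exit_logits: list of (batch, num_classes) tensors, ordered early -> late."""
    teacher = exit_logits[-1].detach()               # deepest exit acts as teacher
    loss = F.cross_entropy(exit_logits[-1], labels)  # last exit: labels only
    for logits in exit_logits[:-1]:                  # earlier exits: labels + teacher
        ce = F.cross_entropy(logits, labels)
        kd = F.kl_div(F.log_softmax(logits / T, dim=1),
                      F.softmax(teacher / T, dim=1),
                      reduction="batchmean") * (T * T)
        loss = loss + alpha * ce + (1.0 - alpha) * kd
    return loss
```

Because only a loss term is added, this matches the record's claim of a few-line implementation with no test-time overhead.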
conference: end_date: 2019-11-02 location: Seoul, Korea name: 'ICCV: International Conference on Computer Vision' start_date: 2019-10-27 date_created: 2020-02-11T09:06:57Z date_published: 2019-10-01T00:00:00Z date_updated: 2023-09-08T11:11:12Z day: '01' ddc: - '000' department: - _id: ChLa doi: 10.1109/ICCV.2019.00144 ec_funded: 1 external_id: isi: - '000531438101047' file: - access_level: open_access checksum: 7b77fb5c2d27c4c37a7612ba46a66117 content_type: application/pdf creator: bphuong date_created: 2020-02-11T09:06:39Z date_updated: 2020-07-14T12:47:59Z file_id: '7480' file_name: main.pdf file_size: 735768 relation: main_file file_date_updated: 2020-07-14T12:47:59Z has_accepted_license: '1' isi: 1 language: - iso: eng month: '10' oa: 1 oa_version: Submitted Version page: 1355-1364 project: - _id: 2532554C-B435-11E9-9278-68D0E5697425 call_identifier: FP7 grant_number: '308036' name: Lifelong Learning of Visual Scene Understanding publication: IEEE International Conference on Computer Vision publication_identifier: isbn: - '9781728148038' issn: - '15505499' publication_status: published publisher: IEEE quality_controlled: '1' related_material: record: - id: '9418' relation: dissertation_contains status: public scopus_import: '1' status: public title: Distillation-based training for multi-exit architectures type: conference user_id: c635000d-4b10-11ee-a964-aac5a93f6ac1 volume: 2019-October year: '2019' ... --- _id: '6569' abstract: - lang: eng text: 'Knowledge distillation, i.e. one classifier being trained on the outputs of another classifier, is an empirically very successful technique for knowledge transfer between classifiers. It has even been observed that classifiers learn much faster and more reliably if trained with the outputs of another classifier as soft labels, instead of from ground truth data. So far, however, there is no satisfactory theoretical explanation of this phenomenon. In this work, we provide the first insights into the working mechanisms of distillation by studying the special case of linear and deep linear classifiers. Specifically, we prove a generalization bound that establishes fast convergence of the expected risk of a distillation-trained linear classifier. From the bound and its proof we extract three key factors that determine the success of distillation: data geometry – geometric properties of the data distribution, in particular class separation, has an immediate influence on the convergence speed of the risk; optimization bias – gradient descent optimization finds a very favorable minimum of the distillation objective; and strong monotonicity – the expected risk of the student classifier always decreases when the size of the training set grows.' article_processing_charge: No author: - first_name: Phuong full_name: Bui Thi Mai, Phuong id: 3EC6EE64-F248-11E8-B48F-1D18A9856A87 last_name: Bui Thi Mai - first_name: Christoph full_name: Lampert, Christoph id: 40C20FD2-F248-11E8-B48F-1D18A9856A87 last_name: Lampert orcid: 0000-0001-8622-7887 citation: ama: 'Phuong M, Lampert C. Towards understanding knowledge distillation. In: Proceedings of the 36th International Conference on Machine Learning. Vol 97. ML Research Press; 2019:5142-5151.' apa: 'Phuong, M., & Lampert, C. (2019). Towards understanding knowledge distillation. In Proceedings of the 36th International Conference on Machine Learning (Vol. 97, pp. 5142–5151). Long Beach, CA, United States: ML Research Press.' chicago: Phuong, Mary, and Christoph Lampert.
“Towards Understanding Knowledge Distillation.” In Proceedings of the 36th International Conference on Machine Learning, 97:5142–51. ML Research Press, 2019. ieee: M. Phuong and C. Lampert, “Towards understanding knowledge distillation,” in Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, United States, 2019, vol. 97, pp. 5142–5151. ista: 'Phuong M, Lampert C. 2019. Towards understanding knowledge distillation. Proceedings of the 36th International Conference on Machine Learning. ICML: International Conference on Machine Learning vol. 97, 5142–5151.' mla: Phuong, Mary, and Christoph Lampert. “Towards Understanding Knowledge Distillation.” Proceedings of the 36th International Conference on Machine Learning, vol. 97, ML Research Press, 2019, pp. 5142–51. short: M. Phuong, C. Lampert, in:, Proceedings of the 36th International Conference on Machine Learning, ML Research Press, 2019, pp. 5142–5151. conference: end_date: 2019-06-15 location: Long Beach, CA, United States name: 'ICML: International Conference on Machine Learning' start_date: 2019-06-10 date_created: 2019-06-20T18:23:03Z date_published: 2019-06-13T00:00:00Z date_updated: 2023-10-17T12:31:38Z day: '13' ddc: - '000' department: - _id: ChLa file: - access_level: open_access checksum: a66d00e2694d749250f8507f301320ca content_type: application/pdf creator: bphuong date_created: 2019-06-20T18:22:56Z date_updated: 2020-07-14T12:47:33Z file_id: '6570' file_name: paper.pdf file_size: 686432 relation: main_file file_date_updated: 2020-07-14T12:47:33Z has_accepted_license: '1' intvolume: ' 97' language: - iso: eng month: '06' oa: 1 oa_version: Published Version page: 5142-5151 publication: Proceedings of the 36th International Conference on Machine Learning publication_status: published publisher: ML Research Press quality_controlled: '1' scopus_import: '1' status: public title: Towards understanding knowledge distillation type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 97 year: '2019' ...
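The setting analysed in the last record, a linear student trained on a teacher's soft (real-valued) outputs rather than hard labels, can be illustrated numerically. A minimal NumPy sketch in which the data, dimensions, and learning rate are assumptions for demonstration, not the paper's experiments:

```python
# Hedged sketch of soft-label (distillation) training of a linear classifier.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # training inputs
w_teacher = rng.normal(size=5)                 # a fixed linear teacher
soft = 1.0 / (1.0 + np.exp(-X @ w_teacher))    # teacher's soft labels in (0, 1)

w = np.zeros(5)                                # linear student, trained by gradient descent
lr = 0.5
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-X @ w))           # student's predicted probabilities
    w -= lr * (X.T @ (p - soft)) / len(X)      # gradient of the soft-label cross-entropy

# The student's decision function ends up closely aligned with the teacher's.
print(float(np.corrcoef(X @ w, X @ w_teacher)[0, 1]))
```

Informally, the soft labels carry the teacher's margin information on every example, which is the kind of fast, reliable learning the abstract's generalization bound formalises.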