---
_id: '15011'
abstract:
- lang: eng
text: Pruning large language models (LLMs) from the BERT family has emerged as a
standard compression benchmark, and several pruning methods have been proposed
for this task. The recent “Sparsity May Cry” (SMC) benchmark put into question
the validity of all existing methods, exhibiting a more complex setup where many
known pruning methods appear to fail. We revisit the question of accurate BERT-pruning
during fine-tuning on downstream datasets, and propose a set of general guidelines
for successful pruning, even on the challenging SMC benchmark. First, we perform
a cost-vs-benefits analysis of pruning model components, such as the embeddings
and the classification head; second, we provide a simple-yet-general way of scaling
training, sparsification and learning rate schedules relative to the desired target
sparsity; finally, we investigate the importance of proper parametrization for
Knowledge Distillation in the context of LLMs. Our simple insights lead to state-of-the-art
results, both on classic BERT-pruning benchmarks, as well as on the SMC benchmark,
showing that even classic gradual magnitude pruning (GMP) can yield competitive
results, with the right approach.
alternative_title:
- PMLR
article_processing_charge: No
author:
- first_name: Eldar
full_name: Kurtic, Eldar
id: 47beb3a5-07b5-11eb-9b87-b108ec578218
last_name: Kurtic
- first_name: Torsten
full_name: Hoefler, Torsten
last_name: Hoefler
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Kurtic E, Hoefler T, Alistarh D-A. How to prune your language model: Recovering
accuracy on the “Sparsity May Cry” benchmark. In: Proceedings of Machine Learning
Research. Vol 234. ML Research Press; 2024:542-553.'
apa: 'Kurtic, E., Hoefler, T., & Alistarh, D.-A. (2024). How to prune your language
model: Recovering accuracy on the “Sparsity May Cry” benchmark. In Proceedings
of Machine Learning Research (Vol. 234, pp. 542–553). Hongkong, China: ML
Research Press.'
chicago: 'Kurtic, Eldar, Torsten Hoefler, and Dan-Adrian Alistarh. “How to Prune
Your Language Model: Recovering Accuracy on the ‘Sparsity May Cry’ Benchmark.”
In Proceedings of Machine Learning Research, 234:542–53. ML Research Press,
2024.'
ieee: 'E. Kurtic, T. Hoefler, and D.-A. Alistarh, “How to prune your language model:
Recovering accuracy on the ‘Sparsity May Cry’ benchmark,” in Proceedings of
Machine Learning Research, Hongkong, China, 2024, vol. 234, pp. 542–553.'
ista: 'Kurtic E, Hoefler T, Alistarh D-A. 2024. How to prune your language model:
Recovering accuracy on the ‘Sparsity May Cry’ benchmark. Proceedings of Machine
Learning Research. CPAL: Conference on Parsimony and Learning, PMLR, vol. 234,
542–553.'
mla: 'Kurtic, Eldar, et al. “How to Prune Your Language Model: Recovering Accuracy
on the ‘Sparsity May Cry’ Benchmark.” Proceedings of Machine Learning Research,
vol. 234, ML Research Press, 2024, pp. 542–53.'
short: E. Kurtic, T. Hoefler, D.-A. Alistarh, in:, Proceedings of Machine Learning
Research, ML Research Press, 2024, pp. 542–553.
conference:
end_date: 2024-01-06
location: Hongkong, China
name: 'CPAL: Conference on Parsimony and Learning'
start_date: 2024-01-03
date_created: 2024-02-18T23:01:03Z
date_published: 2024-01-08T00:00:00Z
date_updated: 2024-02-26T10:30:52Z
day: '08'
department:
- _id: DaAl
external_id:
arxiv:
- '2312.13547'
intvolume: ' 234'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://proceedings.mlr.press/v234/kurtic24a
month: '01'
oa: 1
oa_version: Preprint
page: 542-553
publication: Proceedings of Machine Learning Research
publication_identifier:
eissn:
- 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'How to prune your language model: Recovering accuracy on the "Sparsity May
Cry" benchmark'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 234
year: '2024'
...
---
_id: '12735'
abstract:
- lang: eng
text: "Asynchronous programming has gained significant popularity over the last
decade: support for this programming pattern is available in many popular languages
via libraries and native language implementations, typically in the form of coroutines
or the async/await construct. Instead of programming via shared memory, this concept
assumes implicit synchronization through message passing. The key data structure
enabling such communication is the rendezvous channel. Roughly, a rendezvous channel
is a blocking queue of size zero, so both send(e) and receive() operations wait
for each other, performing a rendezvous when they meet. To optimize the message
passing pattern, channels are usually equipped with a fixed-size buffer, so sends
do not suspend and put elements into the buffer until its capacity is exceeded.
This primitive is known as a buffered channel.\r\n\r\nThis paper presents a fast
and scalable algorithm for both rendezvous and buffered channels. Similarly to
modern queues, our solution is based on an infinite array with two positional
counters for send(e) and receive() operations, leveraging the unconditional Fetch-And-Add
instruction to update them. Yet, the algorithm requires non-trivial modifications
of this classic pattern, in order to support the full channel semantics, such
as buffering and cancellation of waiting requests. We compare the performance
of our solution to that of the Kotlin implementation, as well as against other
academic proposals, showing up to 9.8× speedup. To showcase its expressiveness
and performance, we also integrated the proposed algorithm into the standard Kotlin
Coroutines library, replacing the previous channel implementations."
article_processing_charge: No
author:
- first_name: Nikita
full_name: Koval, Nikita
id: 2F4DB10C-F248-11E8-B48F-1D18A9856A87
last_name: Koval
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
- first_name: Roman
full_name: Elizarov, Roman
last_name: Elizarov
citation:
ama: 'Koval N, Alistarh D-A, Elizarov R. Fast and scalable channels in Kotlin Coroutines.
In: Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of
Parallel Programming. Association for Computing Machinery; 2023:107-118. doi:10.1145/3572848.3577481'
apa: 'Koval, N., Alistarh, D.-A., & Elizarov, R. (2023). Fast and scalable channels
in Kotlin Coroutines. In Proceedings of the ACM SIGPLAN Symposium on Principles
and Practice of Parallel Programming (pp. 107–118). Montreal, QC, Canada:
Association for Computing Machinery. https://doi.org/10.1145/3572848.3577481'
chicago: Koval, Nikita, Dan-Adrian Alistarh, and Roman Elizarov. “Fast and Scalable
Channels in Kotlin Coroutines.” In Proceedings of the ACM SIGPLAN Symposium
on Principles and Practice of Parallel Programming, 107–18. Association for
Computing Machinery, 2023. https://doi.org/10.1145/3572848.3577481.
ieee: N. Koval, D.-A. Alistarh, and R. Elizarov, “Fast and scalable channels in
Kotlin Coroutines,” in Proceedings of the ACM SIGPLAN Symposium on Principles
and Practice of Parallel Programming, Montreal, QC, Canada, 2023, pp. 107–118.
ista: 'Koval N, Alistarh D-A, Elizarov R. 2023. Fast and scalable channels in Kotlin
Coroutines. Proceedings of the ACM SIGPLAN Symposium on Principles and Practice
of Parallel Programming. PPoPP: Sympopsium on Principles and Practice of Parallel
Programming, 107–118.'
mla: Koval, Nikita, et al. “Fast and Scalable Channels in Kotlin Coroutines.” Proceedings
of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming,
Association for Computing Machinery, 2023, pp. 107–18, doi:10.1145/3572848.3577481.
short: N. Koval, D.-A. Alistarh, R. Elizarov, in:, Proceedings of the ACM SIGPLAN
Symposium on Principles and Practice of Parallel Programming, Association for
Computing Machinery, 2023, pp. 107–118.
conference:
end_date: 2023-03-01
location: Montreal, QC, Canada
name: 'PPoPP: Sympopsium on Principles and Practice of Parallel Programming'
start_date: 2023-02-25
date_created: 2023-03-19T23:00:58Z
date_published: 2023-02-25T00:00:00Z
date_updated: 2023-03-20T07:29:28Z
day: '25'
department:
- _id: DaAl
doi: 10.1145/3572848.3577481
external_id:
arxiv:
- '2211.04986'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://doi.org/10.48550/arXiv.2211.04986
month: '02'
oa: 1
oa_version: Preprint
page: 107-118
publication: Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of
Parallel Programming
publication_identifier:
isbn:
- '9798400700156'
publication_status: published
publisher: Association for Computing Machinery
quality_controlled: '1'
scopus_import: '1'
status: public
title: Fast and scalable channels in Kotlin Coroutines
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '13053'
abstract:
- lang: eng
text: 'Deep neural networks (DNNs) often have to be compressed, via pruning and/or
quantization, before they can be deployed in practical settings. In this work
we propose a new compression-aware minimizer dubbed CrAM that modifies the optimization
step in a principled way, in order to produce models whose local loss behavior
is stable under compression operations such as pruning. Thus, dense models trained
via CrAM should be compressible post-training, in a single step, without significant
accuracy loss. Experimental results on standard benchmarks, such as residual networks
for ImageNet classification and BERT models for language modelling, show that
CrAM produces dense models that can be more accurate than the standard SGD/Adam-based
baselines, but which are stable under weight pruning: specifically, we can prune
models in one-shot to 70-80% sparsity with almost no accuracy loss, and to 90%
with reasonable (∼1%) accuracy loss, which is competitive with gradual compression
methods. Additionally, CrAM can produce sparse models which perform well for transfer
learning, and it also works for semi-structured 2:4 pruning patterns supported
by GPU hardware. The code for reproducing the results is available at this https
URL .'
acknowledged_ssus:
- _id: ScienComp
acknowledgement: "AP, EK, DA received funding from the European Research Council (ERC)
under the European\r\nUnion’s Horizon 2020 research and innovation programme (grant
agreement No 805223 ScaleML). AV acknowledges the support of the French Agence Nationale
de la Recherche (ANR), under grant ANR-21-CE48-0016 (project COMCOPT). We further
acknowledge the support from the Scientific Service Units (SSU) of ISTA through
resources provided by Scientific Computing (SciComp)-"
article_processing_charge: No
author:
- first_name: Elena-Alexandra
full_name: Peste, Elena-Alexandra
id: 32D78294-F248-11E8-B48F-1D18A9856A87
last_name: Peste
- first_name: Adrian
full_name: Vladu, Adrian
last_name: Vladu
- first_name: Eldar
full_name: Kurtic, Eldar
id: 47beb3a5-07b5-11eb-9b87-b108ec578218
last_name: Kurtic
- first_name: Christoph
full_name: Lampert, Christoph
id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
last_name: Lampert
orcid: 0000-0001-8622-7887
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Peste E-A, Vladu A, Kurtic E, Lampert C, Alistarh D-A. CrAM: A Compression-Aware
Minimizer. In: 11th International Conference on Learning Representations .'
apa: 'Peste, E.-A., Vladu, A., Kurtic, E., Lampert, C., & Alistarh, D.-A. (n.d.).
CrAM: A Compression-Aware Minimizer. In 11th International Conference on Learning
Representations . Kigali, Rwanda .'
chicago: 'Peste, Elena-Alexandra, Adrian Vladu, Eldar Kurtic, Christoph Lampert,
and Dan-Adrian Alistarh. “CrAM: A Compression-Aware Minimizer.” In 11th International
Conference on Learning Representations , n.d.'
ieee: 'E.-A. Peste, A. Vladu, E. Kurtic, C. Lampert, and D.-A. Alistarh, “CrAM:
A Compression-Aware Minimizer,” in 11th International Conference on Learning
Representations , Kigali, Rwanda .'
ista: 'Peste E-A, Vladu A, Kurtic E, Lampert C, Alistarh D-A. CrAM: A Compression-Aware
Minimizer. 11th International Conference on Learning Representations . ICLR: International
Conference on Learning Representations.'
mla: 'Peste, Elena-Alexandra, et al. “CrAM: A Compression-Aware Minimizer.” 11th
International Conference on Learning Representations .'
short: E.-A. Peste, A. Vladu, E. Kurtic, C. Lampert, D.-A. Alistarh, in:, 11th International
Conference on Learning Representations , n.d.
conference:
end_date: 2023-05-05
location: 'Kigali, Rwanda '
name: 'ICLR: International Conference on Learning Representations'
start_date: 2023-05-01
date_created: 2023-05-23T11:36:18Z
date_published: 2023-05-01T00:00:00Z
date_updated: 2023-06-01T12:54:45Z
department:
- _id: GradSch
- _id: DaAl
- _id: ChLa
ec_funded: 1
external_id:
arxiv:
- '2207.14200'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://openreview.net/pdf?id=_eTZBs-yedr
month: '05'
oa: 1
oa_version: Preprint
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
publication: '11th International Conference on Learning Representations '
publication_status: accepted
quality_controlled: '1'
related_material:
record:
- id: '13074'
relation: dissertation_contains
status: public
status: public
title: 'CrAM: A Compression-Aware Minimizer'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '13179'
abstract:
- lang: eng
text: "Writing concurrent code that is both correct and efficient is notoriously
difficult. Thus, programmers often prefer to use synchronization abstractions,
which render code simpler and easier to reason about. Despite a wealth of work
on this topic, there is still a gap between the rich semantics provided by synchronization
abstractions in modern programming languages—specifically, fair FIFO ordering
of synchronization requests and support for abortable operations—and frameworks
for implementing it correctly and efficiently. Supporting such semantics is critical
given the rising popularity of constructs for asynchronous programming, such as
coroutines, which abort frequently and are cheaper to suspend and resume compared
to native threads.\r\n\r\nThis paper introduces a new framework called CancellableQueueSynchronizer
(CQS), which enables simple yet efficient implementations of a wide range of fair
and abortable synchronization primitives: mutexes, semaphores, barriers, count-down
latches, and blocking pools. Our main contribution is algorithmic, as implementing
both fairness and abortability efficiently at this level of generality is non-trivial.
Importantly, all our algorithms, including the CQS framework and the primitives
built on top of it, come with formal proofs in the Iris framework for Coq for
many of their properties. These proofs are modular, so it is easy to show correctness
for new primitives implemented on top of CQS. From a practical perspective, implementation
of CQS for native threads on the JVM improves throughput by up to two orders of
magnitude over Java’s AbstractQueuedSynchronizer, the only practical abstraction
offering similar semantics. Further, we successfully integrated CQS as a core
component of the popular Kotlin Coroutines library, validating the framework’s
practical impact and expressiveness in a real-world environment. In sum, CancellableQueueSynchronizer
is the first framework to combine expressiveness with formal guarantees and solid
practical performance. Our approach should be extensible to other languages and
families of synchronization primitives."
article_number: '116'
article_processing_charge: No
article_type: original
author:
- first_name: Nikita
full_name: Koval, Nikita
id: 2F4DB10C-F248-11E8-B48F-1D18A9856A87
last_name: Koval
- first_name: Dmitry
full_name: Khalanskiy, Dmitry
last_name: Khalanskiy
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Koval N, Khalanskiy D, Alistarh D-A. CQS: A formally-verified framework for
fair and abortable synchronization. Proceedings of the ACM on Programming Languages.
2023;7. doi:10.1145/3591230'
apa: 'Koval, N., Khalanskiy, D., & Alistarh, D.-A. (2023). CQS: A formally-verified
framework for fair and abortable synchronization. Proceedings of the ACM on
Programming Languages. Association for Computing Machinery . https://doi.org/10.1145/3591230'
chicago: 'Koval, Nikita, Dmitry Khalanskiy, and Dan-Adrian Alistarh. “CQS: A Formally-Verified
Framework for Fair and Abortable Synchronization.” Proceedings of the ACM on
Programming Languages. Association for Computing Machinery , 2023. https://doi.org/10.1145/3591230.'
ieee: 'N. Koval, D. Khalanskiy, and D.-A. Alistarh, “CQS: A formally-verified framework
for fair and abortable synchronization,” Proceedings of the ACM on Programming
Languages, vol. 7. Association for Computing Machinery , 2023.'
ista: 'Koval N, Khalanskiy D, Alistarh D-A. 2023. CQS: A formally-verified framework
for fair and abortable synchronization. Proceedings of the ACM on Programming
Languages. 7, 116.'
mla: 'Koval, Nikita, et al. “CQS: A Formally-Verified Framework for Fair and Abortable
Synchronization.” Proceedings of the ACM on Programming Languages, vol.
7, 116, Association for Computing Machinery , 2023, doi:10.1145/3591230.'
short: N. Koval, D. Khalanskiy, D.-A. Alistarh, Proceedings of the ACM on Programming
Languages 7 (2023).
date_created: 2023-07-02T22:00:43Z
date_published: 2023-06-06T00:00:00Z
date_updated: 2023-07-17T08:43:19Z
day: '06'
ddc:
- '000'
department:
- _id: DaAl
doi: 10.1145/3591230
file:
- access_level: open_access
checksum: 5dba6e73f0ed79adbdae14d165bc2f68
content_type: application/pdf
creator: alisjak
date_created: 2023-07-03T13:09:39Z
date_updated: 2023-07-03T13:09:39Z
file_id: '13187'
file_name: 2023_ACMProgram.Lang._Koval.pdf
file_size: 1266773
relation: main_file
success: 1
file_date_updated: 2023-07-03T13:09:39Z
has_accepted_license: '1'
intvolume: ' 7'
language:
- iso: eng
license: https://creativecommons.org/licenses/by/4.0/
month: '06'
oa: 1
oa_version: Published Version
publication: Proceedings of the ACM on Programming Languages
publication_identifier:
eissn:
- 2475-1421
publication_status: published
publisher: 'Association for Computing Machinery '
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'CQS: A formally-verified framework for fair and abortable synchronization'
tmp:
image: /images/cc_by.png
legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
short: CC BY (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 7
year: '2023'
...
---
_id: '13262'
abstract:
- lang: eng
text: 'Determining the degree of inherent parallelism in classical sequential algorithms
and leveraging it for fast parallel execution is a key topic in parallel computing,
and detailed analyses are known for a wide range of classical algorithms. In this
paper, we perform the first such analysis for the fundamental Union-Find problem,
in which we are given a graph as a sequence of edges, and must maintain its connectivity
structure under edge additions. We prove that classic sequential algorithms for
this problem are well-parallelizable under reasonable assumptions, addressing
a conjecture by [Blelloch, 2017]. More precisely, we show via a new potential
argument that, under uniform random edge ordering, parallel union-find operations
are unlikely to interfere: T concurrent threads processing the graph in parallel
will encounter memory contention O(T2 · log |V| · log |E|) times in expectation,
where |E| and |V| are the number of edges and nodes in the graph, respectively.
We leverage this result to design a new parallel Union-Find algorithm that is
both internally deterministic, i.e., its results are guaranteed to match those
of a sequential execution, but also work-efficient and scalable, as long as the
number of threads T is O(|E|1 over 3 - ε), for an arbitrarily small constant ε
> 0, which holds for most large real-world graphs. We present lower bounds which
show that our analysis is close to optimal, and experimental results suggesting
that the performance cost of internal determinism is limited.'
article_processing_charge: Yes (in subscription journal)
author:
- first_name: Alexander
full_name: Fedorov, Alexander
id: 2e711909-896a-11ed-bdf8-eb0f5a2984c6
last_name: Fedorov
- first_name: Diba
full_name: Hashemi, Diba
id: ed9595ea-2f8f-11ee-ba95-d2b546540783
last_name: Hashemi
- first_name: Giorgi
full_name: Nadiradze, Giorgi
id: 3279A00C-F248-11E8-B48F-1D18A9856A87
last_name: Nadiradze
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Fedorov A, Hashemi D, Nadiradze G, Alistarh D-A. Provably-efficient and internally-deterministic
parallel Union-Find. In: Proceedings of the 35th ACM Symposium on Parallelism
in Algorithms and Architectures. Association for Computing Machinery; 2023:261-271.
doi:10.1145/3558481.3591082'
apa: 'Fedorov, A., Hashemi, D., Nadiradze, G., & Alistarh, D.-A. (2023). Provably-efficient
and internally-deterministic parallel Union-Find. In Proceedings of the 35th
ACM Symposium on Parallelism in Algorithms and Architectures (pp. 261–271).
Orlando, FL, United States: Association for Computing Machinery. https://doi.org/10.1145/3558481.3591082'
chicago: Fedorov, Alexander, Diba Hashemi, Giorgi Nadiradze, and Dan-Adrian Alistarh.
“Provably-Efficient and Internally-Deterministic Parallel Union-Find.” In Proceedings
of the 35th ACM Symposium on Parallelism in Algorithms and Architectures,
261–71. Association for Computing Machinery, 2023. https://doi.org/10.1145/3558481.3591082.
ieee: A. Fedorov, D. Hashemi, G. Nadiradze, and D.-A. Alistarh, “Provably-efficient
and internally-deterministic parallel Union-Find,” in Proceedings of the 35th
ACM Symposium on Parallelism in Algorithms and Architectures, Orlando, FL,
United States, 2023, pp. 261–271.
ista: 'Fedorov A, Hashemi D, Nadiradze G, Alistarh D-A. 2023. Provably-efficient
and internally-deterministic parallel Union-Find. Proceedings of the 35th ACM
Symposium on Parallelism in Algorithms and Architectures. SPAA: Symposium on Parallelism
in Algorithms and Architectures, 261–271.'
mla: Fedorov, Alexander, et al. “Provably-Efficient and Internally-Deterministic
Parallel Union-Find.” Proceedings of the 35th ACM Symposium on Parallelism
in Algorithms and Architectures, Association for Computing Machinery, 2023,
pp. 261–71, doi:10.1145/3558481.3591082.
short: A. Fedorov, D. Hashemi, G. Nadiradze, D.-A. Alistarh, in:, Proceedings of
the 35th ACM Symposium on Parallelism in Algorithms and Architectures, Association
for Computing Machinery, 2023, pp. 261–271.
conference:
end_date: 2023-06-19
location: Orlando, FL, United States
name: 'SPAA: Symposium on Parallelism in Algorithms and Architectures'
start_date: 2023-06-17
date_created: 2023-07-23T22:01:12Z
date_published: 2023-06-17T00:00:00Z
date_updated: 2023-07-31T10:54:32Z
day: '17'
ddc:
- '000'
department:
- _id: DaAl
- _id: GradSch
doi: 10.1145/3558481.3591082
external_id:
arxiv:
- '2304.09331'
file:
- access_level: open_access
checksum: 72e312aabf0c5248c99b5cd3a88e4c88
content_type: application/pdf
creator: dernst
date_created: 2023-07-31T10:53:08Z
date_updated: 2023-07-31T10:53:08Z
file_id: '13334'
file_name: 2023_SPAA_Fedorov.pdf
file_size: 2087937
relation: main_file
success: 1
file_date_updated: 2023-07-31T10:53:08Z
has_accepted_license: '1'
language:
- iso: eng
month: '06'
oa: 1
oa_version: Published Version
page: 261-271
publication: Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and
Architectures
publication_identifier:
isbn:
- '9781450395458'
publication_status: published
publisher: Association for Computing Machinery
quality_controlled: '1'
scopus_import: '1'
status: public
title: Provably-efficient and internally-deterministic parallel Union-Find
tmp:
image: /images/cc_by.png
legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '12566'
abstract:
- lang: eng
text: "Approximate agreement is one of the few variants of consensus that can be
solved in a wait-free manner in asynchronous systems where processes communicate
by reading and writing to shared memory. In this work, we consider a natural generalisation
of approximate agreement on arbitrary undirected connected graphs. Each process
is given a node of the graph as input and, if non-faulty, must output a node such
that\r\n– all the outputs are within distance 1 of one another, and\r\n– each
output value lies on a shortest path between two input values.\r\nFrom prior work,
it is known that there is no wait-free algorithm among processes for this problem
on any cycle of length , by reduction from 2-set agreement (Castañeda et al.,
2018).\r\n\r\nIn this work, we investigate the solvability of this task on general
graphs. We give a new, direct proof of the impossibility of approximate agreement
on cycles of length , via a generalisation of Sperner's Lemma to convex polygons.
We also extend the reduction from 2-set agreement to a larger class of graphs,
showing that approximate agreement on these graphs is unsolvable. On the positive
side, we present a wait-free algorithm for a different class of graphs, which
properly contains the class of chordal graphs."
acknowledgement: This project has received funding from the European Research Council
(ERC) under the European Union’s Horizon 2020 research and innovation programme
(grant agreement No. 805223 ScaleML) and under the Marie Skłodowska-Curie grant
agreement No. 840605 and from the Natural Sciences and Engineering Research Council
of Canada grant RGPIN-2020-04178. Part of this work was done while Faith Ellen was
visiting IST Austria.
article_number: '113733'
article_processing_charge: Yes (via OA deal)
article_type: original
author:
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
- first_name: Faith
full_name: Ellen, Faith
last_name: Ellen
- first_name: Joel
full_name: Rybicki, Joel
id: 334EFD2E-F248-11E8-B48F-1D18A9856A87
last_name: Rybicki
orcid: 0000-0002-6432-6646
citation:
ama: Alistarh D-A, Ellen F, Rybicki J. Wait-free approximate agreement on graphs.
Theoretical Computer Science. 2023;948(2). doi:10.1016/j.tcs.2023.113733
apa: Alistarh, D.-A., Ellen, F., & Rybicki, J. (2023). Wait-free approximate
agreement on graphs. Theoretical Computer Science. Elsevier. https://doi.org/10.1016/j.tcs.2023.113733
chicago: Alistarh, Dan-Adrian, Faith Ellen, and Joel Rybicki. “Wait-Free Approximate
Agreement on Graphs.” Theoretical Computer Science. Elsevier, 2023. https://doi.org/10.1016/j.tcs.2023.113733.
ieee: D.-A. Alistarh, F. Ellen, and J. Rybicki, “Wait-free approximate agreement
on graphs,” Theoretical Computer Science, vol. 948, no. 2. Elsevier, 2023.
ista: Alistarh D-A, Ellen F, Rybicki J. 2023. Wait-free approximate agreement on
graphs. Theoretical Computer Science. 948(2), 113733.
mla: Alistarh, Dan-Adrian, et al. “Wait-Free Approximate Agreement on Graphs.” Theoretical
Computer Science, vol. 948, no. 2, 113733, Elsevier, 2023, doi:10.1016/j.tcs.2023.113733.
short: D.-A. Alistarh, F. Ellen, J. Rybicki, Theoretical Computer Science 948 (2023).
date_created: 2023-02-19T23:00:55Z
date_published: 2023-02-28T00:00:00Z
date_updated: 2023-08-01T13:17:20Z
day: '28'
ddc:
- '000'
department:
- _id: DaAl
doi: 10.1016/j.tcs.2023.113733
ec_funded: 1
external_id:
isi:
- '000934262700001'
file:
- access_level: open_access
checksum: b27c5290f2f1500c403494364ee39c9f
content_type: application/pdf
creator: dernst
date_created: 2023-02-20T07:30:20Z
date_updated: 2023-02-20T07:30:20Z
file_id: '12570'
file_name: 2023_TheoreticalCompScience_Alistarh.pdf
file_size: 602333
relation: main_file
success: 1
file_date_updated: 2023-02-20T07:30:20Z
has_accepted_license: '1'
intvolume: ' 948'
isi: 1
issue: '2'
language:
- iso: eng
month: '02'
oa: 1
oa_version: Published Version
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
- _id: 26A5D39A-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '840605'
name: Coordination in constrained and natural distributed systems
publication: Theoretical Computer Science
publication_identifier:
issn:
- 0304-3975
publication_status: published
publisher: Elsevier
quality_controlled: '1'
scopus_import: '1'
status: public
title: Wait-free approximate agreement on graphs
tmp:
image: /images/cc_by.png
legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
short: CC BY (4.0)
type: journal_article
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
volume: 948
year: '2023'
...
---
_id: '12330'
abstract:
- lang: eng
text: 'The design and implementation of efficient concurrent data structures has
seen significant attention. However, most of this work has focused on concurrent
data structures providing good worst-case guarantees, although, in real workloads,
objects are often accessed at different rates. Efficient distribution-adaptive
data structures, such as splay-trees, are known in the sequential case; however,
they often are hard to translate efficiently to the concurrent case. We investigate
distribution-adaptive concurrent data structures, and propose a new design called
the splay-list. At a high level, the splay-list is similar to a standard skip-list,
with the key distinction that the height of each element adapts dynamically to
its access rate: popular elements “move up,” whereas rarely-accessed elements
decrease in height. We show that the splay-list provides order-optimal amortized
complexity bounds for a subset of operations, while being amenable to efficient
concurrent implementation. Experiments show that the splay-list can leverage distribution-adaptivity
for performance, and can outperform the only previously-known distribution-adaptive
concurrent design in certain workloads.'
article_processing_charge: No
article_type: original
author:
- first_name: Vitalii
full_name: Aksenov, Vitalii
id: 2980135A-F248-11E8-B48F-1D18A9856A87
last_name: Aksenov
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
- first_name: Alexandra
full_name: Drozdova, Alexandra
last_name: Drozdova
- first_name: Amirkeivan
full_name: Mohtashami, Amirkeivan
last_name: Mohtashami
citation:
ama: 'Aksenov V, Alistarh D-A, Drozdova A, Mohtashami A. The splay-list: A distribution-adaptive
concurrent skip-list. Distributed Computing. 2023;36:395-418. doi:10.1007/s00446-022-00441-x'
apa: 'Aksenov, V., Alistarh, D.-A., Drozdova, A., & Mohtashami, A. (2023). The
splay-list: A distribution-adaptive concurrent skip-list. Distributed Computing.
Springer Nature. https://doi.org/10.1007/s00446-022-00441-x'
chicago: 'Aksenov, Vitalii, Dan-Adrian Alistarh, Alexandra Drozdova, and Amirkeivan
Mohtashami. “The Splay-List: A Distribution-Adaptive Concurrent Skip-List.” Distributed
Computing. Springer Nature, 2023. https://doi.org/10.1007/s00446-022-00441-x.'
ieee: 'V. Aksenov, D.-A. Alistarh, A. Drozdova, and A. Mohtashami, “The splay-list:
A distribution-adaptive concurrent skip-list,” Distributed Computing, vol.
36. Springer Nature, pp. 395–418, 2023.'
ista: 'Aksenov V, Alistarh D-A, Drozdova A, Mohtashami A. 2023. The splay-list:
A distribution-adaptive concurrent skip-list. Distributed Computing. 36, 395–418.'
mla: 'Aksenov, Vitalii, et al. “The Splay-List: A Distribution-Adaptive Concurrent
Skip-List.” Distributed Computing, vol. 36, Springer Nature, 2023, pp.
395–418, doi:10.1007/s00446-022-00441-x.'
short: V. Aksenov, D.-A. Alistarh, A. Drozdova, A. Mohtashami, Distributed Computing
36 (2023) 395–418.
date_created: 2023-01-22T23:00:55Z
date_published: 2023-09-01T00:00:00Z
date_updated: 2023-08-14T12:54:32Z
day: '01'
department:
- _id: DaAl
doi: 10.1007/s00446-022-00441-x
external_id:
arxiv:
- '2008.01009'
isi:
- '000913424000001'
intvolume: ' 36'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://doi.org/10.48550/arXiv.2008.01009
month: '09'
oa: 1
oa_version: Preprint
page: 395-418
publication: Distributed Computing
publication_identifier:
eissn:
- 1432-0452
issn:
- 0178-2770
publication_status: published
publisher: Springer Nature
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'The splay-list: A distribution-adaptive concurrent skip-list'
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 36
year: '2023'
...
---
_id: '14461'
abstract:
- lang: eng
text: 'Communication-reduction techniques are a popular way to improve scalability
in data-parallel training of deep neural networks (DNNs). The recent emergence
of large language models such as GPT has created the need for new approaches to
exploit data-parallelism. Among these, fully-sharded data parallel (FSDP) training
is highly popular, yet it still encounters scalability bottlenecks. One reason
is that applying compression techniques to FSDP is challenging: as the vast majority
of the communication involves the model’s weights, direct compression alters convergence
and leads to accuracy loss. We present QSDP, a variant of FSDP which supports
both gradient and weight quantization with theoretical guarantees, is simple to
implement and has essentially no overheads. To derive QSDP we prove that a natural
modification of SGD achieves convergence even when we only maintain quantized
weights, and thus the domain over which we train consists of quantized points
and is, therefore, highly non-convex. We validate this approach by training GPT-family
models with up to 1.3 billion parameters on a multi-node cluster. Experiments
show that QSDP preserves model accuracy, while completely removing the communication
bottlenecks of FSDP, providing end-to-end speedups of up to 2.2x.'
acknowledged_ssus:
- _id: ScienComp
acknowledgement: The authors gratefully acknowledge funding from the European Research
Council (ERC) under the European Union’s Horizon 2020 research and innovation programme
(grant agreement No 805223 ScaleML), as well as experimental support from the IST
Austria IT department, in particular Stefano Elefante, Andrei Hornoiu, and Alois
Schloegl. AV acknowledges the support of the French Agence Nationale de la Recherche
(ANR), under grant ANR-21-CE48-0016 (project COMCOPT), the support of Fondation
Hadamard with a PRMO grant, and the support of CNRS with a CoopIntEER IEA grant
(project ALFRED).
alternative_title:
- PMLR
article_processing_charge: No
author:
- first_name: Ilia
full_name: Markov, Ilia
id: D0CF4148-C985-11E9-8066-0BDEE5697425
last_name: Markov
- first_name: Adrian
full_name: Vladu, Adrian
last_name: Vladu
- first_name: Qi
full_name: Guo, Qi
last_name: Guo
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Markov I, Vladu A, Guo Q, Alistarh D-A. Quantized distributed training of
large models with convergence guarantees. In: Proceedings of the 40th International
Conference on Machine Learning. Vol 202. ML Research Press; 2023:24020-24044.'
apa: 'Markov, I., Vladu, A., Guo, Q., & Alistarh, D.-A. (2023). Quantized distributed
training of large models with convergence guarantees. In Proceedings of the
40th International Conference on Machine Learning (Vol. 202, pp. 24020–24044).
Honolulu, Hawaii, HI, United States: ML Research Press.'
chicago: Markov, Ilia, Adrian Vladu, Qi Guo, and Dan-Adrian Alistarh. “Quantized
Distributed Training of Large Models with Convergence Guarantees.” In Proceedings
of the 40th International Conference on Machine Learning, 202:24020–44. ML
Research Press, 2023.
ieee: I. Markov, A. Vladu, Q. Guo, and D.-A. Alistarh, “Quantized distributed training
of large models with convergence guarantees,” in Proceedings of the 40th International
Conference on Machine Learning, Honolulu, Hawaii, HI, United States, 2023,
vol. 202, pp. 24020–24044.
ista: 'Markov I, Vladu A, Guo Q, Alistarh D-A. 2023. Quantized distributed training
of large models with convergence guarantees. Proceedings of the 40th International
Conference on Machine Learning. ICML: International Conference on Machine Learning,
PMLR, vol. 202, 24020–24044.'
mla: Markov, Ilia, et al. “Quantized Distributed Training of Large Models with Convergence
Guarantees.” Proceedings of the 40th International Conference on Machine Learning,
vol. 202, ML Research Press, 2023, pp. 24020–44.
short: I. Markov, A. Vladu, Q. Guo, D.-A. Alistarh, in:, Proceedings of the 40th
International Conference on Machine Learning, ML Research Press, 2023, pp. 24020–24044.
conference:
end_date: 2023-07-29
location: Honolulu, Hawaii, HI, United States
name: 'ICML: International Conference on Machine Learning'
start_date: 2023-07-23
date_created: 2023-10-29T23:01:17Z
date_published: 2023-07-30T00:00:00Z
date_updated: 2023-10-31T09:40:45Z
day: '30'
department:
- _id: DaAl
ec_funded: 1
external_id:
arxiv:
- '2302.02390'
intvolume: ' 202'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://doi.org/10.48550/arXiv.2302.02390
month: '07'
oa: 1
oa_version: Preprint
page: 24020-24044
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
publication: Proceedings of the 40th International Conference on Machine Learning
publication_identifier:
eissn:
- 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Quantized distributed training of large models with convergence guarantees
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 202
year: '2023'
...
---
_id: '14460'
abstract:
- lang: eng
text: We provide an efficient implementation of the backpropagation algorithm, specialized
to the case where the weights of the neural network being trained are sparse.
Our algorithm is general, as it applies to arbitrary (unstructured) sparsity and
common layer types (e.g., convolutional or linear). We provide a fast vectorized
implementation on commodity CPUs, and show that it can yield speedups in end-to-end
runtime experiments, both in transfer learning using already-sparsified networks,
and in training sparse networks from scratch. Thus, our results provide the first
support for sparse training on commodity hardware.
acknowledgement: 'We would like to thank Elias Frantar for his valuable assistance
and support at the outset of this project, and the anonymous ICML and SNN reviewers
for very constructive feedback. EI was supported in part by the FWF DK VGSCO, grant
agreement number W1260-N35. DA acknowledges generous ERC support, via Starting Grant
805223 ScaleML. '
alternative_title:
- PMLR
article_processing_charge: No
author:
- first_name: Mahdi
full_name: Nikdan, Mahdi
id: 66374281-f394-11eb-9cf6-869147deecc0
last_name: Nikdan
- first_name: Tommaso
full_name: Pegolotti, Tommaso
last_name: Pegolotti
- first_name: Eugenia B
full_name: Iofinova, Eugenia B
id: f9a17499-f6e0-11ea-865d-fdf9a3f77117
last_name: Iofinova
orcid: 0000-0002-7778-3221
- first_name: Eldar
full_name: Kurtic, Eldar
id: 47beb3a5-07b5-11eb-9b87-b108ec578218
last_name: Kurtic
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Nikdan M, Pegolotti T, Iofinova EB, Kurtic E, Alistarh D-A. SparseProp: Efficient
sparse backpropagation for faster training of neural networks at the edge. In:
Proceedings of the 40th International Conference on Machine Learning. Vol
202. ML Research Press; 2023:26215-26227.'
apa: 'Nikdan, M., Pegolotti, T., Iofinova, E. B., Kurtic, E., & Alistarh, D.-A.
(2023). SparseProp: Efficient sparse backpropagation for faster training of neural
networks at the edge. In Proceedings of the 40th International Conference on
Machine Learning (Vol. 202, pp. 26215–26227). Honolulu, Hawaii, HI, United
States: ML Research Press.'
chicago: 'Nikdan, Mahdi, Tommaso Pegolotti, Eugenia B Iofinova, Eldar Kurtic, and
Dan-Adrian Alistarh. “SparseProp: Efficient Sparse Backpropagation for Faster
Training of Neural Networks at the Edge.” In Proceedings of the 40th International
Conference on Machine Learning, 202:26215–27. ML Research Press, 2023.'
ieee: 'M. Nikdan, T. Pegolotti, E. B. Iofinova, E. Kurtic, and D.-A. Alistarh, “SparseProp:
Efficient sparse backpropagation for faster training of neural networks at the
edge,” in Proceedings of the 40th International Conference on Machine Learning,
Honolulu, Hawaii, HI, United States, 2023, vol. 202, pp. 26215–26227.'
ista: 'Nikdan M, Pegolotti T, Iofinova EB, Kurtic E, Alistarh D-A. 2023. SparseProp:
Efficient sparse backpropagation for faster training of neural networks at the
edge. Proceedings of the 40th International Conference on Machine Learning. ICML:
International Conference on Machine Learning, PMLR, vol. 202, 26215–26227.'
mla: 'Nikdan, Mahdi, et al. “SparseProp: Efficient Sparse Backpropagation for Faster
Training of Neural Networks at the Edge.” Proceedings of the 40th International
Conference on Machine Learning, vol. 202, ML Research Press, 2023, pp. 26215–27.'
short: M. Nikdan, T. Pegolotti, E.B. Iofinova, E. Kurtic, D.-A. Alistarh, in:, Proceedings
of the 40th International Conference on Machine Learning, ML Research Press, 2023,
pp. 26215–26227.
conference:
end_date: 2023-07-29
location: Honolulu, Hawaii, HI, United States
name: 'ICML: International Conference on Machine Learning'
start_date: 2023-07-23
date_created: 2023-10-29T23:01:17Z
date_published: 2023-07-30T00:00:00Z
date_updated: 2023-10-31T09:33:51Z
day: '30'
department:
- _id: DaAl
ec_funded: 1
external_id:
arxiv:
- '2302.04852'
intvolume: ' 202'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://doi.org/10.48550/arXiv.2302.04852
month: '07'
oa: 1
oa_version: Preprint
page: 26215-26227
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
publication: Proceedings of the 40th International Conference on Machine Learning
publication_identifier:
eissn:
- 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'SparseProp: Efficient sparse backpropagation for faster training of neural
networks at the edge'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 202
year: '2023'
...
---
_id: '14458'
abstract:
- lang: eng
text: 'We show for the first time that large-scale generative pretrained transformer
(GPT) family models can be pruned to at least 50% sparsity in one-shot, without
any retraining, at minimal loss of accuracy. This is achieved via a new pruning
method called SparseGPT, specifically designed to work efficiently and accurately
on massive GPT-family models. We can execute SparseGPT on the largest available
open-source models, OPT-175B and BLOOM-176B, in under 4.5 hours, and can reach
60% unstructured sparsity with negligible increase in perplexity: remarkably,
more than 100 billion weights from these models can be ignored at inference time.
SparseGPT generalizes to semi-structured (2:4 and 4:8) patterns, and is compatible
with weight quantization approaches. The code is available at: https://github.com/IST-DASLab/sparsegpt.'
acknowledged_ssus:
- _id: ScienComp
acknowledgement: The authors gratefully acknowledge funding from the European Research
Council (ERC) under the European Union’s Horizon 2020 programme (grant agreement
No. 805223 ScaleML), as well as experimental support from Eldar Kurtic, and from
the IST Austria IT department, in particular Stefano Elefante, Andrei Hornoiu, and
Alois Schloegl.
alternative_title:
- PMLR
article_processing_charge: No
author:
- first_name: Elias
full_name: Frantar, Elias
id: 09a8f98d-ec99-11ea-ae11-c063a7b7fe5f
last_name: Frantar
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Frantar E, Alistarh D-A. SparseGPT: Massive language models can be accurately
pruned in one-shot. In: Proceedings of the 40th International Conference on
Machine Learning. Vol 202. ML Research Press; 2023:10323-10337.'
apa: 'Frantar, E., & Alistarh, D.-A. (2023). SparseGPT: Massive language models
can be accurately pruned in one-shot. In Proceedings of the 40th International
Conference on Machine Learning (Vol. 202, pp. 10323–10337). Honolulu, Hawaii,
HI, United States: ML Research Press.'
chicago: 'Frantar, Elias, and Dan-Adrian Alistarh. “SparseGPT: Massive Language
Models Can Be Accurately Pruned in One-Shot.” In Proceedings of the 40th International
Conference on Machine Learning, 202:10323–37. ML Research Press, 2023.'
ieee: 'E. Frantar and D.-A. Alistarh, “SparseGPT: Massive language models can be
accurately pruned in one-shot,” in Proceedings of the 40th International Conference
on Machine Learning, Honolulu, Hawaii, HI, United States, 2023, vol. 202,
pp. 10323–10337.'
ista: 'Frantar E, Alistarh D-A. 2023. SparseGPT: Massive language models can be
accurately pruned in one-shot. Proceedings of the 40th International Conference
on Machine Learning. ICML: International Conference on Machine Learning, PMLR,
vol. 202, 10323–10337.'
mla: 'Frantar, Elias, and Dan-Adrian Alistarh. “SparseGPT: Massive Language Models
Can Be Accurately Pruned in One-Shot.” Proceedings of the 40th International
Conference on Machine Learning, vol. 202, ML Research Press, 2023, pp. 10323–37.'
short: E. Frantar, D.-A. Alistarh, in:, Proceedings of the 40th International Conference
on Machine Learning, ML Research Press, 2023, pp. 10323–10337.
conference:
end_date: 2023-07-29
location: Honolulu, Hawaii, HI, United States
name: 'ICML: International Conference on Machine Learning'
start_date: 2023-07-23
date_created: 2023-10-29T23:01:16Z
date_published: 2023-07-30T00:00:00Z
date_updated: 2023-10-31T09:59:42Z
day: '30'
department:
- _id: DaAl
ec_funded: 1
external_id:
arxiv:
- '2301.00774'
intvolume: ' 202'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://doi.org/10.48550/arXiv.2301.00774
month: '07'
oa: 1
oa_version: Preprint
page: 10323-10337
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
publication: Proceedings of the 40th International Conference on Machine Learning
publication_identifier:
eissn:
- 2640-3498
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'SparseGPT: Massive language models can be accurately pruned in one-shot'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 202
year: '2023'
...
---
_id: '14364'
abstract:
- lang: eng
text: We introduce extension-based proofs, a class of impossibility proofs that
includes valency arguments. They are modelled as an interaction between a prover
and a protocol. Using proofs based on combinatorial topology, it has been shown
that it is impossible to deterministically solve -set agreement among processes
or approximate agreement on a cycle of length 4 among processes in a wait-free
manner in asynchronous models where processes communicate using objects that can
be constructed from shared registers. However, it was unknown whether proofs based
on simpler techniques were possible. We show that these impossibility results
cannot be obtained by extension-based proofs in the iterated snapshot model and,
hence, extension-based proofs are limited in power.
acknowledgement: "We would like to thank Valerie King, Toniann Pitassi, and Michael
Saks for helpful discussions and Shi Hao Liu for his useful feedback.\r\nThis research
was supported by the Natural Science and Engineering Research Council of Canada
under grants RGPIN-2015-05080 and RGPIN-2020-04178, a postgraduate scholarship,
and a postdoctoral fellowship; a University of Toronto postdoctoral fellowship;
the National Science Foundation under grants CCF-1217921, CCF-1301926, CCF-1637385,
CCF-1650596, and IIS-1447786; the U.S. Department of Energy under grant ER26116/DE-SC0008923;
the European Research Council (ERC) under the European Union’s Horizon 2020 research
and innovation programme grant agreement 805223 ScaleML; and the Oracle and Intel
corporations. Some of the work on this paper was done while Faith Ellen was visiting
IST Austria."
article_processing_charge: No
article_type: original
author:
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
- first_name: James
full_name: Aspnes, James
last_name: Aspnes
- first_name: Faith
full_name: Ellen, Faith
last_name: Ellen
- first_name: Rati
full_name: Gelashvili, Rati
last_name: Gelashvili
- first_name: Leqi
full_name: Zhu, Leqi
id: a2117c59-cee4-11ed-b9d0-874ecf0f8ac5
last_name: Zhu
citation:
ama: Alistarh D-A, Aspnes J, Ellen F, Gelashvili R, Zhu L. Why extension-based proofs
fail. SIAM Journal on Computing. 2023;52(4):913-944. doi:10.1137/20M1375851
apa: Alistarh, D.-A., Aspnes, J., Ellen, F., Gelashvili, R., & Zhu, L. (2023).
Why extension-based proofs fail. SIAM Journal on Computing. Society for
Industrial and Applied Mathematics. https://doi.org/10.1137/20M1375851
chicago: Alistarh, Dan-Adrian, James Aspnes, Faith Ellen, Rati Gelashvili, and Leqi
Zhu. “Why Extension-Based Proofs Fail.” SIAM Journal on Computing. Society
for Industrial and Applied Mathematics, 2023. https://doi.org/10.1137/20M1375851.
ieee: D.-A. Alistarh, J. Aspnes, F. Ellen, R. Gelashvili, and L. Zhu, “Why extension-based
proofs fail,” SIAM Journal on Computing, vol. 52, no. 4. Society for Industrial
and Applied Mathematics, pp. 913–944, 2023.
ista: Alistarh D-A, Aspnes J, Ellen F, Gelashvili R, Zhu L. 2023. Why extension-based
proofs fail. SIAM Journal on Computing. 52(4), 913–944.
mla: Alistarh, Dan-Adrian, et al. “Why Extension-Based Proofs Fail.” SIAM Journal
on Computing, vol. 52, no. 4, Society for Industrial and Applied Mathematics,
2023, pp. 913–44, doi:10.1137/20M1375851.
short: D.-A. Alistarh, J. Aspnes, F. Ellen, R. Gelashvili, L. Zhu, SIAM Journal
on Computing 52 (2023) 913–944.
date_created: 2023-09-24T22:01:11Z
date_published: 2023-07-25T00:00:00Z
date_updated: 2023-12-13T12:28:29Z
day: '25'
department:
- _id: DaAl
doi: 10.1137/20M1375851
ec_funded: 1
external_id:
arxiv:
- '1811.01421'
isi:
- '001082972300004'
intvolume: ' 52'
isi: 1
issue: '4'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://arxiv.org/abs/1811.01421
month: '07'
oa: 1
oa_version: Preprint
page: 913-944
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
publication: SIAM Journal on Computing
publication_identifier:
eissn:
- 1095-7111
issn:
- 0097-5397
publication_status: published
publisher: Society for Industrial and Applied Mathematics
quality_controlled: '1'
related_material:
record:
- id: '6676'
relation: earlier_version
status: public
scopus_import: '1'
status: public
title: Why extension-based proofs fail
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 52
year: '2023'
...
---
_id: '14771'
abstract:
- lang: eng
text: Pruning—that is, setting a significant subset of the parameters of a neural
network to zero—is one of the most popular methods of model compression. Yet,
several recent works have raised the issue that pruning may induce or exacerbate
bias in the output of the compressed model. Despite existing evidence for this
phenomenon, the relationship between neural network pruning and induced bias is
not well-understood. In this work, we systematically investigate and characterize
this phenomenon in Convolutional Neural Networks for computer vision. First, we
show that it is in fact possible to obtain highly-sparse models, e.g. with less
than 10% remaining weights, which do not decrease in accuracy nor substantially
increase in bias when compared to dense models. At the same time, we also find
that, at higher sparsities, pruned models exhibit higher uncertainty in their
outputs, as well as increased correlations, which we directly link to increased
bias. We propose easy-to-use criteria which, based only on the uncompressed model,
establish whether bias will increase with pruning, and identify the samples most
susceptible to biased predictions post-compression. Our code can be found at https://github.com/IST-DASLab/pruned-vision-model-bias.
acknowledgement: The authors would like to sincerely thank Sara Hooker for her feedback
during the development of this work. EI was supported in part by the FWF DK VGSCO,
grant agreement number W1260-N35. AP and DA acknowledge generous ERC support, via
Starting Grant 805223 ScaleML.
article_processing_charge: No
author:
- first_name: Eugenia B
full_name: Iofinova, Eugenia B
id: f9a17499-f6e0-11ea-865d-fdf9a3f77117
last_name: Iofinova
orcid: 0000-0002-7778-3221
- first_name: Elena-Alexandra
full_name: Peste, Elena-Alexandra
id: 32D78294-F248-11E8-B48F-1D18A9856A87
last_name: Peste
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Iofinova EB, Peste E-A, Alistarh D-A. Bias in pruned vision models: In-depth
analysis and countermeasures. In: 2023 IEEE/CVF Conference on Computer Vision
and Pattern Recognition. IEEE; 2023:24364-24373. doi:10.1109/cvpr52729.2023.02334'
apa: 'Iofinova, E. B., Peste, E.-A., & Alistarh, D.-A. (2023). Bias in pruned
vision models: In-depth analysis and countermeasures. In 2023 IEEE/CVF Conference
on Computer Vision and Pattern Recognition (pp. 24364–24373). Vancouver, BC,
Canada: IEEE. https://doi.org/10.1109/cvpr52729.2023.02334'
chicago: 'Iofinova, Eugenia B, Elena-Alexandra Peste, and Dan-Adrian Alistarh. “Bias
in Pruned Vision Models: In-Depth Analysis and Countermeasures.” In 2023 IEEE/CVF
Conference on Computer Vision and Pattern Recognition, 24364–73. IEEE, 2023.
https://doi.org/10.1109/cvpr52729.2023.02334.'
ieee: 'E. B. Iofinova, E.-A. Peste, and D.-A. Alistarh, “Bias in pruned vision models:
In-depth analysis and countermeasures,” in 2023 IEEE/CVF Conference on Computer
Vision and Pattern Recognition, Vancouver, BC, Canada, 2023, pp. 24364–24373.'
ista: 'Iofinova EB, Peste E-A, Alistarh D-A. 2023. Bias in pruned vision models:
In-depth analysis and countermeasures. 2023 IEEE/CVF Conference on Computer Vision
and Pattern Recognition. CVPR: Conference on Computer Vision and Pattern Recognition,
24364–24373.'
mla: 'Iofinova, Eugenia B., et al. “Bias in Pruned Vision Models: In-Depth Analysis
and Countermeasures.” 2023 IEEE/CVF Conference on Computer Vision and Pattern
Recognition, IEEE, 2023, pp. 24364–73, doi:10.1109/cvpr52729.2023.02334.'
short: E.B. Iofinova, E.-A. Peste, D.-A. Alistarh, in:, 2023 IEEE/CVF Conference
on Computer Vision and Pattern Recognition, IEEE, 2023, pp. 24364–24373.
conference:
end_date: 2023-06-24
location: Vancouver, BC, Canada
name: 'CVPR: Conference on Computer Vision and Pattern Recognition'
start_date: 2023-06-17
date_created: 2024-01-10T08:42:40Z
date_published: 2023-08-22T00:00:00Z
date_updated: 2024-01-10T08:59:26Z
day: '22'
department:
- _id: DaAl
- _id: ChLa
doi: 10.1109/cvpr52729.2023.02334
ec_funded: 1
external_id:
arxiv:
- '2304.12622'
isi:
- '001062531308068'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://doi.org/10.48550/arXiv.2304.12622
month: '08'
oa: 1
oa_version: Preprint
page: 24364-24373
project:
- _id: 9B9290DE-BA93-11EA-9121-9846C619BF3A
grant_number: ' W1260-N35'
name: Vienna Graduate School on Computational Optimization
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
publication: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition
publication_identifier:
eisbn:
- '9798350301298'
eissn:
- 2575-7075
publication_status: published
publisher: IEEE
quality_controlled: '1'
related_material:
link:
- relation: software
url: https://github.com/IST-DASLab/pruned-vision-model-bias
status: public
title: 'Bias in pruned vision models: In-depth analysis and countermeasures'
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '14260'
abstract:
- lang: eng
text: "This paper presents Lincheck, a new practical and user-friendly framework
for testing concurrent algorithms on the Java Virtual Machine (JVM). Lincheck
provides a simple and declarative way to write concurrent tests: instead of describing
how to perform the test, users specify what to test by declaring all the operations
to examine; the framework automatically handles the rest. As a result, tests written
with Lincheck are concise and easy to understand. The framework automatically
generates a set of concurrent scenarios, examines them using stress-testing or
bounded model checking, and verifies that the results of each invocation are correct.
Notably, if an error is detected via model checking, Lincheck provides an easy-to-follow
trace to reproduce it, significantly simplifying the bug investigation.\r\n\r\nTo
the best of our knowledge, Lincheck is the first production-ready tool on the
JVM that offers such a simple way of writing concurrent tests, without requiring
special skills or expertise. We successfully integrated Lincheck in the development
process of several large projects, such as Kotlin Coroutines, and identified new
bugs in popular concurrency libraries, such as a race in Java’s standard ConcurrentLinkedDeque
and a liveliness bug in Java’s AbstractQueuedSynchronizer framework, which is
used in most of the synchronization primitives. We believe that Lincheck can significantly
improve the quality and productivity of concurrent algorithms research and development
and become the state-of-the-art tool for checking their correctness."
alternative_title:
- LNCS
article_processing_charge: Yes (in subscription journal)
author:
- first_name: Nikita
full_name: Koval, Nikita
id: 2F4DB10C-F248-11E8-B48F-1D18A9856A87
last_name: Koval
- first_name: Alexander
full_name: Fedorov, Alexander
id: 2e711909-896a-11ed-bdf8-eb0f5a2984c6
last_name: Fedorov
- first_name: Maria
full_name: Sokolova, Maria
last_name: Sokolova
- first_name: Dmitry
full_name: Tsitelov, Dmitry
last_name: Tsitelov
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Koval N, Fedorov A, Sokolova M, Tsitelov D, Alistarh D-A. Lincheck: A practical
framework for testing concurrent data structures on JVM. In: 35th International
Conference on Computer Aided Verification . Vol 13964. Springer Nature; 2023:156-169.
doi:10.1007/978-3-031-37706-8_8'
apa: 'Koval, N., Fedorov, A., Sokolova, M., Tsitelov, D., & Alistarh, D.-A.
(2023). Lincheck: A practical framework for testing concurrent data structures
on JVM. In 35th International Conference on Computer Aided Verification
(Vol. 13964, pp. 156–169). Paris, France: Springer Nature. https://doi.org/10.1007/978-3-031-37706-8_8'
chicago: 'Koval, Nikita, Alexander Fedorov, Maria Sokolova, Dmitry Tsitelov, and
Dan-Adrian Alistarh. “Lincheck: A Practical Framework for Testing Concurrent Data
Structures on JVM.” In 35th International Conference on Computer Aided Verification
, 13964:156–69. Springer Nature, 2023. https://doi.org/10.1007/978-3-031-37706-8_8.'
ieee: 'N. Koval, A. Fedorov, M. Sokolova, D. Tsitelov, and D.-A. Alistarh, “Lincheck:
A practical framework for testing concurrent data structures on JVM,” in 35th
International Conference on Computer Aided Verification , Paris, France, 2023,
vol. 13964, pp. 156–169.'
ista: 'Koval N, Fedorov A, Sokolova M, Tsitelov D, Alistarh D-A. 2023. Lincheck:
A practical framework for testing concurrent data structures on JVM. 35th International
Conference on Computer Aided Verification . CAV: Computer Aided Verification,
LNCS, vol. 13964, 156–169.'
mla: 'Koval, Nikita, et al. “Lincheck: A Practical Framework for Testing Concurrent
Data Structures on JVM.” 35th International Conference on Computer Aided Verification
, vol. 13964, Springer Nature, 2023, pp. 156–69, doi:10.1007/978-3-031-37706-8_8.'
short: N. Koval, A. Fedorov, M. Sokolova, D. Tsitelov, D.-A. Alistarh, in:, 35th
International Conference on Computer Aided Verification , Springer Nature, 2023,
pp. 156–169.
conference:
end_date: 2023-07-22
location: Paris, France
name: 'CAV: Computer Aided Verification'
start_date: 2023-07-17
date_created: 2023-09-03T22:01:16Z
date_published: 2023-07-17T00:00:00Z
date_updated: 2024-02-27T07:46:52Z
day: '17'
ddc:
- '000'
department:
- _id: DaAl
- _id: GradSch
doi: 10.1007/978-3-031-37706-8_8
file:
- access_level: open_access
checksum: c346016393123a0a2338ad4d976f61bc
content_type: application/pdf
creator: dernst
date_created: 2023-09-06T08:16:25Z
date_updated: 2023-09-06T08:16:25Z
file_id: '14275'
file_name: 2023_LNCS_Koval.pdf
file_size: 421408
relation: main_file
success: 1
file_date_updated: 2023-09-06T08:16:25Z
has_accepted_license: '1'
intvolume: ' 13964'
language:
- iso: eng
month: '07'
oa: 1
oa_version: Published Version
page: 156-169
publication: '35th International Conference on Computer Aided Verification '
publication_identifier:
eissn:
- 1611-3349
isbn:
- '9783031377051'
issn:
- 0302-9743
publication_status: published
publisher: Springer Nature
quality_controlled: '1'
related_material:
record:
- id: '14995'
relation: research_data
status: public
scopus_import: '1'
status: public
title: 'Lincheck: A practical framework for testing concurrent data structures on JVM'
tmp:
image: /images/cc_by.png
legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 13964
year: '2023'
...
---
_id: '14995'
abstract:
- lang: eng
text: "Lincheck is a new practical and user-friendly framework for testing concurrent
data structures on the Java Virtual Machine (JVM). It provides a simple and declarative
way to write concurrent tests. Instead of describing how to perform the test,
users specify what to test by declaring all the operations to examine; the framework
automatically handles the rest. As a result, tests written with Lincheck are concise
and easy to understand. \r\nThe artifact presents a collection of Lincheck tests
that discover new bugs in popular libraries and implementations from the concurrency
literature -- they are listed in Table 1, Section 3. To evaluate the performance
of Lincheck analysis, the collection of tests also includes those which check
correct data structures and, thus, always succeed. Similarly to Table 2, Section
3, the experiments demonstrate the reasonable time to perform a test. Finally,
Lincheck provides user-friendly output with an easy-to-follow trace to reproduce
a detected error, significantly simplifying further investigation."
article_processing_charge: No
author:
- first_name: Nikita
full_name: Koval, Nikita
id: 2F4DB10C-F248-11E8-B48F-1D18A9856A87
last_name: Koval
- first_name: Alexander
full_name: Fedorov, Alexander
id: 2e711909-896a-11ed-bdf8-eb0f5a2984c6
last_name: Fedorov
- first_name: Maria
full_name: Sokolova, Maria
last_name: Sokolova
- first_name: Dmitry
full_name: Tsitelov, Dmitry
last_name: Tsitelov
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Koval N, Fedorov A, Sokolova M, Tsitelov D, Alistarh D-A. Lincheck: A practical
framework for testing concurrent data structures on JVM. 2023. doi:10.5281/ZENODO.7877757'
apa: 'Koval, N., Fedorov, A., Sokolova, M., Tsitelov, D., & Alistarh, D.-A.
(2023). Lincheck: A practical framework for testing concurrent data structures
on JVM. Zenodo. https://doi.org/10.5281/ZENODO.7877757'
chicago: 'Koval, Nikita, Alexander Fedorov, Maria Sokolova, Dmitry Tsitelov, and
Dan-Adrian Alistarh. “Lincheck: A Practical Framework for Testing Concurrent Data
Structures on JVM.” Zenodo, 2023. https://doi.org/10.5281/ZENODO.7877757.'
ieee: 'N. Koval, A. Fedorov, M. Sokolova, D. Tsitelov, and D.-A. Alistarh, “Lincheck:
A practical framework for testing concurrent data structures on JVM.” Zenodo,
2023.'
ista: 'Koval N, Fedorov A, Sokolova M, Tsitelov D, Alistarh D-A. 2023. Lincheck:
A practical framework for testing concurrent data structures on JVM, Zenodo, 10.5281/ZENODO.7877757.'
mla: 'Koval, Nikita, et al. Lincheck: A Practical Framework for Testing Concurrent
Data Structures on JVM. Zenodo, 2023, doi:10.5281/ZENODO.7877757.'
short: N. Koval, A. Fedorov, M. Sokolova, D. Tsitelov, D.-A. Alistarh, (2023).
date_created: 2024-02-14T15:14:13Z
date_published: 2023-04-28T00:00:00Z
date_updated: 2024-02-27T07:46:52Z
day: '28'
ddc:
- '000'
department:
- _id: DaAl
doi: 10.5281/ZENODO.7877757
main_file_link:
- open_access: '1'
url: https://doi.org/10.5281/zenodo.7877757
month: '04'
oa: 1
oa_version: Published Version
publisher: Zenodo
related_material:
record:
- id: '14260'
relation: used_in_publication
status: public
status: public
title: 'Lincheck: A practical framework for testing concurrent data structures on
JVM'
type: research_data_reference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2023'
...
---
_id: '11184'
abstract:
- lang: eng
text: "Let G be a graph on n nodes. In the stochastic population protocol model,
a collection of n indistinguishable, resource-limited nodes collectively solve
tasks via pairwise interactions. In each interaction, two randomly chosen neighbors
first read each other’s states, and then update their local states. A rich line
of research has established tight upper and lower bounds on the complexity of
fundamental tasks, such as majority and leader election, in this model, when G
is a clique. Specifically, in the clique, these tasks can be solved fast, i.e.,
in n polylog n pairwise interactions, with high probability, using at most polylog
n states per node.\r\nIn this work, we consider the more general setting where
G is an arbitrary regular graph, and present a technique for simulating protocols
designed for fully-connected networks in any connected regular graph. Our main
result is a simulation that is efficient on many interesting graph families: roughly,
the simulation overhead is polylogarithmic in the number of nodes, and quadratic
in the conductance of the graph. As a sample application, we show that, in any
regular graph with conductance φ, both leader election and exact majority can
be solved in φ^{-2} ⋅ n polylog n pairwise interactions, with high probability,
using at most φ^{-2} ⋅ polylog n states per node. This shows that there are fast
and space-efficient population protocols for leader election and exact majority
on graphs with good expansion properties. We believe our results will prove generally
useful, as they allow efficient technology transfer between the well-mixed (clique)
case, and the under-explored spatial setting."
acknowledgement: "Dan Alistarh: This project has received funding from the European
Research Council (ERC)\r\nunder the European Union’s Horizon 2020 research and innovation
programme (grant agreement No.805223 ScaleML).\r\nJoel Rybicki: This project has
received from the European Union’s Horizon 2020 research and\r\ninnovation programme
under the Marie Skłodowska-Curie grant agreement No. 840605.\r\nAcknowledgements
We grateful to Giorgi Nadiradze for pointing out a generalisation of the phase clock
construction to non-regular graphs. We also thank anonymous reviewers for their
useful comments on earlier versions of this manuscript."
alternative_title:
- LIPIcs
article_number: '14'
article_processing_charge: No
author:
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
- first_name: Rati
full_name: Gelashvili, Rati
last_name: Gelashvili
- first_name: Joel
full_name: Rybicki, Joel
id: 334EFD2E-F248-11E8-B48F-1D18A9856A87
last_name: Rybicki
orcid: 0000-0002-6432-6646
citation:
ama: 'Alistarh D-A, Gelashvili R, Rybicki J. Fast graphical population protocols.
In: Bramas Q, Gramoli V, Milani A, eds. 25th International Conference on Principles
of Distributed Systems. Vol 217. Schloss Dagstuhl - Leibniz-Zentrum für Informatik;
2022. doi:10.4230/LIPIcs.OPODIS.2021.14'
apa: 'Alistarh, D.-A., Gelashvili, R., & Rybicki, J. (2022). Fast graphical
population protocols. In Q. Bramas, V. Gramoli, & A. Milani (Eds.), 25th
International Conference on Principles of Distributed Systems (Vol. 217).
Strasbourg, France: Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.OPODIS.2021.14'
chicago: Alistarh, Dan-Adrian, Rati Gelashvili, and Joel Rybicki. “Fast Graphical
Population Protocols.” In 25th International Conference on Principles of Distributed
Systems, edited by Quentin Bramas, Vincent Gramoli, and Alessia Milani, Vol.
217. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. https://doi.org/10.4230/LIPIcs.OPODIS.2021.14.
ieee: D.-A. Alistarh, R. Gelashvili, and J. Rybicki, “Fast graphical population
protocols,” in 25th International Conference on Principles of Distributed Systems,
Strasbourg, France, 2022, vol. 217.
ista: Alistarh D-A, Gelashvili R, Rybicki J. 2022. Fast graphical population protocols.
25th International Conference on Principles of Distributed Systems. OPODIS, LIPIcs,
vol. 217, 14.
mla: Alistarh, Dan-Adrian, et al. “Fast Graphical Population Protocols.” 25th
International Conference on Principles of Distributed Systems, edited by Quentin
Bramas et al., vol. 217, 14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik,
2022, doi:10.4230/LIPIcs.OPODIS.2021.14.
short: D.-A. Alistarh, R. Gelashvili, J. Rybicki, in:, Q. Bramas, V. Gramoli, A.
Milani (Eds.), 25th International Conference on Principles of Distributed Systems,
Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
conference:
end_date: 2021-12-15
location: Strasbourg, France
name: OPODIS
start_date: 2021-12-13
date_created: 2022-04-17T22:01:47Z
date_published: 2022-02-01T00:00:00Z
date_updated: 2022-05-02T08:09:39Z
day: '01'
ddc:
- '510'
department:
- _id: DaAl
doi: 10.4230/LIPIcs.OPODIS.2021.14
ec_funded: 1
editor:
- first_name: Quentin
full_name: Bramas, Quentin
last_name: Bramas
- first_name: Vincent
full_name: Gramoli, Vincent
last_name: Gramoli
- first_name: Alessia
full_name: Milani, Alessia
last_name: Milani
external_id:
arxiv:
- '2102.08808'
file:
- access_level: open_access
checksum: 2c7c982174c6f98c4ca6e92539d15086
content_type: application/pdf
creator: dernst
date_created: 2022-05-02T08:06:33Z
date_updated: 2022-05-02T08:06:33Z
file_id: '11346'
file_name: 2022_LIPICs_Alistarh.pdf
file_size: 959406
relation: main_file
success: 1
file_date_updated: 2022-05-02T08:06:33Z
has_accepted_license: '1'
intvolume: ' 217'
language:
- iso: eng
month: '02'
oa: 1
oa_version: Published Version
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
- _id: 26A5D39A-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '840605'
name: Coordination in constrained and natural distributed systems
publication: 25th International Conference on Principles of Distributed Systems
publication_identifier:
isbn:
- '9783959772198'
issn:
- 1868-8969
publication_status: published
publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
quality_controlled: '1'
scopus_import: '1'
status: public
title: Fast graphical population protocols
tmp:
image: /images/cc_by.png
legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 217
year: '2022'
...
---
_id: '12780'
abstract:
- lang: eng
text: "The ability to scale out training workloads has been one of the key performance
enablers of deep learning. The main scaling approach is data-parallel GPU-based
training, which has been boosted by hardware and software support for highly efficient
point-to-point communication, and in particular via hardware bandwidth over-provisioning.
Overprovisioning comes at a cost: there is an order of magnitude price difference
between \"cloud-grade\" servers with such support, relative to their popular \"consumer-grade\"
counterparts, although single server-grade and consumer-grade GPUs can have similar
computational envelopes.\r\n\r\nIn this paper, we show that the costly hardware
overprovisioning approach can be supplanted via algorithmic and system design,
and propose a framework called CGX, which provides efficient software support
for compressed communication in ML applications, for both multi-GPU single-node
training, as well as larger-scale multi-node training. CGX is based on two technical
advances: At the system level, it relies on a re-developed communication stack
for ML frameworks, which provides flexible, highly-efficient support for compressed
communication. At the application level, it provides seamless, parameter-free
integration with popular frameworks, so that end-users do not have to modify training
recipes, nor significant training code. This is complemented by a layer-wise adaptive
compression technique which dynamically balances compression gains with accuracy
preservation. CGX integrates with popular ML frameworks, providing up to 3X speedups
for multi-GPU nodes based on commodity hardware, and order-of-magnitude improvements
in the multi-node setting, with negligible impact on accuracy."
acknowledgement: The authors sincerely thank Nikoli Dryden, Tal Ben-Nun, Torsten Hoefler
and Bapi Chatterjee for useful discussions throughout the development of this project.
article_processing_charge: Yes (via OA deal)
author:
- first_name: Ilia
full_name: Markov, Ilia
id: D0CF4148-C985-11E9-8066-0BDEE5697425
last_name: Markov
- first_name: Hamidreza
full_name: Ramezanikebrya, Hamidreza
last_name: Ramezanikebrya
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Markov I, Ramezanikebrya H, Alistarh D-A. CGX: Adaptive system support for
communication-efficient deep learning. In: Proceedings of the 23rd ACM/IFIP
International Middleware Conference. Association for Computing Machinery;
2022:241-254. doi:10.1145/3528535.3565248'
apa: 'Markov, I., Ramezanikebrya, H., & Alistarh, D.-A. (2022). CGX: Adaptive
system support for communication-efficient deep learning. In Proceedings of
the 23rd ACM/IFIP International Middleware Conference (pp. 241–254). Quebec,
QC, Canada: Association for Computing Machinery. https://doi.org/10.1145/3528535.3565248'
chicago: 'Markov, Ilia, Hamidreza Ramezanikebrya, and Dan-Adrian Alistarh. “CGX:
Adaptive System Support for Communication-Efficient Deep Learning.” In Proceedings
of the 23rd ACM/IFIP International Middleware Conference, 241–54. Association
for Computing Machinery, 2022. https://doi.org/10.1145/3528535.3565248.'
ieee: 'I. Markov, H. Ramezanikebrya, and D.-A. Alistarh, “CGX: Adaptive system support
for communication-efficient deep learning,” in Proceedings of the 23rd ACM/IFIP
International Middleware Conference, Quebec, QC, Canada, 2022, pp. 241–254.'
ista: 'Markov I, Ramezanikebrya H, Alistarh D-A. 2022. CGX: Adaptive system support
for communication-efficient deep learning. Proceedings of the 23rd ACM/IFIP International
Middleware Conference. Middleware: International Middleware Conference, 241–254.'
mla: 'Markov, Ilia, et al. “CGX: Adaptive System Support for Communication-Efficient
Deep Learning.” Proceedings of the 23rd ACM/IFIP International Middleware Conference,
Association for Computing Machinery, 2022, pp. 241–54, doi:10.1145/3528535.3565248.'
short: I. Markov, H. Ramezanikebrya, D.-A. Alistarh, in:, Proceedings of the 23rd
ACM/IFIP International Middleware Conference, Association for Computing Machinery,
2022, pp. 241–254.
conference:
end_date: 2022-11-11
location: Quebec, QC, Canada
name: 'Middleware: International Middleware Conference'
start_date: 2022-11-07
date_created: 2023-03-31T06:17:00Z
date_published: 2022-11-01T00:00:00Z
date_updated: 2023-04-03T06:21:04Z
day: '01'
ddc:
- '000'
department:
- _id: DaAl
doi: 10.1145/3528535.3565248
external_id:
arxiv:
- '2111.08617'
file:
- access_level: open_access
checksum: 1a397746235f245da5468819247ff663
content_type: application/pdf
creator: dernst
date_created: 2023-04-03T06:17:58Z
date_updated: 2023-04-03T06:17:58Z
file_id: '12795'
file_name: 2022_ACMMiddleware_Markov.pdf
file_size: 1514169
relation: main_file
success: 1
file_date_updated: 2023-04-03T06:17:58Z
has_accepted_license: '1'
language:
- iso: eng
month: '11'
oa: 1
oa_version: Published Version
page: 241-254
publication: Proceedings of the 23rd ACM/IFIP International Middleware Conference
publication_identifier:
isbn:
- '9781450393409'
publication_status: published
publisher: Association for Computing Machinery
quality_controlled: '1'
status: public
title: 'CGX: Adaptive system support for communication-efficient deep learning'
tmp:
image: /images/cc_by.png
legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2022'
...
---
_id: '11844'
abstract:
- lang: eng
text: "In the stochastic population protocol model, we are given a connected graph
with n nodes, and in every time step, a scheduler samples an edge of the graph
uniformly at random and the nodes connected by this edge interact. A fundamental
task in this model is stable leader election, in which all nodes start in an identical
state and the aim is to reach a configuration in which (1) exactly one node is
elected as leader and (2) this node remains as the unique leader no matter what
sequence of interactions follows. On cliques, the complexity of this problem has
recently been settled: time-optimal protocols stabilize in Θ(n log n) expected
steps using Θ(log log n) states, whereas protocols that use O(1) states require
Θ(n2) expected steps.\r\n\r\nIn this work, we investigate the complexity of stable
leader election on general graphs. We provide the first non-trivial time lower
bounds for leader election on general graphs, showing that, when moving beyond
cliques, the complexity landscape of leader election becomes very diverse: the
time required to elect a leader can range from O(1) to Θ(n3) expected steps. On
the upper bound side, we first observe that there exists a protocol that is time-optimal
on many graph families, but uses polynomially-many states. In contrast, we give
a near-time-optimal protocol that uses only O(log2n) states that is at most a
factor log n slower. Finally, we show that the constant-state protocol of Beauquier
et al. [OPODIS 2013] is at most a factor n log n slower than the fast polynomial-state
protocol. Moreover, among constant-state protocols, this protocol has near-optimal
average case complexity on dense random graphs."
acknowledgement: We thank the anonymous reviewers for their helpful comments. We gratefully
acknowledge funding from the European Research Council (ERC) under the European
Union’s Horizon 2020 research and innovation programme (grant agreement No 805223
ScaleML).
article_processing_charge: Yes (via OA deal)
author:
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
- first_name: Joel
full_name: Rybicki, Joel
id: 334EFD2E-F248-11E8-B48F-1D18A9856A87
last_name: Rybicki
orcid: 0000-0002-6432-6646
- first_name: Sasha
full_name: Voitovych, Sasha
last_name: Voitovych
citation:
ama: 'Alistarh D-A, Rybicki J, Voitovych S. Near-optimal leader election in population
protocols on graphs. In: Proceedings of the Annual ACM Symposium on Principles
of Distributed Computing. Association for Computing Machinery; 2022:246-256.
doi:10.1145/3519270.3538435'
apa: 'Alistarh, D.-A., Rybicki, J., & Voitovych, S. (2022). Near-optimal leader
election in population protocols on graphs. In Proceedings of the Annual ACM
Symposium on Principles of Distributed Computing (pp. 246–256). Salerno, Italy:
Association for Computing Machinery. https://doi.org/10.1145/3519270.3538435'
chicago: Alistarh, Dan-Adrian, Joel Rybicki, and Sasha Voitovych. “Near-Optimal
Leader Election in Population Protocols on Graphs.” In Proceedings of the Annual
ACM Symposium on Principles of Distributed Computing, 246–56. Association
for Computing Machinery, 2022. https://doi.org/10.1145/3519270.3538435.
ieee: D.-A. Alistarh, J. Rybicki, and S. Voitovych, “Near-optimal leader election
in population protocols on graphs,” in Proceedings of the Annual ACM Symposium
on Principles of Distributed Computing, Salerno, Italy, 2022, pp. 246–256.
ista: 'Alistarh D-A, Rybicki J, Voitovych S. 2022. Near-optimal leader election
in population protocols on graphs. Proceedings of the Annual ACM Symposium on
Principles of Distributed Computing. PODC: Symposium on Principles of Distributed
Computing, 246–256.'
mla: Alistarh, Dan-Adrian, et al. “Near-Optimal Leader Election in Population Protocols
on Graphs.” Proceedings of the Annual ACM Symposium on Principles of Distributed
Computing, Association for Computing Machinery, 2022, pp. 246–56, doi:10.1145/3519270.3538435.
short: D.-A. Alistarh, J. Rybicki, S. Voitovych, in:, Proceedings of the Annual
ACM Symposium on Principles of Distributed Computing, Association for Computing
Machinery, 2022, pp. 246–256.
conference:
end_date: 2022-07-29
location: Salerno, Italy
name: 'PODC: Symposium on Principles of Distributed Computing'
start_date: 2022-07-25
date_created: 2022-08-14T22:01:46Z
date_published: 2022-07-21T00:00:00Z
date_updated: 2023-06-14T12:06:01Z
day: '21'
ddc:
- '000'
department:
- _id: DaAl
doi: 10.1145/3519270.3538435
ec_funded: 1
external_id:
arxiv:
- '2205.12597'
file:
- access_level: open_access
checksum: 4c6b29172b8e355b4fbc364a2e0827b2
content_type: application/pdf
creator: cchlebak
date_created: 2022-08-16T08:05:15Z
date_updated: 2022-08-16T08:05:15Z
file_id: '11854'
file_name: 2022_PODC_Alistarh.pdf
file_size: 1593474
relation: main_file
success: 1
file_date_updated: 2022-08-16T08:05:15Z
has_accepted_license: '1'
language:
- iso: eng
month: '07'
oa: 1
oa_version: Published Version
page: 246-256
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
publication: Proceedings of the Annual ACM Symposium on Principles of Distributed
Computing
publication_identifier:
isbn:
- '9781450392624'
publication_status: published
publisher: Association for Computing Machinery
quality_controlled: '1'
scopus_import: '1'
status: public
title: Near-optimal leader election in population protocols on graphs
tmp:
image: /images/cc_by.png
legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
short: CC BY (4.0)
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2022'
...
---
_id: '11181'
abstract:
- lang: eng
text: 'To maximize the performance of concurrent data structures, researchers have
often turned to highly complex fine-grained techniques, resulting in efficient
and elegant algorithms, which can however be often difficult to understand and
prove correct. While simpler techniques exist, such as transactional memory, they
can have limited performance or portability relative to their fine-grained counterparts.
Approaches at both ends of this complexity-performance spectrum have been extensively
explored, but relatively less is known about the middle ground: approaches that
are willing to sacrifice some performance for simplicity, while remaining competitive
with state-of-the-art handcrafted designs. In this paper, we explore this middle
ground, and present PathCAS, a primitive that combines ideas from multi-word CAS
(KCAS) and transactional memory approaches, while carefully avoiding overhead.
We show how PathCAS can be used to implement efficient search data structures
relatively simply, using an internal binary search tree as an example, then extending
this to an AVL tree. Our best implementations outperform many handcrafted search
trees: in search-heavy workloads, it rivals the BCCO tree [5], the fastest known
concurrent binary tree in terms of search performance [3]. Our results suggest
that PathCAS can yield concurrent data structures that are relatively easy to
build and prove correct, while offering surprisingly high performance.'
acknowledgement: "This work was supported by: the Natural Sciences and Engineering
Research Council of Canada (NSERC) Collaborative Research and Development grant:
CRDPJ 539431-19, the\r\nCanada Foundation for Innovation John R. Evans Leaders Fund
with equal support from the Ontario Research Fund CFI Leaders Opportunity Fund:
38512, Waterloo Huawei Joint Innovation Lab project “Scalable Infrastructure for
Next Generation Data Management Systems”, NSERC Discovery Launch Supplement: DGECR-2019-00048,
NSERC Discovery\r\nProgram under the grants: RGPIN-2019-04227 and RGPIN04512-2018,
and the University of Waterloo. We would also like to thank the reviewers for their
insightful comments."
article_processing_charge: No
author:
- first_name: Trevor A
full_name: Brown, Trevor A
id: 3569F0A0-F248-11E8-B48F-1D18A9856A87
last_name: Brown
- first_name: William
full_name: Sigouin, William
last_name: Sigouin
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Brown TA, Sigouin W, Alistarh D-A. PathCAS: An efficient middle ground for
concurrent search data structures. In: Proceedings of the 27th ACM SIGPLAN
Symposium on Principles and Practice of Parallel Programming. Association
for Computing Machinery; 2022:385-399. doi:10.1145/3503221.3508410'
apa: 'Brown, T. A., Sigouin, W., & Alistarh, D.-A. (2022). PathCAS: An efficient
middle ground for concurrent search data structures. In Proceedings of the
27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
(pp. 385–399). Seoul, Republic of Korea: Association for Computing Machinery.
https://doi.org/10.1145/3503221.3508410'
chicago: 'Brown, Trevor A, William Sigouin, and Dan-Adrian Alistarh. “PathCAS: An
Efficient Middle Ground for Concurrent Search Data Structures.” In Proceedings
of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming,
385–99. Association for Computing Machinery, 2022. https://doi.org/10.1145/3503221.3508410.'
ieee: 'T. A. Brown, W. Sigouin, and D.-A. Alistarh, “PathCAS: An efficient middle
ground for concurrent search data structures,” in Proceedings of the 27th ACM
SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul,
Republic of Korea, 2022, pp. 385–399.'
ista: 'Brown TA, Sigouin W, Alistarh D-A. 2022. PathCAS: An efficient middle ground
for concurrent search data structures. Proceedings of the 27th ACM SIGPLAN Symposium
on Principles and Practice of Parallel Programming. PPoPP: Sympopsium on Principles
and Practice of Parallel Programming, 385–399.'
mla: 'Brown, Trevor A., et al. “PathCAS: An Efficient Middle Ground for Concurrent
Search Data Structures.” Proceedings of the 27th ACM SIGPLAN Symposium on Principles
and Practice of Parallel Programming, Association for Computing Machinery,
2022, pp. 385–99, doi:10.1145/3503221.3508410.'
short: T.A. Brown, W. Sigouin, D.-A. Alistarh, in:, Proceedings of the 27th ACM
SIGPLAN Symposium on Principles and Practice of Parallel Programming, Association
for Computing Machinery, 2022, pp. 385–399.
conference:
end_date: 2022-04-06
location: Seoul, Republic of Korea
name: 'PPoPP: Sympopsium on Principles and Practice of Parallel Programming'
start_date: 2022-04-02
date_created: 2022-04-17T22:01:46Z
date_published: 2022-04-02T00:00:00Z
date_updated: 2023-08-03T06:49:20Z
day: '02'
ddc:
- '000'
department:
- _id: DaAl
doi: 10.1145/3503221.3508410
external_id:
isi:
- '000883318200027'
file:
- access_level: open_access
checksum: 8ceea411fa133795cd4903529498eb6b
content_type: application/pdf
creator: dernst
date_created: 2022-08-05T09:19:29Z
date_updated: 2022-08-05T09:19:29Z
file_id: '11731'
file_name: 2022_PPoPP_Brown.pdf
file_size: 1128343
relation: main_file
success: 1
file_date_updated: 2022-08-05T09:19:29Z
has_accepted_license: '1'
isi: 1
language:
- iso: eng
month: '04'
oa: 1
oa_version: Published Version
page: 385-399
publication: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice
of Parallel Programming
publication_identifier:
isbn:
- '9781450392044'
publication_status: published
publisher: Association for Computing Machinery
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'PathCAS: An efficient middle ground for concurrent search data structures'
tmp:
image: /images/cc_by.png
legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
short: CC BY (4.0)
type: conference
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
year: '2022'
...
---
_id: '11180'
abstract:
- lang: eng
text: "Designing and implementing efficient parallel priority schedulers is an active
research area. An intriguing proposed design is the Multi-Queue: given n threads
and m ≥ n distinct priority queues, task insertions are performed uniformly at
random, while, to delete, a thread picks two queues uniformly at random, and removes
the observed task of higher priority. This approach scales well, and has probabilistic
rank guarantees: roughly, the rank of each task removed, relative to remaining
tasks in all other queues, is O (m) in expectation. Yet, the performance of this
pattern is below that of well-engineered schedulers, which eschew theoretical
guarantees for practical efficiency.\r\n\r\nWe investigate whether it is possible
to design and implement a Multi-Queue-based task scheduler that is both highly-efficient
and has analytical guarantees. We propose a new variant called the Stealing Multi-Queue
(SMQ), a cache-efficient variant of the Multi-Queue, which leverages both queue
affinity---each thread has a local queue, from which tasks are usually removed;
but, with some probability, threads also attempt to steal higher-priority tasks
from the other queues---and task batching, that is, the processing of several
tasks in a single insert / remove step. These ideas are well-known for task scheduling
without priorities; our theoretical contribution is showing that, despite relaxations,
this design can still provide rank guarantees, which in turn implies bounds on
total work performed. We provide a general SMQ implementation which can surpass
state-of-the-art schedulers such as OBIM and PMOD in terms of performance on popular
graph-processing benchmarks. Notably, the performance improvement comes mainly
from the superior rank guarantees provided by our scheduler, confirming that analytically-reasoned
approaches can still provide performance improvements for priority task scheduling."
acknowledgement: We would like to thank the anonymous reviewers for their useful comments.
This project has received funding from the European Research Council (ERC) under
the European Union’s Horizon 2020 research and innovation programme (grant agreement
No 805223 ScaleML).
article_processing_charge: No
author:
- first_name: Anastasiia
full_name: Postnikova, Anastasiia
last_name: Postnikova
- first_name: Nikita
full_name: Koval, Nikita
id: 2F4DB10C-F248-11E8-B48F-1D18A9856A87
last_name: Koval
- first_name: Giorgi
full_name: Nadiradze, Giorgi
id: 3279A00C-F248-11E8-B48F-1D18A9856A87
last_name: Nadiradze
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: 'Postnikova A, Koval N, Nadiradze G, Alistarh D-A. Multi-queues can be state-of-the-art
priority schedulers. In: Proceedings of the 27th ACM SIGPLAN Symposium on Principles
and Practice of Parallel Programming. Association for Computing Machinery;
2022:353-367. doi:10.1145/3503221.3508432'
apa: 'Postnikova, A., Koval, N., Nadiradze, G., & Alistarh, D.-A. (2022). Multi-queues
can be state-of-the-art priority schedulers. In Proceedings of the 27th ACM
SIGPLAN Symposium on Principles and Practice of Parallel Programming (pp.
353–367). Seoul, Republic of Korea: Association for Computing Machinery. https://doi.org/10.1145/3503221.3508432'
chicago: Postnikova, Anastasiia, Nikita Koval, Giorgi Nadiradze, and Dan-Adrian
Alistarh. “Multi-Queues Can Be State-of-the-Art Priority Schedulers.” In Proceedings
of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming,
353–67. Association for Computing Machinery, 2022. https://doi.org/10.1145/3503221.3508432.
ieee: A. Postnikova, N. Koval, G. Nadiradze, and D.-A. Alistarh, “Multi-queues can
be state-of-the-art priority schedulers,” in Proceedings of the 27th ACM SIGPLAN
Symposium on Principles and Practice of Parallel Programming, Seoul, Republic
of Korea, 2022, pp. 353–367.
ista: 'Postnikova A, Koval N, Nadiradze G, Alistarh D-A. 2022. Multi-queues can
be state-of-the-art priority schedulers. Proceedings of the 27th ACM SIGPLAN Symposium
on Principles and Practice of Parallel Programming. PPoPP: Sympopsium on Principles
and Practice of Parallel Programming, 353–367.'
mla: Postnikova, Anastasiia, et al. “Multi-Queues Can Be State-of-the-Art Priority
Schedulers.” Proceedings of the 27th ACM SIGPLAN Symposium on Principles and
Practice of Parallel Programming, Association for Computing Machinery, 2022,
pp. 353–67, doi:10.1145/3503221.3508432.
short: A. Postnikova, N. Koval, G. Nadiradze, D.-A. Alistarh, in:, Proceedings of
the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming,
Association for Computing Machinery, 2022, pp. 353–367.
conference:
end_date: 2022-04-06
location: Seoul, Republic of Korea
name: 'PPoPP: Sympopsium on Principles and Practice of Parallel Programming'
start_date: 2022-04-02
date_created: 2022-04-17T22:01:46Z
date_published: 2022-04-02T00:00:00Z
date_updated: 2023-08-03T06:48:35Z
day: '02'
department:
- _id: DaAl
doi: 10.1145/3503221.3508432
ec_funded: 1
external_id:
arxiv:
- '2109.00657'
isi:
- '000883318200025'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
url: ' https://doi.org/10.48550/arXiv.2109.00657'
month: '04'
oa: 1
oa_version: Preprint
page: 353-367
project:
- _id: 268A44D6-B435-11E9-9278-68D0E5697425
call_identifier: H2020
grant_number: '805223'
name: Elastic Coordination for Scalable Machine Learning
publication: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice
of Parallel Programming
publication_identifier:
isbn:
- '9781450392044'
publication_status: published
publisher: Association for Computing Machinery
quality_controlled: '1'
related_material:
record:
- id: '13076'
relation: research_data
status: public
scopus_import: '1'
status: public
title: Multi-queues can be state-of-the-art priority schedulers
type: conference
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
year: '2022'
...
---
_id: '13076'
abstract:
- lang: eng
text: "The source code for replicating experiments presented in the paper.\r\n\r\nThe
implementation of the designed priority schedulers can be found in Galois-2.2.1/include/Galois/WorkList/:\r\nStealingMultiQueue.h
is the StealingMultiQueue.\r\nMQOptimized/ contains MQ Optimized variants.\r\n\r\nWe
provide images that contain all the dependencies and datasets. Images can be pulled
from npostnikova/mq-based-schedulers repository, or downloaded from Zenodo. See
readme for more detail."
article_processing_charge: No
author:
- first_name: Anastasiia
full_name: Postnikova, Anastasiia
last_name: Postnikova
- first_name: Nikita
full_name: Koval, Nikita
id: 2F4DB10C-F248-11E8-B48F-1D18A9856A87
last_name: Koval
- first_name: Giorgi
full_name: Nadiradze, Giorgi
id: 3279A00C-F248-11E8-B48F-1D18A9856A87
last_name: Nadiradze
- first_name: Dan-Adrian
full_name: Alistarh, Dan-Adrian
id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
last_name: Alistarh
orcid: 0000-0003-3650-940X
citation:
ama: Postnikova A, Koval N, Nadiradze G, Alistarh D-A. Multi-queues can be state-of-the-art
priority schedulers. 2022. doi:10.5281/ZENODO.5733408
apa: Postnikova, A., Koval, N., Nadiradze, G., & Alistarh, D.-A. (2022). Multi-queues
can be state-of-the-art priority schedulers. Zenodo. https://doi.org/10.5281/ZENODO.5733408
chicago: Postnikova, Anastasiia, Nikita Koval, Giorgi Nadiradze, and Dan-Adrian
Alistarh. “Multi-Queues Can Be State-of-the-Art Priority Schedulers.” Zenodo,
2022. https://doi.org/10.5281/ZENODO.5733408.
ieee: A. Postnikova, N. Koval, G. Nadiradze, and D.-A. Alistarh, “Multi-queues can
be state-of-the-art priority schedulers.” Zenodo, 2022.
ista: Postnikova A, Koval N, Nadiradze G, Alistarh D-A. 2022. Multi-queues can be
state-of-the-art priority schedulers, Zenodo, 10.5281/ZENODO.5733408.
mla: Postnikova, Anastasiia, et al. Multi-Queues Can Be State-of-the-Art Priority
Schedulers. Zenodo, 2022, doi:10.5281/ZENODO.5733408.
short: A. Postnikova, N. Koval, G. Nadiradze, D.-A. Alistarh, (2022).
date_created: 2023-05-23T17:05:40Z
date_published: 2022-01-03T00:00:00Z
date_updated: 2023-08-03T06:48:34Z
day: '03'
ddc:
- '510'
department:
- _id: DaAl
doi: 10.5281/ZENODO.5733408
main_file_link:
- open_access: '1'
url: https://doi.org/10.5281/zenodo.5813846
month: '01'
oa: 1
oa_version: Published Version
publisher: Zenodo
related_material:
link:
- relation: software
url: https://github.com/npostnikova/mq-based-schedulers/tree/v1.1
record:
- id: '11180'
relation: used_in_publication
status: public
status: public
title: Multi-queues can be state-of-the-art priority schedulers
type: research_data_reference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
year: '2022'
...