Recognizing multimodal entailment

Ilharco C, Shirazi A, Gopalan A, Nagrani A, Bratanič B, Bregler C, Liu C, Ferreira F, Barcik G, Ilharco G, Osang GF, Bulian J, Frank J, Smaira L, Cao Q, Marino R, Patel R, Leung T, Imbrasaite V. 2021. Recognizing multimodal entailment. 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts. ACL: Association for Computational Linguistics ; IJCNLP: International Joint Conference on Natural Language Processing, 29–30.

Download
OA 2021_ACL_Ilharco.pdf 1.23 MB

Conference Paper | Published | English

Scopus indexed
Author
Ilharco, Cesar; Shirazi, Afsaneh; Gopalan, Arjun; Nagrani, Arsha; Bratanič, Blaž; Bregler, Chris; Liu, Christina; Ferreira, Felipe; Barcik, Gabriel; Ilharco, Gabriel; Osang, Georg F; Bulian, Jannis; et al.
Abstract
How information is created, shared and consumed has changed rapidly in recent decades, in part thanks to new social platforms and technologies on the web. With ever-larger amounts of unstructured data and limited labels, organizing and reconciling information from different sources and modalities is a central challenge in machine learning. This cutting-edge tutorial aims to introduce the multimodal entailment task, which can be useful for detecting semantic alignments when a single modality alone does not suffice for understanding the whole content. Starting with a brief overview of natural language processing, computer vision, structured data and neural graph learning, we lay the foundations for the multimodal sections to follow. We then discuss recent multimodal learning literature covering visual, audio and language streams, and explore case studies focusing on tasks which require fine-grained understanding of visual and linguistic semantics: question answering, veracity and hatred classification. Finally, we introduce a new dataset for recognizing multimodal entailment, exploring it in a hands-on collaborative section. Overall, this tutorial gives an overview of multimodal learning, introduces a multimodal entailment dataset, and encourages future research on the topic.
Publishing Year
2021
Date Published
2021-08-01
Proceedings Title
59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts
Acknowledgement
We would like to thank Abby Schantz, Abe Ittycheriah, Aliaksei Severyn, Allan Heydon, Aly Grealish, Andrey Vlasov, Arkaitz Zubiaga, Ashwin Kakarla, Chen Sun, Clayton Williams, Cong Yu, Cordelia Schmid, Da-Cheng Juan, Dan Finnie, Dani Valevski, Daniel Rocha, David Price, David Sklar, Devi Krishna, Elena Kochkina, Enrique Alfonseca, Françoise Beaufays, Isabelle Augenstein, Jialu Liu, John Cantwell, John Palowitch, Jordan Boyd-Graber, Lei Shi, Luis Valente, Maria Voitovich, Mehmet Aktuna, Mogan Brown, Mor Naaman, Natalia P, Nidhi Hebbar, Pete Aykroyd, Rahul Sukthankar, Richa Dixit, Steve Pucci, Tania Bedrax-Weiss, Tobias Kaufmann, Tom Boulos, Tu Tsao, Vladimir Chtchetkine, Yair Kurzion, Yifan Xu and Zach Hynes.
Page
29-30
Conference
ACL: Association for Computational Linguistics ; IJCNLP: International Joint Conference on Natural Language Processing
Conference Location
Bangkok, Thailand
Conference Date
2021-08-01 – 2021-08-06

Cite this

Ilharco C, Shirazi A, Gopalan A, et al. Recognizing multimodal entailment. In: 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts. Association for Computational Linguistics; 2021:29-30. doi:10.18653/v1/2021.acl-tutorials.6
Ilharco, C., Shirazi, A., Gopalan, A., Nagrani, A., Bratanič, B., Bregler, C., … Imbrasaite, V. (2021). Recognizing multimodal entailment. In 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts (pp. 29–30). Bangkok, Thailand: Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-tutorials.6
Ilharco, Cesar, Afsaneh Shirazi, Arjun Gopalan, Arsha Nagrani, Blaž Bratanič, Chris Bregler, Christina Liu, et al. “Recognizing Multimodal Entailment.” In 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts, 29–30. Association for Computational Linguistics, 2021. https://doi.org/10.18653/v1/2021.acl-tutorials.6.
C. Ilharco et al., “Recognizing multimodal entailment,” in 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts, Bangkok, Thailand, 2021, pp. 29–30.
Ilharco, Cesar, et al. “Recognizing Multimodal Entailment.” 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Tutorial Abstracts, Association for Computational Linguistics, 2021, pp. 29–30, doi:10.18653/v1/2021.acl-tutorials.6.
All files available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC-BY 4.0):
Main File(s)
File Name
2021_ACL_Ilharco.pdf
Access Level
OA Open Access
Date Uploaded
2021-11-29
MD5 Checksum
b14052a025a6ecf675bdfe51db98c0d7


Link(s) to Main File(s)
Access Level
OA Open Access
