Unsupervised object-centric video generation and decomposition in 3D

Henderson PM, Lampert C. 2020. Unsupervised object-centric video generation and decomposition in 3D. 34th Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems vol. 33, 3106–3117.

Conference Paper | Published | English
Abstract
A natural approach to generative modeling of videos is to represent them as a composition of moving objects. Recent works model a set of 2D sprites over a slowly varying background, but without considering the underlying 3D scene that gives rise to them. We instead propose to model a video as the view seen while moving through a scene with multiple 3D objects and a 3D background. Our model is trained from monocular videos without any supervision, yet learns to generate coherent 3D scenes containing several moving objects. We conduct detailed experiments on two datasets, going beyond the visual complexity supported by state-of-the-art generative approaches. We evaluate our method on depth prediction and 3D object detection, tasks that cannot be addressed by those earlier works, and show that it outperforms them even on 2D instance segmentation and tracking.
Date Published
2020-07-07
Proceedings Title
34th Conference on Neural Information Processing Systems
Acknowledgement
This research was supported by the Scientific Service Units (SSU) of IST Austria through resources provided by Scientific Computing (SciComp). PH is employed part-time by Blackford Analysis, but they did not support this project in any way.
Volume
33
Page
3106–3117
Conference
NeurIPS: Neural Information Processing Systems
Conference Location
Vancouver, Canada
Conference Date
2020-12-06 – 2020-12-12
Cite this

Henderson PM, Lampert C. Unsupervised object-centric video generation and decomposition in 3D. In: 34th Conference on Neural Information Processing Systems. Vol 33. Curran Associates; 2020:3106–3117.
Henderson, P. M., & Lampert, C. (2020). Unsupervised object-centric video generation and decomposition in 3D. In 34th Conference on Neural Information Processing Systems (Vol. 33, pp. 3106–3117). Vancouver, Canada: Curran Associates.
Henderson, Paul M, and Christoph Lampert. “Unsupervised Object-Centric Video Generation and Decomposition in 3D.” In 34th Conference on Neural Information Processing Systems, 33:3106–3117. Curran Associates, 2020.
P. M. Henderson and C. Lampert, “Unsupervised object-centric video generation and decomposition in 3D,” in 34th Conference on Neural Information Processing Systems, Vancouver, Canada, 2020, vol. 33, pp. 3106–3117.
Henderson, Paul M., and Christoph Lampert. “Unsupervised Object-Centric Video Generation and Decomposition in 3D.” 34th Conference on Neural Information Processing Systems, vol. 33, Curran Associates, 2020, pp. 3106–3117.
Access Level
OA Open Access

Sources

arXiv 2007.06705
