This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

Search for Publication

Year(s) from:  to 
Keywords (separated by spaces):

One-Shot Video Object Segmentation

S. Caelles, K.K. Maninis, J. Pont-Tuset, L. Leal-Taixé, D. Cremers, L. Van Gool


This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame. We present One-Shot Video Object Segmentation (OSVOS), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one-shot). Although all frames are processed independently, the results are temporally coherent and stable. We perform experiments on two annotated video segmentation databases, which show that OSVOS is fast and improves the state of the art by a significant margin 79.8% vs 68.0%).

Download in pdf format
  author = {S. Caelles and K.K. Maninis and J. Pont-Tuset and L. Leal-Taixé and D. Cremers and L. Van Gool},
  title = {One-Shot Video Object Segmentation},
  booktitle = {CVPR},
  year = {2017},
  keywords = {}