The recent success of deep learning has shown that a deep architecture in conjunction with abundant quantities of labeled training data is the most promising approach for many vision tasks. However, annotating a large-scale dataset for training such deep neural networks is costly and time-consuming, even with the availability of scalable crowdsourcing platforms like Amazon’s Mechanical Turk. As a result, there are relatively few public large-scale datasets (e.g., ImageNet and Places2) from which it is possible to learn generic visual representations from scratch.
Thus, it is unsurprising that there is continued interest in developing novel deep learning systems that are trained on low-cost data for image and video recognition tasks. Among the different solutions, crawling data from the Internet and using the web as a source of supervision for learning deep representations has shown promising performance in a variety of important computer vision applications. However, the datasets and tasks differ in various ways, which makes it difficult to evaluate different solutions fairly and to identify the key issues in learning from web data.
This workshop aims to promote advances in learning state-of-the-art visual models directly from the web and to bring together computer vision researchers in this field. To this end, we release a large-scale web image dataset named WebVision for visual understanding by learning from web data. The dataset consists of 16 million web images crawled from the Internet for 5,000 visual concepts. A validation set consisting of around 290K images with human annotation will be provided for the convenience of algorithmic development.
Based on this dataset, we also organize the 3rd Challenge on Visual Understanding by Learning from Web Data. The final results will be announced at the workshop, and the winners will be invited to present their approaches there. An invited paper track will also be included in the workshop.
News 03.03.2019: A benchmark model based on ResNet-50 has been released for reference; it achieves 71.49% top-5 accuracy on the validation set. We thank Mr. Qin Wang for producing this benchmark model.
News 26.02.2019: The WebVision 2019 challenge will start on March 1st, 2019.
News 03.01.2019: The workshop website is now online.
|Event|Date|
|---|---|
|Challenge Launch Date|March 1, 2019|
|Challenge Submissions Deadline|June 7, 2019|
|Challenge Award Notification|June 10, 2019|
|Paper Submission Deadline|May 15, 2019|
|Paper Notification|May 30, 2019|
|Workshop Date (co-located with CVPR'19)|June 16, 2019|
All deadlines are at 23:59 Pacific Standard Time.
Researchers are invited to participate in the WebVision challenge, which aims to advance the area of mining knowledge from noisy web images and meta information. The challenge is based on the WebVision 2.0 dataset, which contains a training set, a validation set, and a test set. The training set is downloaded from the web without any human annotation. The validation and test sets are human annotated; the labels of the validation data are provided, and the labels of the test data are withheld. To imitate the setting of learning from web data, participants are required to train their models solely on the training set and submit classification results on the test set. Accordingly, the validation data and labels may be used only to validate models and must not be used to learn the model weights.
The WebVision dataset provides the web images and their corresponding meta information (e.g., query, title, and comments). Detailed information regarding the dataset can be found at the dataset page. Learning from web data poses several challenges such as
Participants are encouraged to design new methods to address these challenges.
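The benchmark model is reported in terms of top-5 accuracy on the validation set. As a rough illustration of how participants might check their own models against that figure, the following is a minimal sketch of a top-5 accuracy computation. The function name `top_k_accuracy`, the toy scores, and the six-class setup are assumptions made for illustration only; this is not the official evaluation code, and the exact scoring protocol used for the test set is not described here.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores[i][c] is the model's score for class c on sample i (hypothetical layout).
    labels[i] is the ground-truth class index of sample i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy validation data with 6 classes (illustrative values, not WebVision data).
val_scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.12],  # true class 2 has the highest score
    [0.30, 0.25, 0.20, 0.15, 0.08, 0.02],  # true class 5 has the lowest score
]
val_labels = [2, 5]

print(top_k_accuracy(val_scores, val_labels, k=5))  # prints 0.5
```

In keeping with the challenge rules, a check like this would be run on the validation set only for model selection, with no validation labels used to fit the model weights.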
A poster session will be held at the workshop. The goal is to provide a stimulating space for researchers to share their work with scientific peers. We welcome submissions of recent work on any topic related to learning from web data.