Below is a selection of datasets that we maintain. Unless stated otherwise, all data is for research purposes only. Please cite the authors properly when using the data.
GeoZurich: Street-side dataset of the city of Zurich
The dataset, named CVL GeoZurich 2018, consists of about 3 million high-quality images, spanning 70 km of the drivable street network of Zurich. It was captured with a rigid 16-camera setup comprising 4 stereo pairs and 8 additional viewpoints.
This dataset is not publicly available.
AirZurich: Aerial imagery dataset of the city of Zurich
The dataset, named CVL AirZurich 2018, consists of about 830 high-quality aerial images covering the city of Zurich. It comprises a GPS-registered flyover path and 16-bit RGB TIFF images.
This dataset is not publicly available.
DAVIS: Densely Annotated VIdeo Segmentation 2017
The dataset, named DAVIS 2017 (Densely Annotated VIdeo Segmentation), consists of 150 high-quality video sequences, spanning multiple occurrences of common video object segmentation challenges such as occlusions, motion blur and appearance changes. Each video is accompanied by densely annotated, pixel-accurate and per-frame ground truth segmentation of multiple objects.
Information, download and evaluation code of DAVIS 2017
DAVIS: Densely Annotated VIdeo Segmentation 2016
The dataset, named DAVIS 2016 (Densely Annotated VIdeo Segmentation), consists of fifty high-quality, Full HD video sequences, spanning multiple occurrences of common video object segmentation challenges such as occlusions, motion blur and appearance changes. Each video is accompanied by densely annotated, pixel-accurate and per-frame ground truth segmentation of a single object.
Information, download and evaluation code of DAVIS 2016
IMDB-WIKI – 500k+ face images with age and gender labels
The IMDB-WIKI dataset contains more than 500k face images with gender and age labels for training. We provide pre-trained models for both age and gender prediction. Our method for age estimation was pre-trained on IMDB-WIKI and won 1st place in the ChaLearn LAP 2015 challenge on apparent age estimation, which had more than 115 registered teams, significantly outperforming the human reference.
Information and download page for IMDB-WIKI dataset and pre-trained models
A dataset for large-scale texture synthesis. It contains 21,302 texture examples. All of them are annotated in terms of their synthesizability: the ‘goodness’ of the results synthesized by four popular example-based texture synthesis methods.
Information, code and download page
A dataset for recognition of pictured dishes. It contains 101 food categories with 101'000 images in total.
ETHZ RueMonge 2014
Semantic 3D models, e.g. of cities, are usually derived from classifying 2D images. The 3D challenge pushes the frontiers of 3D modelling and 3D semantic classification. This dataset covers 700 meters along a street, annotated with pixel-level labels for facade details such as windows, doors, balconies, roof, etc. It is the largest and most detailed dataset available that includes a dense surface and semantic labels for urban classes.
Information and download page for the 3D Challenge
Apparel classification with Style
Dataset accompanying the paper "Apparel classification with Style".
Information and download page
Biwi Kinect Head Pose Database
Over 15K images of 20 people recorded with a Kinect while turning their heads around freely. For each frame, depth and RGB images are provided, together with ground truth in the form of the 3D location of the head and its rotation angles.
Information and download page
BIWI 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2
The corpus contains high-quality dynamic (25 fps) 3D scans of faces recorded while pronouncing a set of English sentences. Affective states were induced by showing emotional video clips to the speakers. The data has been annotated by tracking all frames using a generic face template, segmenting the speech signal into single phonemes, and evaluating the emotions conveyed by the recorded sequences by means of an online survey.
Information and request page
BIWI Walking Pedestrians dataset
Walking pedestrians in busy scenarios, recorded from a bird's-eye view and manually annotated. The data was used for training in our ICCV'09 paper "You'll Never Walk Alone: Modeling Social Behavior for Multi-target Tracking".
ETHZ Shape Classes
A dataset for testing object class detection algorithms. It contains 255 test images and features five diverse shape-based classes (apple logos, bottles, giraffes, mugs, and swans).
- V. Ferrari, F. Jurie, and C. Schmid "From Images to Shape Models for Object Detection", International Journal of Computer Vision (IJCV), 2009.
- V. Ferrari, T. Tuytelaars, and L. Van Gool "Object Detection by Contour Segment Networks", European Conference on Computer Vision (ECCV), Graz, May 2006.
- T. Quack, V. Ferrari, B. Leibe, L. Van Gool "Efficient Mining of Frequent and Distinctive Feature Configurations", International Conference on Computer Vision (ICCV), 2007.
ETHZ Extended Shape Classes
The Extended ETHZ shape classes is a larger database of shape categories, created by merging ETHZ shape classes with Konrad Schindler's 4x50 closed shapes. This is (almost) a superset of each of the two older databases. Please refer to the README for details on the differences and how to use the new larger dataset.
- K. Schindler and D. Suter, "Object Detection by Global Contour Shape", Pattern Recognition, 41(12), 2008.
- Download: Extended ETHZ shape classes
ETH Face Pose Range Image Data Set
Range images of faces with ground truth used in our CVPR'08 paper "Real-Time Face Pose Estimation from Single Range Images".
Dataset used in our CVPR '07 paper "Dynamic 3D Scene Analysis from a Moving Vehicle"
The sequence contains 1175 stereo image pairs acquired with a stereo setup mounted on top of a moving vehicle. The stereo setup has a fixed baseline, and the cameras are calibrated internally and with respect to each other.
"Central" Pedestrian Crossing Sequences
Three pedestrian crossing sequences used in our ICCV'07 paper. Each sequence comes with ground-truth bounding box annotations for the objects to be tracked, as well as a camera calibration. The annotation files for the pedestrian crossing sequences contain bounding box annotations for every fourth frame.
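Because the annotations cover only every fourth frame, a tracker evaluated on every frame needs boxes for the frames in between. The following is a minimal sketch of one common workaround, linear interpolation between annotated frames. It is not part of the dataset tools: the function name, the `(x_min, y_min, x_max, y_max)` box layout and the frame-indexed dictionary are assumptions made for illustration, and the actual annotation file format must be parsed separately. The sketch handles a single tracked object.

```python
from typing import Dict, Tuple

# Assumed box layout (not taken from the dataset): (x_min, y_min, x_max, y_max)
Box = Tuple[float, ...]


def interpolate_boxes(keyframes: Dict[int, Box]) -> Dict[int, Box]:
    """Linearly interpolate one object's boxes between annotated frames."""
    frames = sorted(keyframes)
    dense: Dict[int, Box] = {}
    for start, end in zip(frames, frames[1:]):
        box_a, box_b = keyframes[start], keyframes[end]
        span = end - start
        for frame in range(start, end):
            t = (frame - start) / span  # 0.0 at `start`, approaches 1.0 near `end`
            dense[frame] = tuple(a + t * (b - a) for a, b in zip(box_a, box_b))
    if frames:
        # The last annotated frame is copied unchanged.
        dense[frames[-1]] = keyframes[frames[-1]]
    return dense


# Hypothetical annotations for one pedestrian at frames 0 and 4.
annotated = {0: (100.0, 50.0, 140.0, 150.0), 4: (120.0, 50.0, 160.0, 150.0)}
dense = interpolate_boxes(annotated)
print(dense[2])  # (110.0, 50.0, 150.0, 150.0)
```

Linear interpolation is only a rough approximation for objects that accelerate or change apparent size quickly, so interpolated boxes should not be treated as ground truth.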
Dataset used in our ICCV '07 paper "Depth and Appearance for Mobile Scene Analysis"
The set was recorded in Zurich, using a pair of cameras mounted on a mobile platform. It contains 12'298 annotated pedestrians in roughly 2'000 frames.
Zurich Buildings Database
The goal of the ZuBuD Image Database is to share image datasets with researchers around the world. To facilitate this, we have created this site, which contains over 1005 images of buildings in the city of Zurich. Detailed information about the database can be found in our Technical Report TR-260.
We will be adding new data to this site as time permits. Furthermore, we now accept datasets from other researchers to add to our archive. If you would like to contribute, please contact Hao Shao. The full-sized images are stored in PNG (Portable Network Graphics) format.
Created: April 2003
ZuBuD Query Images
Ground truth mapping (txt)
Created: April 2003
Another 53 objects database
Created: April 2003
4D MRI data
The data is derived from dynamic sagittal 2D images acquired during free breathing. More specifically, it contains:
- the motion fields for different breathing cycles of different subjects. Point trajectories for isotropic grids of 15 mm and 5 mm are provided in both plain text and binary format.
- a reconstructed MRI volume per subject at exhalation in Analyze (*.hdr/img), DICOM (*.dcm) and MATLAB (*.mat) format.
4D MRI lung data
The data contains full 4D-MRI data of two test subjects. For each subject there are:
- 200 3D timesteps,
- segmentation masks of the left and right lungs,
- 3D B-Spline registration results in ITK format for all 200 timesteps.
If you use any of our data in your work, please cite:
Boye, D. et al. - Population based modeling of respiratory lung motion and prediction from partial information - Proc. SPIE 8669, Medical Imaging 2013: Image Processing, 86690U (March 13, 2013); doi:10.1117/12.2007076.
- Maintained by Christine Tanner
ETHZ Personal Event Collection
A dataset for recognition of events in personal photo collections. It contains more than 61'000 images in 807 collections, annotated with 14 diverse social event classes.
2D ultrasound sequences of the liver
Nine 2D ultrasound sequences of the liver of healthy volunteers were acquired during free breathing over a period of 5-10 min. Please refer to the following publications:
- L. Petrusca, P. Cattin, V. De Luca, F. Preiswerk, Z. Celicanin, V. Auboiroux, M. Viallon, P. Arnold, F. Santini, S. Terraz, K. Scheffler, C. D. Becker, R. Salomir, "Hybrid Ultrasound/Magnetic Resonance Simultaneous Acquisition and Image Fusion for Motion Monitoring in the Upper Abdomen", Investigative Radiology, Vol. 48, No. 5, pp. 333-340, 2013.
- V. De Luca, M. Tschannen, G. Székely, C. Tanner, "A Learning-based Approach for Fast and Robust Vessel Tracking in Long Ultrasound Sequences", Medical Image Computing and Computer-Assisted Intervention, Springer, LNCS volume 8149, pp. 518-525, 2013.
Download: Sequences and data details