Semester and Master Projects at BIWI

We constantly offer interesting and challenging semester and master projects for motivated students at our lab. Below you can find a list of topics that are currently on offer. Not all projects may be listed; if you are generally interested, do not hesitate to contact one of the supervisors, who can also give you an overview of other available projects. Proposals for your own project ideas are also more than welcome.

In this presentation you can find an overview of our research topics and available projects.

Common Topics

Where should the tourist go?

Landmarks are photographed from many viewpoints by countless international tourists. Flickr provides a rich set of data from all these pictures, yet some views are far less common than others. The aim of this work is to create a tool that visualizes which views are common and suggests new ones.

GOAL: Create an Android app to display a 2D/3D map of the landmark and possible next locations.

The student will be provided with the current 3D viewer; the tasks are
- create a framework in Android for showing images, point clouds and meshes
- calculate the next best view based on the current state of the art (see the sketch after this list for the basic idea)
- create an interface to show the suggested view to the user
- take a new picture and upload it to a server
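
For a first intuition of the next-best-view step, here is a minimal, purely illustrative Python sketch that scores candidate viewpoints by how many landmark points they would cover that existing photos do not; the point cloud, candidate positions and the crude distance-based visibility test are all placeholders, not the state-of-the-art criterion the project should implement.

    import numpy as np

    # Toy next-best-view scoring: suggest the candidate viewpoint that covers
    # the most landmark points not yet seen in existing photos.
    points = np.random.rand(2000, 3) * 10.0      # placeholder landmark point cloud
    seen = np.random.rand(len(points)) < 0.6     # points already covered by tourists
    candidates = np.random.rand(50, 3) * 10.0    # placeholder candidate camera positions

    def coverage(cam):
        # Crude visibility test: a point counts as visible if it is close enough.
        visible = np.linalg.norm(points - cam, axis=1) < 5.0
        return np.count_nonzero(visible & ~seen)

    best = max(candidates, key=coverage)
    print("suggested viewpoint:", best)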


Interested in cool 3D topics? Contact me! hayko[AT]vision.ee.ethz.ch

Supervisor(s):

Dr. Hayko Riemenschneider, ETF E111, Tel.: +41 44 63 20258

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Introduction. Due to recent advances in 3D range imaging, highly accurate and large 3D models of real-world environments can easily be obtained and are already available for many city areas. Given structural information about the world, many new opportunities for computer vision (CV) applications in scene understanding arise. Videos are a rich source for capturing and analyzing social activities, human and vehicular traffic, and events. This allows for CV applications such as multi-view object tracking, vehicle and pedestrian trajectory analysis, video cutting, and multi-video event and scene summarization. Registration of video data to a 3D world model using visual information is an essential requirement for many of these applications.

Goal. The goal of the project (Semester- or Master Project) is to develop a pipeline which enables registration (i.e. finding position and orientation) for all frames of a video with respect to a given 3D (Structure-from-Motion) point cloud.

The starting points are the ideas, code and datasets of this paper. The project scope can be adapted to either a semester project or a master thesis.
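
For illustration, a minimal per-frame registration sketch using OpenCV's PnP + RANSAC, assuming 2D-3D correspondences between frame keypoints and SfM points are already available (e.g. from descriptor matching); the correspondences and intrinsics below are random placeholders, and the actual project pipeline may look quite different.

    import cv2
    import numpy as np

    # Placeholder 2D-3D correspondences and camera intrinsics; with real
    # matches the recovered pose is the frame's position and orientation
    # in the point-cloud coordinate frame.
    pts_3d = np.random.rand(100, 3).astype(np.float32)          # matched SfM points
    pts_2d = (np.random.rand(100, 2) * 640).astype(np.float32)  # matched pixels
    K = np.array([[700., 0., 320.],
                  [0., 700., 240.],
                  [0.,   0.,   1.]])

    ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts_3d, pts_2d, K, None)
    R, _ = cv2.Rodrigues(rvec)      # orientation (world-to-camera rotation)
    center = -R.T @ tvec            # camera position in world coordinates
    print(ok, center.ravel())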

Requirements. Basic prior knowledge in 3D Multi-View Geometry / Stereo Vision.

If you can program in Matlab and C++ and would like to learn some computer vision, feel free to send an e-mail.

Type of Work. 40% theory, 60% programming

Supervisor(s):

Till Kroeger, ETF C113.1, Tel.:

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Introduction: Image resizing is one of the most common image operations; almost all display and editing software employs it. It is necessary, for example, when we adapt images to display devices of different dimensions, or when we want to explore some regions of an image in more detail (e.g. in visual surveillance). Downsampling images usually does not pose a challenge. The up-sampling of images, however, is still an open problem. Super-resolution is a technique used to sharpen the smoothed edges and enrich the missing textures of images that have been enlarged by a generic up-scaling process (such as bilinear or bicubic interpolation), thereby delivering an image of high-quality resolution. Image super-resolution is normally solved by learning from examples: pairs of high-resolution image patches and the corresponding low-resolution (down-sampled) image patches. The performance of the system largely relies on the 'quality' of the image patches collected. However, how to find the most informative and useful image patches is still an open question and has not drawn much attention.
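
As a small illustration of the example-based setup (not the code provided for the project), the sketch below builds LR/HR training patch pairs from a single image; the stand-in image, patch size and scale factor are arbitrary placeholder values.

    import numpy as np
    import cv2

    # Build pairs of low-resolution and high-resolution patches from one image.
    hr = (np.random.rand(128, 128) * 255).astype(np.float32)  # stand-in training image
    scale, patch = 2, 16                                       # placeholder values

    lr = cv2.resize(hr, None, fx=1.0 / scale, fy=1.0 / scale,
                    interpolation=cv2.INTER_CUBIC)
    lr_up = cv2.resize(lr, (hr.shape[1], hr.shape[0]),
                       interpolation=cv2.INTER_CUBIC)          # coarse upscaling back

    pairs = [(lr_up[y:y + patch, x:x + patch], hr[y:y + patch, x:x + patch])
             for y in range(0, hr.shape[0] - patch + 1, patch)
             for x in range(0, hr.shape[1] - patch + 1, patch)]
    print(len(pairs), "LR/HR patch pairs")
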
Goal: The goal of the project is to develop an approach that actively explores the most useful images and patches to be used as examples for image super-resolution. Students are provided with code for image super-resolution and active learning.

Requirements: Basic knowledge of vision and machine learning; ability to program.

Supervisor(s):

Dr. Dengxin Dai, ETF C113.1, Tel.: +41 44 63 25426

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Description: Microtubules (MTs) are filamentous polymers of tubulin and are essential for many cellular structures and functions: the cytoskeleton, intracellular transport, cell division, proliferation and motility, etc. They are characterized by very dynamic behavior, switching stochastically between phases of growth and shrinkage at the MT ends: a mechanism called dynamic instability. In vivo, this feature of MTs is highly regulated in space and time. Perturbations of the dynamic instability or its regulation lead to serious diseases such as cancer or neurodegenerative disorders. As macromolecular cellular entities, MTs are well studied. However, modeling MT dynamic instability on the atomic scale is as yet unattainable due to the large size of MTs and the long, physiologically relevant time scales. In addition to this drawback of modeling and simulating large biomolecular assemblies, our understanding of dynamic instability faces a conceptual challenge: deciphering how the different spatial and temporal scales determine dynamic instability.

The research project involves development and implementation of mechano-chemical models for MT dynamic instability. The structural and mechanical properties will be based on elastic networks. The stochastic biochemical modifications of the MTs will be explicitly coupled with the structural and mechanical description. The originality of our approach lies in a genuine coupling between these two aspects, involving a careful interaction between deterministic and stochastic simulation algorithms. At a later stage of the project, connections between the atomic nanoscale level and the emerging macroscale mechanical properties of MTs will be explored by coarse-graining techniques. The project is planned to last for 6-8 months and will involve strong collaboration with internationally renowned experimental research groups from ETH Zurich and the Paul Scherrer Institute (PSI).
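
To give a flavor of the stochastic side of such models, here is a deliberately minimal Python sketch of growth/shrinkage switching at a single MT tip; the mechanical (elastic-network) part is omitted entirely, and all velocities and rates are placeholder values, not measured parameters.

    import random

    # Two-state dynamic-instability toy model: the tip grows or shrinks
    # deterministically and switches state stochastically (catastrophe / rescue).
    v_grow, v_shrink = 0.05, -0.30   # tip velocity, micrometers per second (placeholders)
    k_cat, k_res = 0.005, 0.02       # catastrophe / rescue rates, per second (placeholders)
    dt, length, growing = 0.1, 5.0, True

    trajectory = []
    for _ in range(100000):          # 10,000 seconds of simulated time
        length = max(0.0, length + (v_grow if growing else v_shrink) * dt)
        if random.random() < (k_cat if growing else k_res) * dt:
            growing = not growing    # stochastic switch of the tip state
        trajectory.append(length)

    print(min(trajectory), max(trajectory))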

Requirements: solid skills in C++ programming; experience with a general-purpose prototyping language (Python preferred) is an advantage; experience in two or more of the following areas is an advantage: biophysical modeling, stochastic process simulation, computational mechanics; excellent communication skills and fluent English; self-motivation and team spirit, with genuine curiosity for science.

Goals: The goals of the project are development and testing of mechano-chemical MT models, as well as implementation of the models into an existing software package.

Application: Applicants should send a transcript and a CV by email to the address below, with the subject "Application for Internship in Modeling and Simulation of Microtubules". The CV should clearly state which curriculum the candidate follows, as well as the start and end dates of education and/or previous employment. We also need contact information for at least two potential references.

Supervisor(s):

Dr. Oliwia Szklarczyk, ETF D117, Tel.: +41 44 63 27690

Dr. Grégory Paul, ETF D114.2, Tel.: +41 44 63 26670

Professor:

Gábor Székely (szekely), ETF C117, Tel.: +41 44 63 25288

Introduction

Reconstructing the 3D geometry of a city is often achieved through multi-view photographic acquisition. This high-quality imagery is, however, rarely fully exploited to generate the color of the virtual city, i.e., its facades’ textures. One of the main obstacles is the size of the data, which can easily grow to gigapixel textures for 3D cities. It is therefore important that we strive for maximal storage compactness while keeping quality high. Traditional image compression schemes, like the ubiquitous JPEG-2000 format, are not practical for this purpose: they lack the random-access property that makes graphics cards efficient. On the other hand, there exist very efficient procedural texture synthesis methods that are both compact and fast to evaluate, making them very popular in game engines. Some procedural textures can also be learned so as to reproduce input exemplars.

Goal of this project

We aim at studying facade texture summarisation by taking advantage of modern texture synthesis methods. Gaussian textures are a limited category of textures characterized solely by their power spectrum (thus compact) and are very quick to render as a procedural texture. We want to investigate to what extent a facade texture can be summarized for realistic synthesis with a mix of procedural Gaussian texture synthesis and classical texture mapping methods. To answer this question, we will first learn to predict which parts of a facade are faithfully representable by a Gaussian texture. Later on, we will investigate how to summarize a facade given the gathered knowledge.
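
As a minimal sketch of what a Gaussian texture is, the random-phase synthesis below keeps an exemplar's power spectrum and randomizes its phase; it assumes a single-channel float image and is a simplified illustration, not the method to be developed in the project.

    import numpy as np

    def gaussian_texture(exemplar):
        # Keep the exemplar's power spectrum, randomize the phase. Taking the
        # real part is a simplification (a full implementation would enforce a
        # Hermitian-symmetric random phase).
        mean = exemplar.mean()
        spectrum = np.fft.fft2(exemplar - mean)
        phase = np.exp(2j * np.pi * np.random.rand(*spectrum.shape))
        return mean + np.real(np.fft.ifft2(np.abs(spectrum) * phase))

    texture = gaussian_texture(np.random.rand(128, 128))  # stand-in exemplar
    print(texture.shape)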

Interested? Want more information? Looking for another nice topic related to (real-time) texture synthesis? Please contact vanhoey [AT] vision.ee.ethz.ch

Supervisor(s):

Dr. Kenneth Vanhoey, ETF C112, Tel.: +41 44 63 38394

Dr. Hayko Riemenschneider, ETF E111, Tel.: +41 44 63 20258

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Ultrasound elastography is a method that allows the noninvasive measurement of mechanical properties (relative stiffness, Young’s modulus, bulk modulus, viscosity, nonlinearity) of in vivo soft tissue, with emerging clinical applications from breast tissue characterization to cardiology. Harmonic elastography applies steady-state vibrations at multiple frequencies to the tissue in order to measure the aforementioned properties. To evaluate our methods, tissue-mimicking phantoms (gelatin, silicone) with different properties are built and tested. The aim of this project is to create a simple rheometry setup for measuring the mechanical properties of different tissue-mimicking phantoms. The setup can be as simple as placing the phantom on a scale, applying a weight on top of it and measuring the axial deformation to determine its Young’s modulus, or slightly more complex. Phantoms of different materials (gelatin, agar, and silicone), sizes and stiffnesses, as well as with inclusions of different sizes, stiffnesses and positions within the phantom, will be created and tested in the rheometry setup. Furthermore, the effect of different scatterers (flour, cellulose) will also be tested (e.g. signal damping, signal traceability). In the case of a master thesis, experimental results will be compared with finite element simulation results from ANSYS.

Type of work: Semester Project or Master Thesis, 40% theory, 25% software implementation, 35% experiments.

Requirements: Mechanical design principles, basic Matlab / C++ knowledge; ANSYS knowledge is a plus, but not necessary.
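
For reference, the simple scale-based measurement mentioned above amounts to a one-line stress/strain computation; all numbers below are placeholder values for a hypothetical cylindrical phantom.

    import math

    # A known weight compresses a cylindrical phantom; the axial deformation is read off.
    mass = 0.5        # kg, applied weight (placeholder)
    g = 9.81          # m/s^2
    radius = 0.025    # m, phantom radius (placeholder)
    height = 0.05     # m, phantom height (placeholder)
    delta_h = 0.002   # m, measured axial compression (placeholder)

    stress = mass * g / (math.pi * radius ** 2)   # Pa
    strain = delta_h / height                     # dimensionless
    E = stress / strain                           # Young's modulus, Pa
    print(round(E / 1e3, 1), "kPa")               # about 62 kPa for these numbers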

Supervisor(s):

Corin Felix Otesteanu, ETF D112, Tel.: +41 44 63 38815

Professor:

Orçun Göksel (ogoksel), ETF C107, Tel.: +41 44 63 22529

Introduction:

Facial expression recognition is useful for analyzing human emotion, and is an interesting blend of psychology and technology. In general, a facial expression recognition algorithm detects faces within a photo or video and analyzes the relationships between points on the face. In the end, a sequence of facial motion is typically classified into one of seven main categories: Happiness, Sadness, Anger, Fear, Surprise, Disgust and Neutral.

Goal of this project:

This project aims to analyze the dynamic temporal expressions in a facial data corpus collected in real-world environments. To capture how face shape and appearance change over a video sequence, we propose to develop a dynamic version of the Active Appearance Model. This model is then fed into the recently popular deep neural networks to achieve good classification of different facial expressions.

Student's task:

To achieve this goal, the student is expected to study the Active Appearance Model in order to extract the dynamic information in a face sequence. Later, these features are used (along with the deep features) within a new version of deep networks that operates on the specified data model in the non-Euclidean domain.
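
As an illustration of the temporal-classification idea (and not the non-Euclidean deep architecture the project targets), here is a minimal recurrent baseline over per-frame AAM feature vectors; the feature dimensionality and network sizes are placeholder values.

    import torch
    import torch.nn as nn

    FEAT_DIM, HIDDEN, NUM_CLASSES = 128, 64, 7   # hypothetical sizes; 7 expression categories

    class ExpressionClassifier(nn.Module):
        # LSTM over per-frame AAM feature vectors, followed by a linear classifier.
        def __init__(self):
            super().__init__()
            self.lstm = nn.LSTM(FEAT_DIM, HIDDEN, batch_first=True)
            self.fc = nn.Linear(HIDDEN, NUM_CLASSES)

        def forward(self, x):            # x: (batch, frames, FEAT_DIM)
            _, (h, _) = self.lstm(x)     # h: (1, batch, HIDDEN)
            return self.fc(h[-1])        # logits over the 7 classes

    logits = ExpressionClassifier()(torch.randn(4, 30, FEAT_DIM))
    print(logits.shape)                  # torch.Size([4, 7])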

Required skills:

Programming:

C/C++, Matlab and/or Python

Computer vision & math:

image processing, statistical modeling and deep learning.

Supervisor(s):

Dr. Zhiwu Huang, ETF C112, Tel.: +41 44 63 39287

Dr. Danda Pani Paudel, ETF C112, Tel.: +41 44 63 22774

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Intro:

Deep architectures have proven successful for various image recognition tasks, such as image classification, object recognition, segmentation, etc.

However, for commercial applications, efficiency plays a key role.

Goal:

The goal of this project is to explore new architectures that can be more efficient, either in terms of runtime or memory complexity. The objective is to automatically learn an efficient model, or to devise methods to speed up existing architectures.
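
As one example of the kind of efficiency gains in question, the sketch below compares the parameter count of a standard 3x3 convolution with a depthwise-separable (MobileNet-style) replacement; the channel sizes are placeholders and this is only one of many possible directions.

    import torch.nn as nn

    def n_params(module):
        return sum(p.numel() for p in module.parameters())

    cin, cout = 128, 128                                              # placeholder channel sizes
    standard = nn.Conv2d(cin, cout, kernel_size=3, padding=1)
    separable = nn.Sequential(
        nn.Conv2d(cin, cin, kernel_size=3, padding=1, groups=cin),   # depthwise
        nn.Conv2d(cin, cout, kernel_size=1),                          # pointwise
    )
    print(n_params(standard), n_params(separable))                    # 147584 vs 17792 parameters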

Requirements:

Programming knowledge of Python and/or C/C++; familiarity with computer vision, machine learning, and deep learning; self-motivation and team-working skills.

We also have more available projects which are not listed yet. For more information, please contact Eirikur Agustsson ( aeirikur@vision.ee.ethz.ch ) or Radu Timofte ( timofter@vision.ee.ethz.ch )

Supervisor(s):

Eirikur Agustsson, ETF D115, Tel.: +41 44 63 29420

Dr. Radu Timofte, ETF C107, Tel.: +41 44 63 25279

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Intro:

As the saying goes, a picture is worth a thousand words.

Given a picture of a person, there are hundreds of attributes that can be predicted and analyzed: age, gender, body shape, clothes, skin color, hairstyle, attractiveness, expressions, gestures, impressions, personality traits, etc.

Demo:

www.howhot.io

Goal:

The goal of this project is to tackle new problems in this domain, using deep neural networks on recently collected datasets.
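
For illustration only, a toy multi-attribute setup with a shared backbone and one prediction head per attribute; the attribute list, backbone and dimensions are hypothetical placeholders, not the networks or datasets used in the project.

    import torch
    import torch.nn as nn

    ATTRIBUTES = {"age": 1, "gender": 2, "expression": 7}   # name -> number of outputs (placeholders)

    class AttributeNet(nn.Module):
        # Shared backbone (a stub here) with one small head per attribute.
        def __init__(self, feat_dim=64):
            super().__init__()
            self.backbone = nn.Sequential(nn.Flatten(),
                                          nn.Linear(3 * 64 * 64, feat_dim),
                                          nn.ReLU())
            self.heads = nn.ModuleDict({name: nn.Linear(feat_dim, n)
                                        for name, n in ATTRIBUTES.items()})

        def forward(self, x):
            feats = self.backbone(x)
            return {name: head(feats) for name, head in self.heads.items()}

    out = AttributeNet()(torch.randn(8, 3, 64, 64))
    print({k: v.shape for k, v in out.items()})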

Requirements:

Programming knowledge of Python and/or C/C++; familiarity with computer vision, machine learning, and deep learning; self-motivation and team-working skills

We also have more available projects which are not listed yet. For more information, please contact us!

Supervisor(s):

Eirikur Agustsson, ETF D115, Tel.: +41 44 63 29420

Dr. Radu Timofte, ETF C107, Tel.: +41 44 63 25279

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Intro:

Recent years have shown large improvements in optical flow estimation, i.e. estimating the motion of every pixel from one frame to the next in a video sequence, thanks to new schemes based on dense patch matching, total variation formulations and end-to-end deep learning. The robustness and accuracy of the estimation have improved, as has the runtime efficiency. For applications, there is a large interest in methods with low time and memory complexity. Unfortunately, such methods are still far from the top state-of-the-art accuracies.

Goal:

The goal of this project is to assess the current status of efficient optical flow methods and to analyze ways to further improve their performance, so as to bridge the gap between low complexity and high accuracy.

An optional branch of this work is the development of techniques that allow an optical flow algorithm to assess its own success, i.e. failure detection and/or probabilistic output.
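
For orientation, here is a classical (non-learned) dense-flow baseline with OpenCV's Farneback method on a synthetic frame pair; in practice the inputs would be consecutive video frames, and average endpoint error against ground truth would serve as the accuracy metric.

    import cv2
    import numpy as np

    # Synthetic frame pair: a smooth pattern shifted horizontally by 3 pixels.
    ys, xs = np.mgrid[0:240, 0:320].astype(np.float32)
    prev = (127 + 60 * np.sin(xs / 8.0) * np.cos(ys / 11.0)).astype(np.uint8)
    curr = np.roll(prev, 3, axis=1)

    flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    # One (u, v) vector per pixel; the mean horizontal flow should be close to
    # the simulated 3-pixel shift.
    print(flow.shape, flow[..., 0].mean())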

Requirements:

Programming knowledge of Python and/or C/C++; familiarity with computer vision, machine learning, and deep learning; self-motivation and team-working skills

We also have more available projects which are not listed yet. For more information, please contact us!

Supervisor(s):

Till Kroeger, ETF C113.1, Tel.:

Dr. Radu Timofte, ETF C107, Tel.: +41 44 63 25279

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Intro:

Recent years have shown large improvements in the upscaling of low-resolution images to higher and more detailed resolutions, a task known as single image super-resolution (SISR). Videos are generally tackled either by simply applying SISR methods frame by frame, by often complex methods involving pixel-level motion and blur estimation combined with optimization, or by assuming the presence of high-resolution keyframes in the video that are used to adapt the SISR model to the local video content.

With the advances in displays, camera sensors and storage capacity, there is a continuous need for computationally efficient and accurate upscaling to higher resolutions, both of online streams and of videos recorded when low-resolution displays were the norm.

Goal:

The goal of this project is to develop an efficient video super-resolution method with low complexity and high accuracy starting from the recent advances in SISR, optical flow, and deep learning.
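
As a point of departure, the naive frame-wise baseline below simply upscales every frame independently with bicubic interpolation; a learned SISR model plus flow-based temporal fusion is what the project would put in its place. The file name and scale factor are placeholders.

    import cv2

    scale = 4                                    # placeholder upscaling factor
    cap = cv2.VideoCapture("input_video.mp4")    # placeholder file name
    frames_hr = []
    while True:
        ok, frame = cap.read()
        if not ok:                               # no (more) frames to read
            break
        frames_hr.append(cv2.resize(frame, None, fx=scale, fy=scale,
                                    interpolation=cv2.INTER_CUBIC))
    cap.release()
    print(len(frames_hr), "frames upscaled")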

Requirements:

Programming knowledge of Python and/or C/C++; familiarity with computer vision, machine learning, and deep learning; self-motivation and team-working skills

We also have more available projects which are not listed yet. For more information, please contact us!

Supervisor(s):

Till Kroeger, ETF C113.1, Tel.:

Dr. Radu Timofte, ETF C107, Tel.: +41 44 63 25279

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Intro:

Switzerland is home to thousands of plant species... and to millions of people and a broad network of highways and roads.

While every person has a residence address, plants do not. Accurate plant species identification is usually the expertise of a few specialized botanists.

Some plants are of particular interest: endangered species, species of economic value, invasive species, newly introduced species to the ecosystem.

Goal:

The goal of this project is to automate the mapping of plant species from images and videos recorded along highways (such as Google Street View).

We will collaborate with Dr. Michael Nobis, macroecologist at WSL - Swiss Federal Institute for Forest, Snow and Landscape Research.

Requirements:

Programming knowledge of Python and/or C/C++; familiarity with computer vision, machine learning, and deep learning; self-motivation and team-working skills

We also have more available projects which are not listed yet. For more information, please contact us!

Supervisor(s):

Dr. Radu Timofte, ETF C107, Tel.: +41 44 63 25279

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Introduction:

Autocalibration, i.e. retrieving a camera's intrinsic parameters from image correspondences, of a moving camera is a non-linear and challenging problem. Most approaches rely on the ubiquitous nature of the so-called Absolute Conic: a special conic lying on the plane at infinity. This project aims at robustly estimating the Absolute Conic within a global optimization framework.

Goal of this project:

Camera intrinsics extraction from the Fundamental matrix via Absolute Conic estimation. The first part of the project focuses on deriving various mathematical conditions suitable for convex optimization. Later, these conditions are used to develop a robust, optimal, and practical autocalibration method.
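
For a numerical illustration of one well-known autocalibration cue (not the convex formulation to be derived in this project): for the correct intrinsics K, the matrix K^T F K is an essential matrix and therefore has two equal non-zero singular values. The synthetic two-view geometry below uses placeholder values.

    import numpy as np

    def skew(v):
        return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])

    K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])  # placeholder intrinsics
    R, _ = np.linalg.qr(np.random.randn(3, 3))
    R = R if np.linalg.det(R) > 0 else -R          # make it a proper rotation
    t = np.array([1.0, 0.2, 0.1])                  # placeholder translation

    E = skew(t) @ R                                 # essential matrix
    F = np.linalg.inv(K).T @ E @ np.linalg.inv(K)   # fundamental matrix

    def sv_ratio(K_guess):
        s = np.linalg.svd(K_guess.T @ F @ K_guess, compute_uv=False)
        return s[1] / s[0]

    print(sv_ratio(K))                              # ~1.0 for the true intrinsics
    print(sv_ratio(np.diag([400., 400., 1.])))      # generally below 1.0 for wrong intrinsics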

Required skills:

  • Interested in dynamic programming and convex optimization
  • Strong mathematical background
  • Good at Matlab/Python/C++

Supervisor(s):

Dr. Danda Pani Paudel, ETF C112, Tel.: +41 44 63 22774

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Introduction:
Collecting pixel-level annotations is a laborious and expensive process; exploiting synthetic data to learn deep models has therefore attracted increasing attention in recent years. However, a significant performance drop is usually observed when the learned model is applied to real-world scenarios. This is mainly due to the intrinsic differences (style, distribution, etc.) between synthetic and real data, which is also known as the domain adaptation problem.

Project Description:
Recently we proposed ROAD-Net, an effective approach that dramatically improves urban scene semantic segmentation results [Paper Link]. In this project, we aim to exploit synthetic data for other pixel-level tasks: we will improve ROAD-Net for tasks such as depth estimation, surface normal estimation, etc. The emphasis of this project is on impactful research and publications in high-profile conferences and journals.

Requirements:
Programming experience (Python and/or Matlab).
Self-motivation and collaboration skills.
Familiarity with computer vision and deep learning (TensorFlow, Caffe or PyTorch) is a plus.

Supervisor(s):

Dr. Wen Li, ETF C112, Tel.: +41 44 63 25281

Yuhua Chen, ETF D115, Tel.: +41 44 63 39064

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578

Background:
Interactive image segmentation aims to segment objects from an image based on user inputs; it has a wide range of applications such as image editing, human-computer interaction and so on. In the current era of the Internet, an immeasurable number of images are recorded and shared every second. Therefore, algorithms for fast and accurate interactive image segmentation are crucially important for real-world applications and a pleasant user experience.

Objective:
The main objective of this project is to build an interactive image segmentation system based on pixel-wise deep metric learning, which responds instantly to each user input and provides segmentation results with fine details. The emphasis of the project is on impactful research and publications in high-profile conferences and journals.
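
As a toy illustration of the pixel-wise metric-learning idea, the sketch below labels pixels by comparing their embeddings to prototypes built from the user's foreground/background clicks; the embedding network (learned offline in the real system) is replaced here by random features, and all dimensions and click positions are placeholders.

    import torch
    import torch.nn.functional as F

    emb = F.normalize(torch.randn(1, 16, 120, 160), dim=1)  # stand-in pixel embeddings (B, C, H, W)

    fg_clicks = [(40, 60), (50, 70)]    # (row, col) user foreground clicks (placeholders)
    bg_clicks = [(10, 10)]              # background clicks (placeholders)

    def prototype(clicks):
        # Average (then re-normalize) the embeddings at the clicked pixels.
        vecs = torch.stack([emb[0, :, r, c] for r, c in clicks]).mean(0)
        return F.normalize(vecs, dim=0)

    fg_proto, bg_proto = prototype(fg_clicks), prototype(bg_clicks)
    sim_fg = (emb[0] * fg_proto[:, None, None]).sum(0)   # cosine similarity to foreground
    sim_bg = (emb[0] * bg_proto[:, None, None]).sum(0)   # cosine similarity to background
    mask = (sim_fg > sim_bg).float()                     # instant per-pixel decision
    print(mask.shape)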

Requirements:
Programming experience (Python and/or Matlab).
Self-motivation and collaboration skills.
Familiarity with computer vision and deep learning (TensorFlow, Caffe or PyTorch) is a plus.

Supervisor(s):

Yuhua Chen, ETF D115, Tel.: +41 44 63 39064

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578


Introduction

Browsing webcams online can be useful and sometimes fun too. Several websites therefore offer browsing of public webcams or links to them. Determining their locations unfortunately requires manual labeling: someone needs to find the webcam provider (e.g., a city’s tourist office) and manually report that location. This hinders the rise of automatic large-scale webcam crawlers.

Goal of this project

We have recorded over a year of hourly, timestamped images from thousands of webcams whose locations are unknown. We propose to determine their locations from the image content itself, by detecting the day/night cycles in the images and combining them with a model of luminance as a function of time and position on the earth’s surface.
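
As a starting point for the luminance model, a coarse solar-elevation approximation is already enough to predict day/night for a candidate location and time; the constants below are standard first-order approximations and the example coordinates are just an illustration.

    import math

    def solar_elevation(lat_deg, lon_deg, day_of_year, utc_hour):
        # Approximate solar elevation angle in degrees; ignores the equation of
        # time and atmospheric refraction, which is enough for day/night prediction.
        decl = -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))
        solar_time = utc_hour + lon_deg / 15.0
        hour_angle = 15.0 * (solar_time - 12.0)
        lat, d, h = map(math.radians, (lat_deg, decl, hour_angle))
        sin_elev = math.sin(lat) * math.sin(d) + math.cos(lat) * math.cos(d) * math.cos(h)
        return math.degrees(math.asin(sin_elev))

    # The sun is up in Zurich at noon UTC in midsummer, down at midnight:
    print(solar_elevation(47.4, 8.5, 172, 12.0) > 0)   # True
    print(solar_elevation(47.4, 8.5, 172, 0.0) > 0)    # False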

Student's task

By programming this model (luminance on the earth’s surface and a daytime detector for the images), we will investigate how precise such a locator can become by gradually improving it, starting from simple concepts and moving towards more involved ones.

Required skills

Programming: C/C++, Matlab and/or Python
Computer vision & math: basic 3D geometry and image processing.

Interested? Want more information? Please contact vanhoey [AT] vision.ee.ethz.ch

Supervisor(s):

Dr. Danda Pani Paudel, ETF C112, Tel.: +41 44 63 22774

Professor:

Luc Van Gool (vangool), ETF C117, Tel.: +41 44 63 26578