Supervisors: Yuhua Chen
Domain adaptation for semantic segmentation has recently attracted increasing attention. Previous works mainly address the problem in the unsupervised setting: while images in the source domain are fully labeled, only unlabeled images are available in the target domain. However, due to the lack of a supervision signal in the target domain, unsupervised domain adaptation methods are usually insufficient to fully resolve the domain shift, and a clear performance gap remains between the adapted model and the fully supervised baseline. To alleviate this problem, we propose to aid domain adaptation with additional weak supervision in the target domain. In this thesis we study mixed supervised domain adaptation, where the training labels come in two parts: pixel-wise full annotations in the source domain and sparse weak annotations in the target domain. The objective is to approach the performance of the fully supervised baseline while keeping the annotation cost in the target domain low. The proposed method consists of two major components. First, we build an adversarial framework that aligns the image pixels and the structured output space across the two domains. Second, on top of this adversarial framework, we deploy weak supervision in the target domain through approximated ground truths, which are refined from point-level annotations via iterative training. We evaluate the proposed method by adapting from the GTA5 dataset to the Cityscapes dataset. The experimental results show that our framework achieves results competitive with the fully supervised baseline: the performance gap is reduced from 37.6% to 8.6% with, on average, only 6 annotated points per class in each image. These results clearly demonstrate the effectiveness of weak supervision in the target domain and make our approach highly practical for adapting semantic segmentation models to new environments.
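The iterative refinement of approximated ground truths from point-level annotations can be sketched as follows: in each round, the model's current predictions serve as pseudo-labels, uncertain pixels are masked out, and the sparse human-annotated points override the predictions wherever they are available. This is a minimal illustrative sketch only, not the thesis's exact procedure; the function name, the confidence threshold, and the ignore-index convention are assumptions for illustration.

```python
import numpy as np

def refine_pseudo_labels(pred_probs, point_labels, ignore_index=255, conf_thresh=0.9):
    """Build an approximated ground truth map for one target-domain image.

    pred_probs:   (C, H, W) softmax output of the current segmentation model.
    point_labels: (H, W) array; annotated points hold a class id, all other
                  pixels hold `ignore_index`.

    A pixel keeps the model's predicted class only where the prediction is
    confident; annotated points always override the prediction. Unannotated,
    low-confidence pixels stay `ignore_index` so a loss with an ignore label
    can skip them in the next training iteration.
    """
    conf = pred_probs.max(axis=0)        # per-pixel prediction confidence
    pseudo = pred_probs.argmax(axis=0)   # predicted class map
    pseudo[conf < conf_thresh] = ignore_index   # drop uncertain pixels
    annotated = point_labels != ignore_index
    pseudo[annotated] = point_labels[annotated]  # trust the human points
    return pseudo
```

In an iterative scheme, the model would be retrained on these refined maps and the refinement repeated, so the sparse points gradually propagate their influence through the model's improving predictions.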