Visual tracking is one of computer vision's longstanding challenges, and many methods have been proposed as a result. While most state-of-the-art methods trade off performance against speed, we propose PICASO, an efficient yet strongly performing tracking scheme. The target object is modeled as a set of pixel-level templates with weak configuration constraints. The pixels of a search window are matched against those of the surrounding context and of the object model. To increase robustness, we also match from the object to the search window, and the pairs that match in both directions are the correspondences used for localization. This localization process is robust, including against occlusions, which are explicitly modeled. Another source of robustness is that the model, as in several other modern trackers, is constantly updated over time with newly incoming information about the target appearance. Each pixel is described by its local neighborhood. The match of a pixel is taken to be the one with the largest contribution in its sparse decomposition over a set of pixels. For this soft match selection, we analyze both l1- and l2-regularized least squares formulations and the recently proposed l1-constrained "Iterative Nearest Neighbors" approach. We evaluate our tracker on standard videos for rigid and non-rigid object tracking. We obtain excellent performance at 42 fps with Matlab on a CPU.
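
The bidirectional matching step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the reported results use Matlab). It covers only the l2-regularized least squares variant and leaves out the l1 formulation, Iterative Nearest Neighbors, occlusion modeling, configuration constraints, and model updates. The function names `ridge_match` and `mutual_matches`, the regularization weight `lam`, and the choice to read "largest contribution" as the largest coefficient are assumptions made for illustration. Each pixel is assumed to be given as a row vector holding its flattened local neighborhood.

```python
import numpy as np


def ridge_match(queries, atoms, lam=0.1):
    """Return, for each query row, the index of the atom with the largest
    coefficient in its l2-regularized least squares decomposition.

    queries: (n_queries, d) array of pixel descriptors to be matched.
    atoms:   (n_atoms, d) array of pixel descriptors to decompose over.
    lam:     regularization weight (hypothetical default).
    """
    # Closed-form ridge solution for all queries at once:
    # w = (A A^T + lam I)^{-1} A q, with the atoms as the rows of A.
    gram = atoms @ atoms.T + lam * np.eye(atoms.shape[0])
    coeffs = np.linalg.solve(gram, atoms @ queries.T)  # (n_atoms, n_queries)
    return np.argmax(coeffs, axis=0)


def mutual_matches(search, obj, context, lam=0.1):
    """Return (search_index, object_index) pairs that match in both directions."""
    # Forward: each search-window pixel competes over object AND context atoms.
    # Indices below len(obj) refer to object pixels, the rest to context pixels.
    forward = ridge_match(search, np.vstack([obj, context]), lam)
    # Backward: each object pixel is decomposed over the search-window pixels.
    backward = ridge_match(obj, search, lam)
    # Keep only pairs confirmed in both directions; a search pixel matched to
    # the context can never satisfy forward[s] == o.
    return [(int(s), o) for o, s in enumerate(backward) if forward[s] == o]
```

Calling `mutual_matches(search, obj, context)` with three descriptor arrays of equal width returns the mutually consistent correspondences, which a tracker of this kind could then pass to its localization step. Including the context pixels in the forward dictionary lets background pixels in the search window be explained by the context rather than being forced onto the object model.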