This article presents a visual object tracking method and evaluates it with an event-based performance metric. The proposed monocular tracker detects and tracks multiple object classes in uncontrolled environments. The tracking framework uses Bayesian per-pixel classification to segment each image into foreground and background objects in real time, based on observations of object appearance and motion. Furthermore, a performance evaluation method based on the successful detection of semantically high-level events is presented and applied to several state-of-the-art trackers. These events are extracted automatically from the different trackers and their varying types of low-level tracking results. A new, general event metric is then used to compare our tracking method with the other tracking methods against the ground truth of multiple public datasets.
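To make the idea of Bayesian per-pixel classification concrete, the following is a minimal sketch and not the tracker described here. It assumes a single grey-level intensity per pixel and one 1-D Gaussian appearance model each for foreground and background; the names `gaussian_pdf`, `foreground_posterior`, and `segment`, the model parameters, and the 0.5 threshold are all hypothetical choices made for illustration. The motion cue mentioned above is omitted.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, used as a stand-in appearance likelihood."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def foreground_posterior(pixel, fg_model, bg_model, prior_fg=0.5):
    """Apply Bayes' rule to one pixel and return P(foreground | pixel).

    fg_model and bg_model are (mean, std) tuples (an assumed model form).
    """
    fg_joint = gaussian_pdf(pixel, *fg_model) * prior_fg
    bg_joint = gaussian_pdf(pixel, *bg_model) * (1.0 - prior_fg)
    evidence = fg_joint + bg_joint
    if evidence == 0.0:
        # Both likelihoods underflowed: fall back to the prior.
        return prior_fg
    return fg_joint / evidence


def segment(image, fg_model, bg_model, prior_fg=0.5, threshold=0.5):
    """Label every pixel True (foreground) or False (background)."""
    return [
        [foreground_posterior(p, fg_model, bg_model, prior_fg) > threshold
         for p in row]
        for row in image
    ]


# Toy 2x3 image: dark pixels resemble the background model,
# bright pixels resemble the foreground model.
image = [[20, 25, 200],
         [22, 210, 205]]
mask = segment(image, fg_model=(200.0, 20.0), bg_model=(25.0, 10.0))
```

In a real-time setting the per-pixel loop would normally be vectorised, and the prior could be updated from the previous frame's segmentation; both are design choices outside the scope of this sketch.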