Dataset Analysis

Analyzing the Statistical Properties of the Segmented Annotations

Area Distribution of Annotations and Proposals

We analyze the distribution of the area of the annotations. To do so, we compute the cumulative distribution function: the percentage of objects whose area is smaller than a given percentage of the image area.
The left plot shows that the majority of objects in all three databases are small (their curves lie above that of a uniform distribution), and that COCO has the most pronounced bias. As a concrete example, the dashed gray line highlights that while 50% of Pascal objects cover less than 5% of the image area, 80% of the annotated objects in COCO fall below this threshold. Reading the other axis, the 80th percentile is at 20% of the image area for Pascal and at 5% for COCO.
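
The two readings described above (the share of objects below an area threshold, and the area at a given percentile) can be sketched as follows. This is a minimal illustration on made-up toy values, not the code used to produce the plots; the function names and the input format (one pixel area per object plus the area of its image) are assumptions.

```python
import numpy as np

def relative_areas(object_areas, image_areas):
    """Fraction of its image that each object covers (values in [0, 1])."""
    return np.asarray(object_areas, dtype=float) / np.asarray(image_areas, dtype=float)

def fraction_below(rel_areas, threshold):
    """Empirical CDF at `threshold`: share of objects strictly smaller than it."""
    rel_areas = np.asarray(rel_areas, dtype=float)
    return float(np.mean(rel_areas < threshold))

def area_at_percentile(rel_areas, q):
    """Relative area below which roughly q percent of the objects fall."""
    return float(np.percentile(rel_areas, q))

# Hypothetical toy data: five objects, each in a 100,000-pixel image.
object_areas = [500, 2_000, 4_000, 30_000, 90_000]
image_areas = [100_000] * len(object_areas)

rel = relative_areas(object_areas, image_areas)
share_small = fraction_below(rel, 0.05)   # share of objects under 5% of the image
area_p80 = area_at_percentile(rel, 80)    # relative area at the 80th percentile
print(f"{share_small:.0%} of objects cover less than 5% of the image")
print(f"80th percentile of relative area: {area_p80:.2f}")
```

Evaluating `fraction_below` over a grid of thresholds gives the full curve; a curve above the diagonal indicates a bias towards small objects.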

In the right plot we observe significant differences between techniques, illustrated for instance by the percentage of proposals whose area is below 5% of the image area: 73% for GOP and 34% for GLS. In all cases, the percentage of small objects is even higher among the COCO annotations, although GOP comes close.

