When extracting 3D scene information and camera motion from an image sequence alone, it is often necessary to cope with independently moving objects. Recent research has unveiled some of the mathematical foundations of the problem, but a general and practical algorithm that can handle long, realistic sequences is still missing. In this paper, we identify the necessary parts of such an algorithm, highlight both unexplored theoretical issues and practical challenges, and propose solutions. Theoretical issues include the proper handling of situations in which the number of independent motions changes: objects can enter the scene, objects previously moving together can split and follow independent trajectories, or independently moving objects can merge into one common motion. We derive model scoring criteria to handle these changes in the number of segments. A further theoretical issue is the resolution of the relative scale ambiguity between such changes. Practical issues include robust 3D reconstruction of freely moving foreground objects, which often have few and short feature tracks. The proposed framework simultaneously tracks features, groups them into rigidly moving segments, and reconstructs all segments in 3D. Such an online approach, as opposed to batch processing techniques that first track features and then perform segmentation and reconstruction, is vital for handling small foreground objects.