Paper in ECCV Workshop 2012: “Weakly Supervised Learning of Object Segmentations from Web-Scale Videos”
Abstract: We propose to learn pixel-level segmentations of objects from weakly labeled (tagged) internet videos. Specifically, given a large collection of raw YouTube content, along with potentially noisy tags, our goal is to automatically generate spatiotemporal masks for each object, such as a “dog”, without employing any pre-trained object detectors. We formulate […]