MiVOS Project Website

      

Modular Interactive Video Object Segmentation:
Interaction-to-Mask, Propagation and Difference-Aware Fusion

BL30K Dataset

BL30K:

- 29,989 synthetic videos using 51,300 animated 3D models
- Each video has 160 frames
- Each frame has a resolution of 768×512, with pixel-accurate annotation
- 3-5 objects per video
- Object intersections are minimized using a greedy avoidance algorithm

We break the dataset into six segments, each with approximately 5K videos. We found that about half of the data is probably sufficient to reach full performance (although we still used all of it), but using less than one-sixth (5K videos) is insufficient.

Download:

Each segment is about 115GB in size, about 700GB in total. Google Drive is much faster in my experience, but your mileage may vary.

[Google Drive] [OneDrive]
Alternatively, you can use the Python script (download_bl30k.py) provided in [this repo] for automatic download and extraction. The script uses Google Drive links.

UST Mirror (reliability not guaranteed, speed throttled; do not use it if the other options are available):
ckcpu1.cse.ust.hk:8080/MiVOS/BL30K_{a-f}.tar (replace {a-f} with the part that you need).

If you use this dataset, please cite our paper:

@inproceedings{MiVOS_2021,
  title={Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion},
  author={Cheng, Ho Kei and Tai, Yu-Wing and Tang, Chi-Keung},
  booktitle={CVPR},
  year={2021}
}
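
As a rough illustration of the UST mirror pattern and the size figures quoted for the download, the sketch below expands the {a-f} placeholder into one URL per segment and estimates how much disk space the archives need. The http:// scheme, the function names, and the free-space check are assumptions made for this example. It is not the download_bl30k.py script, which uses Google Drive links.

```python
import shutil

# Assumption: the mirror is served over plain HTTP; the page gives only host:port/path.
MIRROR_TEMPLATE = "http://ckcpu1.cse.ust.hk:8080/MiVOS/BL30K_{part}.tar"
SEGMENTS = tuple("abcdef")   # the six parts named on the page
SEGMENT_SIZE_GB = 115        # approximate per-segment size quoted on the page


def segment_urls(parts=SEGMENTS):
    """Expand the {a-f} placeholder into one mirror URL per requested part."""
    unknown = [p for p in parts if p not in SEGMENTS]
    if unknown:
        raise ValueError(f"unknown segment(s): {unknown}")
    return [MIRROR_TEMPLATE.format(part=p) for p in parts]


def archive_size_gb(parts=SEGMENTS):
    """Rough size of the tar archives alone, before extraction."""
    return SEGMENT_SIZE_GB * len(parts)


def has_room_for(parts=SEGMENTS, target_dir="."):
    """Compare free space at target_dir with the archive size estimate."""
    free_gb = shutil.disk_usage(target_dir).free / 1e9
    return free_gb >= archive_size_gb(parts)


if __name__ == "__main__":
    wanted = "abc"  # e.g. half of the dataset
    for url in segment_urls(wanted):
        print(url)
    print("approx. archive size:", archive_size_gb(wanted), "GB")
    print("enough free space here:", has_room_for(wanted))
```

The estimate covers the archives only. Extracting them needs additional space unless each tar file is deleted after it is unpacked.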