We propose a semi-automatic annotation system for large symphonic orchestra videos.
We leverage video redundancy, image clustering, and human annotation.
Our method successfully deals with several intra-class variability issues.
Human annotation effort is reduced while maintaining a high level of output quality.
We comprehensively analyze the impact of each module on overall performance.