Video Stabilization with SIFT

Katie Zutter, Olivia Zhao, Sam Erickson


We ran into several difficulties during the initial development of this project. The paper we originally selected did not have everything we needed, but it got us off to a good start. It was not until late in the development process that we realized we needed a more robust solution, which led us to explore the papers by Grundmann, Kwatra, and Essa and by Kulkarni.

One of our biggest struggles was overlaying images in full color rather than grayscale. The initial implementation let us overlay two images in grayscale and highlight their differences in color, but we were unable to blend the images properly into a single full-color image.
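
For reference, the kind of full-color blend described above can be sketched as a per-pixel weighted average of two aligned frames. This is a minimal illustration using NumPy only, not the code from the project: the function name blend_color_frames, the alpha weight, and the synthetic red and blue frames are all assumptions made for the example, and it presumes both frames are already aligned and share the same height, width, and channel order.

```python
import numpy as np

def blend_color_frames(frame_a, frame_b, alpha=0.5):
    """Alpha-blend two aligned H x W x 3 uint8 frames into one color frame.

    Hypothetical helper for illustration; alpha is the weight of frame_a.
    """
    if frame_a.shape != frame_b.shape:
        raise ValueError("frames must have the same shape")
    # Blend in float to avoid uint8 overflow, then round and clip back.
    mixed = (alpha * frame_a.astype(np.float32)
             + (1.0 - alpha) * frame_b.astype(np.float32))
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

# Two synthetic 2x2 frames (RGB order assumed): solid red and solid blue.
red = np.zeros((2, 2, 3), dtype=np.uint8)
red[..., 0] = 200
blue = np.zeros((2, 2, 3), dtype=np.uint8)
blue[..., 2] = 100

blended = blend_color_frames(red, blue, alpha=0.5)
```

Blending each channel separately keeps the color information from both frames, whereas a grayscale overlay collapses each frame to a single intensity channel before combining them.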

The grayscale issue led us to explore other options, and we eventually settled on the latter two papers referenced above. While the solutions described in those two papers gave us what we needed, installing the necessary frameworks proved very difficult. The SIFT implementation we are using runs on OpenCV, and each of us had to track down and install the missing pieces of functionality before OpenCV would run.

We underestimated the time it would take to get comfortable with current image stabilization methods. Much of our time was spent trying to understand how those methods work; we tried to gain a deep understanding of which transformations are applied at the frame-by-frame level. Although this helped us gain familiarity with the current state of the art and with the topic in general, it slowed our progress toward our ultimate goal of developing something novel and exciting. However, now that we have a firmer grounding in the problem, we should be better equipped going forward to build a 3D video stabilizer with object tracking.
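
To make the frame-by-frame view concrete, here is a minimal sketch of one common stabilization pattern: accumulate the per-frame motion estimates into a camera trajectory, smooth that trajectory with a moving average, and adjust each frame's transform by the difference. This is an assumption-laden illustration rather than the method of either referenced paper or of our implementation; the function name smooth_trajectory, the (dx, dy, dangle) motion model, and the radius parameter are hypothetical, and the per-frame motions are presumed to have been estimated already (for example, from matched features).

```python
import numpy as np

def smooth_trajectory(transforms, radius=2):
    """Return corrected per-frame transforms for a smoother camera path.

    transforms: N x 3 array of per-frame (dx, dy, dangle) estimates.
    radius: half-width of the moving-average window (hypothetical default).
    """
    transforms = np.asarray(transforms, dtype=np.float64)
    # Cumulative sum turns frame-to-frame motion into a camera trajectory.
    trajectory = np.cumsum(transforms, axis=0)
    # Repeat the edge values so the average is defined at both ends.
    padded = np.pad(trajectory, ((radius, radius), (0, 0)), mode="edge")
    window = 2 * radius + 1
    kernel = np.ones(window) / window
    smoothed = np.column_stack([
        np.convolve(padded[:, i], kernel, mode="valid")
        for i in range(trajectory.shape[1])
    ])
    # Shift each frame's transform by the gap between smooth and raw paths.
    return transforms + (smoothed - trajectory)

# A steady pan of one pixel per frame: already smooth away from the edges.
steady = np.tile([1.0, 0.0, 0.0], (10, 1))
corrected = smooth_trajectory(steady, radius=2)
```

A larger radius removes more jitter but also flattens intentional camera motion, which is one reason more robust approaches replace the simple moving average with an optimized camera path.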

Future Work

While we had tentatively planned to develop an object tracking solution to run on top of the SIFT implementation, we left it out of scope for this project. Object tracking could help detect intentional object movement within a stabilized video, but it is not immediately relevant to stabilization itself. We decided that it is better suited to future research.

The accuracy of the stabilization algorithm is noticeably worse on 360-degree videos than on 2-D videos. The two types of video are certainly different, but it is not clear which specific characteristics cause the algorithm to perform so poorly in a 360-degree environment. Future research must identify those characteristics and handle them accordingly to achieve similar results on both video types.