Video Genome Project


Since the first full sequencing of a bacteriophage genome in 1977, done entirely by hand, the DNA sequences of hundreds of species have been decoded. Compared with the tiny phage genome, the analysis of the human genome, completed almost three decades later, required entirely new computational methodologies. The growing amount of genetic data has driven the emergence of bioinformatics, a field of computer science that uses string matching and search algorithms for the automatic analysis of DNA sequences.

A similar explosive growth has been witnessed in publicly available video data, driven by the rapid evolution of Internet technologies. This growth has created unprecedented challenges in the analysis, organization, management, and control of such content. Many problems encountered in video analysis, such as identifying a video in a large database (e.g., detecting pirated content on YouTube), piecing together video fragments, and finding similarities and common ancestry between different versions of a video, have direct counterparts in genetic research and the analysis of DNA and protein sequences.

With Video Genome technology, we exploit the analogy between genetic sequences and video; our approach to video analysis is motivated by genomic research. Representing video information as video DNA sequences and applying bioinformatics algorithms makes it possible to search, match, and compare videos at Internet scale.
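
To make the analogy concrete, here is a minimal sketch of how a bioinformatics-style comparison could be applied to video. It is not the Video Genome implementation. It assumes each video has already been reduced to a sequence of integer symbols (one hypothetical "visual nucleotide" per frame or shot) and uses Smith-Waterman local alignment, a standard technique for DNA and protein sequences, as an illustrative stand-in. The function name, the scoring parameters, and the sample sequences are all invented for this example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score of two symbol sequences.

    a, b: sequences of hashable symbols (assumed here to be quantized
    per-frame descriptors; how they are computed is outside this sketch).
    The scoring values are illustrative, not tuned.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Clamping at 0 is what makes the alignment local.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Hypothetical symbol sequences standing in for video DNA.
original = [3, 7, 7, 1, 4, 9, 2, 5]
clip = [7, 1, 4, 9]        # exact excerpt of the original
edited = [7, 1, 9, 2]      # excerpt with one frame dropped
unrelated = [6, 6, 8, 8]   # shares no symbols with the original

print(local_alignment_score(original, clip))       # 8
print(local_alignment_score(original, edited))     # 7
print(local_alignment_score(original, unrelated))  # 0
```

The exact excerpt scores highest, the edited version scores slightly lower because the dropped frame costs one gap, and the unrelated sequence scores zero. A search over a large collection would need an index rather than pairwise alignment against every video, which this sketch does not attempt.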