Facebook has open-sourced some of its video-matching technology. The goal of this effort is to cut down on harmful internet content such as child exploitation, terrorist propaganda, and graphic violence. With the Temporal Match Kernel (TMK), developers can hash videos and detect duplicates or near-duplicates, so that known harmful videos can be identified and kept from spreading across platforms.
With these tools open-sourced, industry partners and non-profits fighting abuse can use the technology to further their goals. Facebook applauds Microsoft and Google for their efforts in this area and discusses how the tech giants can work together to stop abuse from spreading around the internet. The National Center for Missing and Exploited Children reported a 541% increase in the number of abusive videos reported by the tech industry, and it expects that the Facebook project will improve its ability to identify and rescue children.
Videos are transcoded to MP4 within TMK. Users can run the tool to find matching content on their site. FFmpeg is required to use TMK. The .tmk file it creates includes metadata about how the hash was computed, a pure-average score, and cosine and sine features. Check out GitHub to learn more.
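To make the matching step more concrete, here is a minimal Python sketch of how two fixed-length feature vectors could be compared by cosine similarity. It is an illustration only, not TMK's actual code or file format: the function names, the toy vectors, and the 0.9 threshold are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)


def find_matches(query, known, threshold=0.9):
    """Return (name, score) pairs scoring at or above threshold, best first.

    The threshold is a hypothetical value chosen for this sketch.
    """
    scored = [(name, cosine_similarity(query, vec)) for name, vec in known.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data: made-up four-dimensional vectors standing in for video features.
known = {
    "video_a": [0.9, 0.1, 0.4, 0.2],
    "video_b": [0.1, 0.8, 0.1, 0.7],
}
query = [0.88, 0.12, 0.41, 0.19]  # a slightly altered copy of video_a

matches = find_matches(query, known)
for name, score in matches:
    print(f"{name}: {score:.4f}")
```

A near-duplicate produces a vector pointing in almost the same direction as the original, so its score lands close to 1, while unrelated content scores much lower.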
In its blog post announcement, Facebook mentioned open-sourcing photo-matching technology as well. Facebook's photo-matching technology, PDQ, came out of Facebook's Artificial Intelligence Research team in partnership with multiple universities. The same identification and hashing concept is utilized to identify matches and stop the spread of abusive content.
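For photos, a common way to compare perceptual hashes is to count how many bits differ between them (the Hamming distance). The sketch below assumes each hash is a fixed-length bit string written in hexadecimal; the short hashes and the cutoff value are hypothetical and chosen only for illustration, and the code does not compute PDQ hashes itself.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count differing bits between two equal-length hexadecimal hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 wherever the two hashes disagree; count those bits.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def is_match(hash_a: str, hash_b: str, max_distance: int = 4) -> bool:
    """Treat two hashes as matching when few bits differ.

    max_distance is a hypothetical cutoff for this toy example.
    """
    return hamming_distance(hash_a, hash_b) <= max_distance


# Toy 64-bit hashes (16 hex digits), invented for this example.
original = "f0f0f0f0aaaa5555"
near_copy = "f0f0f0f0aaaa5557"  # one bit flipped
unrelated = "0f0f0f0f5555aaaa"  # every bit flipped

print(hamming_distance(original, near_copy))
print(is_match(original, near_copy))
print(is_match(original, unrelated))
```

Small edits to an image, such as recompression or mild resizing, tend to flip only a few bits of a perceptual hash, which is why a distance cutoff is used instead of exact equality.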