Rahul Sharma (Editor)

Automatic content recognition

Automatic content recognition (ACR) is an identification technology that recognizes content played on a media device or present in a media file. Devices with ACR support let users quickly obtain additional information about the content they have just experienced without any manual input or search effort. Application developers can then, for example, provide personalized complementary content to viewers.


How it works

To start the recognition, the device records a short audio clip and sends it to the identification service. Using algorithms such as fingerprinting, the service extracts information from the audio and matches it against a database of reference media. The database also contains information about the content and associated information, including complementary media. If the fingerprint of the recorded audio sample is matched, the identification service returns the corresponding metadata to the client.
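The record, fingerprint, match, and return-metadata flow described above can be sketched in a few lines of Python. This is a toy illustration, not how any particular ACR vendor works: the fingerprint here is just the rise or fall of frame energy, the clip is assumed to start on a frame boundary, and the names `fingerprint`, `IdentificationService`, `FRAME`, and `KEY_BITS` are hypothetical.

```python
import random
from collections import Counter, defaultdict

FRAME = 64      # samples per frame (illustrative value, an assumption)
KEY_BITS = 16   # fingerprint bits per lookup key (illustrative value)


def fingerprint(samples):
    """Return (frame_index, key) pairs derived from frame-energy changes."""
    energies = [
        sum(x * x for x in samples[i:i + FRAME])
        for i in range(0, len(samples) - FRAME + 1, FRAME)
    ]
    # One bit per frame transition: 1 if the energy rose, else 0.
    bits = [int(b > a) for a, b in zip(energies, energies[1:])]
    keys = []
    for i in range(len(bits) - KEY_BITS + 1):
        key = int("".join(map(str, bits[i:i + KEY_BITS])), 2)
        keys.append((i, key))
    return keys


class IdentificationService:
    """Toy reference database: fingerprint index plus content metadata."""

    def __init__(self):
        self.index = defaultdict(list)  # key -> [(content_id, frame_index)]
        self.metadata = {}              # content_id -> metadata dict

    def register(self, content_id, samples, metadata):
        self.metadata[content_id] = metadata
        for frame_index, key in fingerprint(samples):
            self.index[key].append((content_id, frame_index))

    def identify(self, clip):
        # Vote for (content, offset) pairs; a true match yields many keys
        # that agree on the same offset between clip and reference.
        votes = Counter()
        for clip_frame, key in fingerprint(clip):
            for content_id, ref_frame in self.index.get(key, []):
                votes[(content_id, ref_frame - clip_frame)] += 1
        if not votes:
            return None
        (content_id, offset), _ = votes.most_common(1)[0]
        return {**self.metadata[content_id], "offset_frames": offset}


def make_signal(seed, n_frames=400):
    """Deterministic pseudo-random stand-in for real audio samples."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_frames * FRAME)]


service = IdentificationService()
track_a, track_b = make_signal(1), make_signal(2)
service.register("a", track_a, {"title": "Example Show A"})
service.register("b", track_b, {"title": "Example Show B"})

# The "device" records 50 frames of track B starting at frame 100.
clip = track_b[100 * FRAME:150 * FRAME]
result = service.identify(clip)
print(result)
```

Voting on the offset between clip and reference is what lets the service report not only which content is playing but also where in the content the clip was taken. Production systems use far more robust features than frame energy so that matching survives noise and misaligned frames.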

Fingerprint & Watermarking

Audio-based ACR is commonly used in the market. The two leading methodologies are acoustic fingerprinting and watermarking. Alternative approaches focus on video fingerprinting and augment its accuracy and scalability with other content recognition solutions running in parallel or in series.

Acoustic fingerprinting generates unique fingerprints from the content itself. Fingerprinting techniques work regardless of content format, codec, bitrate, and compression technique, which makes them usable across networks and channels. Fingerprinting is therefore widely used in interactive TV, second-screen applications, and content monitoring. Popular apps such as Shazam, YouTube, Facebook, Thetake, Wechat, and Weibo use audio fingerprinting to recognize the content played on a TV and trigger additional features such as votes, lotteries, topics, or purchases.

In contrast to fingerprinting, digital watermarking requires inserting digital tags containing information about the content into the content itself, prior to distribution. For example, a broadcast encoder might insert a watermark every few seconds that could be used to identify the broadcast channel, program ID, and time stamp. The watermark is normally inaudible or invisible to users. Terminal devices such as phones or tablets read the watermarks instead of actually recognizing the played content. Watermarking technology is also used in the media protection field to trace where illegal copies originate.
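To make the idea of a watermark payload concrete, here is a minimal sketch that packs a channel, program ID, and time stamp into 80 bits and hides them in the least significant bits of integer PCM samples. Least-significant-bit embedding is used only because it is easy to follow; it does not survive compression or re-recording, and real broadcast watermarks use more robust schemes. The payload layout and the names `embed` and `extract` are assumptions made for this illustration.

```python
import struct

# Assumed payload layout: channel (uint16), program ID (uint32),
# time stamp in seconds (uint32), big-endian, 10 bytes in total.
PAYLOAD_FORMAT = ">HII"


def embed(samples, channel_id, program_id, timestamp):
    """Return a copy of samples with the payload written into the LSBs."""
    payload = struct.pack(PAYLOAD_FORMAT, channel_id, program_id, timestamp)
    bits = [(byte >> shift) & 1
            for byte in payload
            for shift in range(7, -1, -1)]
    if len(samples) < len(bits):
        raise ValueError("not enough samples to carry the payload")
    marked = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract(samples):
    """Read the payload back from the LSBs of the first samples."""
    n_bits = struct.calcsize(PAYLOAD_FORMAT) * 8
    bits = [s & 1 for s in samples[:n_bits]]
    payload = bytes(
        int("".join(map(str, bits[i:i + 8])), 2)
        for i in range(0, n_bits, 8)
    )
    return struct.unpack(PAYLOAD_FORMAT, payload)


original = list(range(-50, 50))  # stand-in for 100 PCM samples
marked = embed(original, channel_id=7, program_id=123456,
               timestamp=1700000000)
print(extract(marked))
```

Each sample changes by at most one quantization step, which is why such a mark is imperceptible, and the receiving device only has to read bits rather than recognize the content.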

Next/Market Insights expects that 2.5 billion devices will be integrated with ACR technology to provide a synchronized live and on-demand video watching experience.


Shazam applied ACR technology to TV content in 2011, which captured the attention of the TV industry. Shazam was previously a music recognition service that identified music from a sound recording. By using its own fingerprinting technology to identify live channels and videos, Shazam extended its business to TV. In 2012, DIRECTV partnered with Viggle, a TV loyalty vendor, to provide an interactive viewing experience on the second screen. In 2013, LG partnered with Cognitive Networks (later purchased by Vizio and renamed Inscape), an ACR vendor, to provide ACR-driven interaction. In 2015, ACR technology spread to even more applications and smart TVs. Social applications and TV manufacturers such as Facebook, Twitter, Google, Wechat, Weibo, LG, Samsung, and Vizio have since adopted ACR technology, either developed in-house or integrated from third-party ACR providers. In 2016, more applications and mobile operating systems with embedded automatic content recognition services came to market, such as Peach, Omusic, and Mi OS, to enhance the music discovery experience.

Content identification

ACR technology helps audiences easily retrieve information about the content they have watched. On smart TVs and in applications with embedded ACR technology, the audience can check the name of the song being played or read a description of the movie they watched. In addition, the identified video and music content can be linked to internet content providers for on-demand viewing, to third parties for additional background information, or to complementary media.

Content enhancement

Because devices can be "aware" of the content being watched or listened to, second-screen devices can feed users complementary content beyond what is presented on the primary viewing screen. ACR technology can identify not only the content but also the precise position within it, so additional information can be presented to the user at the right moment. ACR can enable a variety of interactive features tied to a timestamp, such as polls, coupons, lotteries, or purchases of goods.
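As a rough sketch of how a second-screen application might turn an identified playback position into an interactive feature, the snippet below looks up which event, if any, is scheduled at a given position. The schedule contents and the function name `active_event` are hypothetical, and the events are assumed not to overlap.

```python
from bisect import bisect_right

# Hypothetical schedule for one program, sorted by start time:
# (start_seconds, end_seconds, feature)
EVENTS = [
    (30.0, 60.0, "poll"),
    (120.0, 150.0, "coupon"),
    (300.0, 330.0, "purchase"),
]
STARTS = [start for start, _, _ in EVENTS]


def active_event(position_seconds):
    """Return the feature scheduled at this playback position, or None."""
    # Find the last event that starts at or before the position.
    i = bisect_right(STARTS, position_seconds) - 1
    if i >= 0 and position_seconds < EVENTS[i][1]:
        return EVENTS[i][2]
    return None


for position in (10.0, 45.0, 90.0, 120.0):
    print(position, active_event(position))
```

Keeping the schedule sorted by start time lets the lookup use binary search, which matters little for three events but keeps the lookup cheap for long programs with many triggers.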

Audience measurement

Real-time audience measurement is now achievable by building ACR technology into smart TVs, set-top boxes, and mobile devices such as smartphones and tablets. This measurement data is essential for quantifying audience consumption and setting advertising pricing policies.

Broadcast monitoring

For advertisers and content owners, it is vital to know when and where their content has been played. Traditionally, agencies or advertisers had to audit the presentation manually, and at scale it could only be checked through statistical sampling. ACR technology enables automatic monitoring of the content played on TV. Information such as the time of play, duration, and frequency can be obtained without any manual effort. Many people have, however, expressed concern about the information that smart TVs send to the companies collecting this data. Almost every set has an option to disable this feature.
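The time of play, duration, and frequency mentioned above can be derived from a stream of per-second recognition results. The following sketch assumes a monitoring system that emits `(time_seconds, content_id)` detections at a fixed step and merges detections separated by small gaps into a single airing; the function name `summarize` and the `step` and `max_gap` parameters are assumptions made for illustration.

```python
from collections import defaultdict


def summarize(detections, step=1.0, max_gap=2.0):
    """Group detections into airings and report start, duration, count.

    detections: iterable of (time_seconds, content_id) pairs, where each
    detection is assumed to cover the interval [time, time + step).
    """
    spans = defaultdict(list)  # content_id -> [[start, end], ...]
    for t, content_id in sorted(detections):
        airings = spans[content_id]
        if airings and t - airings[-1][1] <= max_gap:
            airings[-1][1] = t + step          # extend the current airing
        else:
            airings.append([t, t + step])      # start a new airing
    return {
        content_id: {
            "plays": len(airings),
            "airings": [(start, end - start) for start, end in airings],
        }
        for content_id, airings in spans.items()
    }


log = [
    (10.0, "ad-1"), (11.0, "ad-1"), (12.0, "ad-1"),
    (50.0, "ad-2"),
    (100.0, "ad-1"), (101.0, "ad-1"),
]
report = summarize(log)
print(report)
```

The gap tolerance keeps a single missed detection from splitting one airing into two, which would otherwise inflate the play count.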

The alternative approaches are video-based automatic content recognition technologies. This suite of technologies revolves around the convergence of video and TV Everywhere, which proponents argue will leave audio fingerprinting and digital watermarking unable to handle the millions of unique streams going out and the billions of hours of footage to be reviewed, with metadata extracted or enriched in relation to the content in real time. Acoustic fingerprinting depends on a database of reference fingerprints, while digital watermarking relies on intrusive frame-by-frame imprinting on every piece of content at the production stage. The effectiveness of these techniques has been challenged based on their presumed inability to scale to the amount of video being generated. In practice, for monetization and other user-based ACR applications, the reference database or the watermarks only have to cover the videos that are targets of monetization. For example, a video that is hosted on YouTube and viewed only once does not need to be present in a worldwide ACR database or carry a watermark.

ACR technology providers

ACR service providers include ACRCloud, Audible Magic, Digimarc, Gracenote, Axwave, Kantar Media, and Shazam.


Automatic content recognition Wikipedia
