Second-screen is THE hot topic in the industry these days: as a recent study on the US market confirmed, tablets have overtaken laptops as the most-used second screen while watching TV, and up to 70% of viewers use additional devices during viewing sessions, either to get more information on the content they are watching or to exchange on social networks. And the tablet adoption rate is not about to slow down, as ever more competitors release new entry-level tablets at less than $200.
Tablets are becoming commodity hardware in households, and this raises great interest in the broadcast world for at least two reasons: always-connected devices provide real-time, qualified and qualitative information on viewers, and they make it possible to extend the user experience beyond the TV show itself, with synchronized data streamed throughout the show, thus fostering viewer engagement and increasing the added value of the TV channel over its competitors.
Some pioneering second-screen applications have been deployed over the past year, so we can now find various use cases for content synchronization, from live TV shows to Blu-ray and soon VOD, covering a wide range of interactive features like quizzes/votes, branded merchandising, interactive ads and so on (high creativity required here)… To make the connected devices aware of the content, all of these apps use either automatic content recognition (ACR) via fingerprinting or content tagging via audio watermarking, using the tablet’s microphone as the audio capture source. There are a few other techniques, more or less well identified, that can also be used; we’ll get to those later on. Before walking through the available implementations, let’s examine the two main techniques they use:
- Audio watermarking consists of analysing the audio track to find positions in the signal where, given the masking characteristics of the human auditory system, digital codes can be hidden without affecting the perceived sound quality of the original. The main advantage of this technique is that, if you are a TV channel or a content owner, you can encrypt the data you inject into the stream, making other actors unable to exploit your tags. The data injection occurs over short intervals (2 seconds or less), so the technique provides good time accuracy, which is a plus if you also want to use the system for consumption feedback. The drawbacks are that the available bandwidth is small (only a few bytes per audio segment) and that the technique is defeated by long periods of silence, where nothing can be hidden.
- Fingerprinting consists of first taking an imprint (also referred to as a “signature”) of the audio track of the video content and storing it in a database. The client device takes the same type of audio imprint over a short timescale (5 to 10 seconds) and sends it to the server, which searches for it in the stored imprints collection and returns the content ID when a match is found. This technique has one main advantage: you don’t have to own the content to analyze it, so it’s a perfect way to build synchronized services if you are not a TV channel or a film studio. But it also has many drawbacks: you must run a huge server infrastructure to compare the signatures, the signature doesn’t identify the distribution source, multi-lingual content requires several signatures, and finally you need access to fresh content ahead of air time to generate the signatures, live fingerprinting being quite a challenge at high scale.
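To make the watermarking idea concrete, here is a deliberately naive Python sketch: it hides one bit per ~2-second segment by mixing a faint tone into the audio, and skips near-silent segments, which illustrates the silence limitation mentioned above. The carrier frequencies, depth and silence threshold are made-up illustration values, and the faint fixed-amplitude tone is only a crude stand-in for a real psychoacoustic masking model.

```python
import math

SAMPLE_RATE = 8000          # toy rate; real systems work on the broadcast audio
SEGMENT = SAMPLE_RATE * 2   # ~2-second injection intervals, as described above
F0, F1 = 900.0, 1100.0      # hypothetical carrier tones for bits 0 and 1
DEPTH = 0.01                # faint amplitude, standing in for psychoacoustic masking
SILENCE_RMS = 0.02          # segments quieter than this carry no data

def rms(seg):
    return math.sqrt(sum(s * s for s in seg) / len(seg))

def embed(samples, bits):
    """Hide one bit per segment by mixing in a faint tone (toy scheme)."""
    out = list(samples)
    bit_iter = iter(bits)
    for start in range(0, len(out) - SEGMENT + 1, SEGMENT):
        if rms(out[start:start + SEGMENT]) < SILENCE_RMS:
            continue                      # the silence limitation from the text
        try:
            bit = next(bit_iter)
        except StopIteration:
            break
        f = F1 if bit else F0
        for i in range(SEGMENT):
            out[start + i] += DEPTH * math.sin(2 * math.pi * f * i / SAMPLE_RATE)
    return out

def correlate(seg, f):
    """Correlation of a segment against a reference tone at frequency f."""
    return abs(sum(s * math.sin(2 * math.pi * f * i / SAMPLE_RATE)
                   for i, s in enumerate(seg)))

def decode(samples):
    """Recover one bit per non-silent segment by comparing the two correlations."""
    bits = []
    for start in range(0, len(samples) - SEGMENT + 1, SEGMENT):
        seg = samples[start:start + SEGMENT]
        if rms(seg) < SILENCE_RMS:
            continue
        bits.append(1 if correlate(seg, F1) > correlate(seg, F0) else 0)
    return bits
```

A real watermarker would also spread and error-correct the payload; the sketch only shows why the capacity is a few bytes per segment and why silence breaks the channel.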
In both cases you need to run a solid CMS/backend platform to prepare and serve the synchronized content to your companion app and track usage over time. That’s definitely not trivial considering the large audiences that come with the most popular TV shows, hence it’s a really challenging point in second-screen projects. This being said, we can now examine the different ways a TV channel, content owner or third-party actor can build its own synchronized second-screen platform…
So let’s take a look at the different options available on the market!
Audio watermarking options
Coming from the audience measurement world, Nielsen was the first company of its kind to release a second-screen oriented version of its technology. Hence it’s surely the most packaged offer in its category in terms of architecture completeness.
Media-Sync is a SaaS platform offer including all the needed components (CMS, App Server, Analytics Engine & Reporting Dashboard) to build the service in all its aspects (no API for the CMS seems to exist as of now). They provide an iOS SDK for building the tablet app, and support for more platforms (presumably an Android version first) has been announced for future releases. On the production side of things, you have to use a Ross NWE-3G encoder to watermark your live broadcast signal.
As regards references, Nielsen has been powering several ABC shows like My Generation (the first field experiment) and Grey’s Anatomy, as well as The Weather Channel, and we can expect more in the coming months…
Civolution’s technology is also derived from the initial audience-measurement use case (it powers Médiamétrie’s services in France). Civolution’s positioning is different from Nielsen’s, as it is a technology offer, not a SaaS one. Civolution licenses its technology as an SDK for iOS, Android and Windows Phone 7 to be used as the foundation for the tablet applications. As regards production, they use the Axon DAW77 watermarker and, interestingly, they also propose a plugin for Rhozet transcoding farms as well as software for embedding the watermarks in post-production files (MXF, GXF, PCM, MPG). Compared to working with Nielsen, working with Civolution means a longer time to market, since you have to build the backend, and a riskier technical position, since you have to deal with the big server infrastructure this type of service requires; but in the long run, if you have a tough tech team, you’ll surely end up with a more custom platform providing the level of flexibility (a challenging point) and integration that you’ll need.
For the moment, their technology powers several apps and platforms: two Ex Machina apps for The Voice of Holland and Germany’s Top Model, as well as the Dexter app from Miso and the second-screen services from HyperTV. It’s a young technology offer (officially launched at IBC11), so keep a close eye on it!
Here is a newcomer on the market and a direct competitor to Civolution (same market positioning). Intrasonics is based in Cambridge, and its product range powers the Ipsos MediaCell radio audience measurement system.
They provide iOS and Android decoding SDKs (launched in October 2011), as well as watermarking software and their own live watermarking hardware (exact model unknown). Their watermarking technology is the one the BBC used to build its pilot broadcast apps for the National Lottery (read about it here and here).
This is surely a technology provider to follow as new apps will be built based on their SDK.
Other audio watermarking options
There are several other options on the market today, more or less mature/complete:
- Technicolor MediaEcho: they just released this iOS/Android technology combo, which uses watermarking for live and VOD and BD-Live technology for Blu-ray synchronization, so they have the most complete use-case coverage on the market right now. The Fox “Sons of Anarchy” app is the first to feature branded show merchandising. They also released a bonus app for the King’s Speech Blu-ray, which is close to Disney’s second-screen discs made with TVPlus technology.
- Shazam for TV: also available for iOS/Android, Shazam is trying to bring to video the same kind of magic that worked so well for music. So far they have only worked with NBC on a “Covert Affairs” season 3 app for merchandising, but they will surely come along with other aggressive market moves in the near future.
- EVS C-Cast + IP Director: they provide an interesting solution dedicated to sports second-screen, with a combo of two products managing multi-camera setups, backed by audio watermarking (unknown technology, probably white-labeled); see the app interface on Dmitry Ivanov’s blog.
Fingerprinting options
There are several recent fingerprinting-based projects providing iPad apps that add metadata, extra content and social activity around TV shows: two in the USA (Yahoo’s IntoNow with its SoundPrint technology, and Umami) and one in the UK (Zeebox, founded by Anthony Rose, the BBC iPlayer guy). All of these projects are willing to sign deals with TV channels/content owners to add extra content, so they are a good opportunity to enter the second-screen market easily if you’re in the target countries. Of the three, Zeebox shows the strongest territory-expansion ambitions and is certainly the most advanced on the path to providing an API for building third-party widgets and plug-ins for its app.
If you want more control over your content and synchronized data, then your best option is to build your own ACR service using the product bundle from Zeitera and Ensequence. Ensequence brings its iTV Manager, which manages all the interactive app creation, content templating, deployment and measurement, with a new extension dedicated to companion apps. Zeitera brings its Vvid Content Identification System (the fingerprinting brick) with a client API to build the app, ingest systems with an API (file or stream input), and a search platform for storing and matching signatures. Interestingly, this platform can be licensed either as a SaaS service or as an on-premises licence, so the production workflow is quite flexible.
Another interesting option is Synchronize.TV‘s offer (its basic synchronization API is free; they charge only for advanced services and SLAs), which manages all the signature storage/search on their servers. Without the backend side of things, you can rely on the Audible Magic SmartSync Media Synchronization API, which works on iOS and Android, but you’ll need to build the backend in the cloud (a nice dev project).
Other synchronization options
To finish, just a few words on three other options whose technical basis is more or less identified…
- TvTak: they use video recognition to match live and pre-recorded content, and offer an iOS SDK to build the apps
- Never.no: after releasing their interesting Interactivity Suite product, they have developed an extension for companion apps. The nice thing here is that the suite drives broadcast CG systems like Vizrt and Harris Inscriber in parallel with the web flow, so the integration model is quite original compared to all the other offers on this page. The precise synchronization method they use is unclear (“Synchronized Companion App monitors frame-by-frame changes in broadcast programming or live production then uses the changes to trigger delivery of specified content to the second screen”); possibly they monitor frame timecodes and correlate them with metadata received via IP push faster than the broadcast stream.
- Snell Morpheus: the playout engine from Snell embeds “MediaBalls” (hierarchised groups of events) inside the program timeline and fires up the events according to the referenced timecodes. The synchronization method at the client’s end is unknown; maybe the same kind as Never.no, or a partnership with a watermark technology provider?
When it comes to synchronized second-screen, it’s not an easy task to find the right combination of algorithm robustness, a smooth economic model and a quick time to market. I hope this post has provided you with good insights on which technology to choose for your project, according to your market position and service requirements. Stay tuned to the second-screen section of my video curation for future updates on this exciting topic!
Nicolas Weil