Complete guide to Machine learning in the media industry
For the past few years now, the increasing digitalization of customer journeys and the exponential improvement of cloud technologies and computing capacities have invited media groups to rethink the way they do business. If it sounds like I’m using long words to say “digital disruption”, trust your instinct. Many of these disruptions have been centered around the mountains of data media groups have access to, and what Artificial Intelligence (AI) (and machine learning more specifically) could do with it. Indeed, while artificial intelligence has been fully embraced by a plethora of pure players (Spotify, Netflix, Buzzfeed, Disney…) traditional actors are still lagging, and now see the technology as a shortcut to a much-needed renewed growth.
Below is an exhaustive look at the use cases currently being implemented by the old guard throughout the world.
1. Creation process optimisation
Metadata creation/indexing automation
Using machine learning, artificial intelligence can both translate data and use image recognition to automate the creation of metadata for all types of content (describing an image, for example, to make it easier to find via Google). This makes it easier for both internal and external stakeholders to discover content by allowing searches to be carried out with finer criteria, for more precise results. Automatic indexing (as well as the conversion of a multitude of data formats) speeds up the work of journalists, facilitates the verification of facts, and allows humans to concentrate on tasks with more added value. Fox, The New York Times, BBC… Dozens of media companies have already implemented these solutions across their operations.
Automation of article writing and video creation
We’re also seeing the emergence of tools automating the writing of articles with low added value, thus allowing journalists to work on subjects that require both more investigation and more specialized expertise. Syllabs, for example, offers journalists bots to automatically process election results: their Data2Content tool generates texts as soon as information arrives on the Ministry of the Interior website. The Washington Post's Heliograf has been doing about the same with sports since 2016. Other such tools can greatly speed up article writing or video creation by using voice recognition to translate audio information into text. This text is then directly “re-encoded” to the video, which makes it easier to find, verify and potentially modify.
Note that we're discussing very simple tasks: most machine learning experiments going beyond factual reporting have failed (often hilariously).
Identification of emerging trends
The first journalist to break a news story gets all the glory, or so the story goes. As such, it is paramount to anticipate how information moves, and to track its dissemination in real time, in order to stand out from competitors. To that end, the R&D teams of the Reuters press agency have developed News Tracer, a tool that uses an algorithm to identify significant events on Twitter. The tool assigns these events a “media score”, which makes it possible to focus primarily on the most significant ones. News Tracer is also able to generate a confidence score on the veracity of these events.
This last point is particularly important, as it meets the paramount requirement of reliability of information expected from the media.
Indeed, some algorithms can also help journalists verify the reliability and accuracy of images and videos posted on social networks. ClaimBuster and FactMata, for example, are two start-ups using intelligent algorithms to combat false information and deep fakes. This is done by assigning confidence indices to the content, based on data provided by end users.
2. Distribution, personalisation and recommendation of content
Although the opportunities presented above will indeed have a disruptive impact on the media market, they merely represent a small fraction of the opportunities offered to the media industry by machine learning. Although the automation of content creation is eagerly awaited by science fiction enthusiasts, the major impact of AI will be through the process by which the content is adapted and presented to audiences.
As the market produces oceans of content of highly variable quality and appeal, an effective pairing is one that converts this mass into a set of value propositions adapted to each end consumer or group of end consumers. This granularity can vary in several ways:
It can mean the right content in the right place
Take Buzzfeed and its 400 channels of distribution, for example. Relying on human judgment alone to put the right content in the right place would be time-consuming, and any optimisation would be deeply imperfect. As such, the company used artificial intelligence to estimate the probability that an article will go viral and to promote it on the channel whose audience is most likely to appreciate it. This probability is the product of a joint effort by the product, social media, engineering and data science teams, who developed a machine learning model based on historical data of high-performance content.
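To make the idea of "a probability per channel" concrete, here is a minimal sketch in Python. It is not Buzzfeed's model: the channel names, the features (`has_video`, `is_quiz`) and the weights are all invented for illustration, and a simple logistic score stands in for whatever model the teams actually trained on historical data.

```python
import math

# Hypothetical per-channel weights. A real model would learn these from
# historical data on high-performing content; these values are made up.
CHANNEL_WEIGHTS = {
    "facebook": {"bias": -1.0, "has_video": 1.2, "is_quiz": 0.4},
    "newsletter": {"bias": -0.5, "has_video": -0.3, "is_quiz": 1.5},
}


def virality_probability(features, weights):
    """Logistic score: squash a weighted sum of features into (0, 1)."""
    z = weights["bias"] + sum(
        weights.get(name, 0.0) * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def best_channel(features, channel_weights=CHANNEL_WEIGHTS):
    """Return (channel, probability) for the highest-scoring channel."""
    scores = {
        channel: virality_probability(features, weights)
        for channel, weights in channel_weights.items()
    }
    channel = max(scores, key=scores.get)
    return channel, scores[channel]


if __name__ == "__main__":
    article = {"has_video": 1, "is_quiz": 0}
    print(best_channel(article))
```

With hundreds of channels the principle stays the same: score the article once per channel and promote it where the estimated probability is highest.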
It can mean the right content at the right time
The dead come back to life. Zombie, a machine learning solution used by Swiss magazine Le Temps, is able to identify the magazine’s best articles by cross-referencing its archives with data from Chartbeat (which provides data and analytics to global publishers) and Google Analytics. The algorithm then assigns a relevance score according to qualitative indicators (reading time, audience history, engagement and debate aroused on social networks…) and advises the best time to republish and reach new audiences. This gives a second life to the content, and the ensuing revenues can go straight to the bottom line.
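A relevance score of this kind can be pictured as a weighted combination of indicators. The sketch below is an assumption-laden illustration, not Zombie's actual algorithm: the field names, the weights and the idea of normalising every indicator to a 0..1 range are all hypothetical, and it only ranks archived articles without attempting to pick the best time to republish.

```python
from dataclasses import dataclass


@dataclass
class ArchivedArticle:
    title: str
    # All indicators are assumed to be pre-normalised to the range 0..1.
    reading_time: float
    engagement: float
    social_debate: float


# Hypothetical weights; a real system would tune or learn them.
WEIGHTS = {"reading_time": 0.5, "engagement": 0.3, "social_debate": 0.2}


def relevance_score(article, weights=WEIGHTS):
    """Weighted sum of the article's normalised indicators."""
    return sum(weight * getattr(article, name) for name, weight in weights.items())


def republish_candidates(articles, top_n=3):
    """Return the top_n archived articles, highest relevance first."""
    return sorted(articles, key=relevance_score, reverse=True)[:top_n]


if __name__ == "__main__":
    archive = [
        ArchivedArticle("Old investigation", 0.9, 0.8, 0.7),
        ArchivedArticle("Short brief", 0.2, 0.1, 0.1),
        ArchivedArticle("Opinion piece", 0.5, 0.6, 0.9),
    ]
    for article in republish_candidates(archive, top_n=2):
        print(article.title, round(relevance_score(article), 2))
```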
It can mean the right content for the right group of people
Business Insider has more content than anyone can (or wants to) read or watch. Thanks to the solution from the start-up Sailthru, the publisher has therefore created profiles based on the history of the content consumed (traced by cookies). Depending on their profile, the readers are offered suitable content, both on the site and in their emails. This investment in the segmentation of its readers has enabled Business Insider to increase its click-through rates by 60% and its click-through rates in newsletters by 150%. In addition, return traffic to the site jumped by 52%.
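As a rough illustration of profile-based recommendation, the sketch below builds a profile by counting the topics a reader has consumed, then ranks a catalogue by those counts. It does not reflect Sailthru's product or API: the function names, the topic labels and the counting approach are assumptions chosen to keep the example small.

```python
from collections import Counter


def build_profile(history):
    """Count how often each topic appears in a reader's history.

    `history` is a list of topic labels, one per article read.
    """
    return Counter(history)


def recommend(profile, catalogue, limit=3):
    """Rank (title, topic) pairs by how often the reader consumed the topic.

    Topics the reader has never seen count as 0. The sort is stable, so
    ties keep their original catalogue order.
    """
    ranked = sorted(catalogue, key=lambda item: profile[item[1]], reverse=True)
    return [title for title, _topic in ranked[:limit]]


if __name__ == "__main__":
    profile = build_profile(["tech", "tech", "finance"])
    catalogue = [
        ("Markets today", "finance"),
        ("New phone review", "tech"),
        ("Cup final report", "sports"),
    ]
    print(recommend(profile, catalogue, limit=2))
```

The same ranking can feed both the on-site modules and the newsletter, which is where segmentation of this kind is typically applied.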
It is paramount to note here that such tools very much have the power to further polarise our societies by providing content that aligns with existing views, and that their use in the media should not be accepted lightly.
It cannot (yet) mean the right content for the right person
An algorithm can make decisions based on contextual data and customer journeys, but will hardly be able to understand the preferences of each one of those customers: likes and dislikes are rarely stable and are usually highly contextual. The variety of human tastes still baffles machines for the time being. As such, any personalization at the level of the individual must, for now, be assisted by a curator. Nevertheless, many such roles are merely postponing the seemingly unavoidable: according to a Reuters study, almost three quarters (72%) of media players plan to actively experiment with AI in order to improve recommendations and increase production effectiveness.
As monetization is one of the most important levers for traditional media’s survival, many use cases are starting to emerge in this area, from customer engagement to advertising space.
Many major publications use a paywall (a system whereby access is limited to users who have paid to subscribe to the site) in one form or another. It often takes the form of blocking content after the reader has viewed a fixed number of articles. Here too, artificial intelligence offers a more dynamic and less coercive value proposition. Take, for example, the Neue Zürcher Zeitung, a Swiss press group that uses an algorithm combining a hundred criteria to determine when an internet user is most likely to take out a paid subscription. Once the right moment is detected, a personalized landing page appears for the prospect. With a conversion rate multiplied by 5 in three years, the results are convincing.
We can imagine that Real-Time Bidding (RTB) will find new dimensions when propped up by smart algorithms, particularly on the supplier side: a Machine Learning algorithm could optimize the selection of applicants, manage the costs of data transfer, improve visibility of performance, avoid fraud or even check the quality of advertisements. However, such examples are still very rare on the market today.
AI will influence all parts of the media value chain, helping content creators to be more creative, content publishers to be more productive and consumers to find content that suits their interests.
Artificial intelligence is nevertheless only one part of a continuing logic of business optimization strategies and, as such, should not be considered a revolution. This fact in no way minimizes the work behind its implementation. As scientist Peter Skomoroch mentioned last year: "you can expect your company's transition to machine learning to be roughly 100 times more difficult as your transition to mobile."